Web extractor - Qwen Cloud

For tasks involving mathematical calculations or data analytics, add code_interpreter alongside web extractor to improve accuracy.

Getting started

Call web extractor through the Responses API to summarize a web page. The examples below use web_search and web_extractor with qwen3-max-2026-01-23 in thinking mode.

Python
Node.js
curl

import os
from openai import OpenAI

client = OpenAI(
  # If the environment variable is not configured, replace with: api_key="sk-xxx"
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

response = client.responses.create(
  model="qwen3-max-2026-01-23",
  input="Please visit the official Qwen Cloud documentation, find the code interpreter topic and summarize it",
  tools=[
    {"type": "web_search"},
    {"type": "web_extractor"}
  ],
  extra_body={
    "enable_thinking": True
  }
)

# Uncomment to view intermediate output
# print(response.output)
print("=" * 20 + "Response" + "=" * 20)
print(response.output_text)

# Print tool invocation count
usage = response.usage
print("=" * 20 + "Tool Invocation Count" + "=" * 20)
if hasattr(usage, 'x_tools') and usage.x_tools:
  print(f"Web Extractor invocations: {usage.x_tools.get('web_extractor', {}).get('count', 0)}")

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
  // If the environment variable is not configured, replace with: apiKey: "sk-xxx"
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});

async function main() {
  const response = await openai.responses.create({
    model: "qwen3-max-2026-01-23",
    input: "Please visit the official Qwen Cloud documentation, find the code interpreter topic and summarize it",
    tools: [
      { type: "web_search" },
      { type: "web_extractor" }
    ],
    enable_thinking: true
  });

  console.log("====================Response====================");
  console.log(response.output_text);

  // Print tool invocation count
  console.log("====================Tool Invocation Count====================");
  if (response.usage && response.usage.x_tools) {
    console.log(`Web Extractor invocations: ${response.usage.x_tools.web_extractor?.count || 0}`);
    console.log(`Web Search invocations: ${response.usage.x_tools.web_search?.count || 0}`);
  }
  // Uncomment to view intermediate output
  // console.log(JSON.stringify(response.output[0], null, 2));
}

main();

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "qwen3-max-2026-01-23",
  "input": "Please visit the official Qwen Cloud documentation, find the code interpreter topic and summarize it",
  "tools": [
    {"type": "web_search"},
    {"type": "web_extractor"}
  ],
  "enable_thinking": true
}'

Response structure

The response contains the model's generated text plus metadata about tool usage.

Field	Description
`output_text`	The model's final text response, grounded in the extracted web content
`output[]`	Array of intermediate items including `web_extractor_call` objects (each with `goal` and `output` fields showing what URL was fetched and the extracted content)
`usage.x_tools.web_extractor.count`	Number of web extractor invocations in this request
`usage.x_tools.web_search.count`	Number of web search invocations in this request
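
As an illustration of the fields above, the following sketch walks a list shaped like response.output and collects the goal and output of each web_extractor_call item. The helper name list_extractor_calls and the sample data are hypothetical; the sample uses simple stand-in objects instead of a live API response, and the "reasoning" and "message" item types are placeholders for whatever other items the response contains.

```python
from types import SimpleNamespace


def list_extractor_calls(output_items):
    """Return (goal, output) pairs for every web_extractor_call item."""
    calls = []
    for item in output_items:
        # Items of any other type are skipped.
        if getattr(item, "type", None) == "web_extractor_call":
            calls.append((getattr(item, "goal", None), getattr(item, "output", None)))
    return calls


# Hypothetical stand-in for response.output; real items come from the SDK.
sample_output = [
    SimpleNamespace(type="reasoning"),
    SimpleNamespace(
        type="web_extractor_call",
        goal="Summarize https://example.com/article",
        output="Article text...",
    ),
    SimpleNamespace(type="message"),
]

for goal, output in list_extractor_calls(sample_output):
    print(f"Goal: {goal}")
    print(f"Extracted characters: {len(output or '')}")
```

With a real response, pass response.output in place of sample_output.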

How it works

1. Include web_extractor (and typically web_search) in the tools array of your API request, along with a prompt that references a URL or topic.
2. The model determines which pages to fetch, retrieves their content, and appends it to the context as additional input tokens.
3. The model generates a response grounded in the retrieved content.

Extracted web content increases input tokens and affects billing. See Billing for details.

When to use web extractor

Scenario	Tool configuration	Why
Answer questions about a specific URL	`web_extractor` (optionally with `web_search`)	The model fetches and reads the full page content, not just a search snippet
Research a topic across the web	`web_search` + `web_extractor`	`web_search` finds relevant pages; `web_extractor` reads them in full
Quick factual lookup (no specific URL)	`web_search` alone	Search snippets are often sufficient; cheaper and faster

Use web_extractor when you need the model to read the actual content of a page -- not just a search-result summary.
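
The table above can be expressed as a small selection helper. This is a sketch only: the function choose_tools and its parameters are hypothetical and not part of any SDK. It returns a list suitable for the tools parameter of a Responses API request.

```python
def choose_tools(has_specific_url, needs_full_page, include_search_with_url=True):
    """Pick a tools list following the scenario table (illustrative helper)."""
    search = {"type": "web_search"}
    extractor = {"type": "web_extractor"}
    if has_specific_url:
        # web_search is optional when the prompt already names the URL.
        return [search, extractor] if include_search_with_url else [extractor]
    if needs_full_page:
        # Research across the web: search finds pages, extractor reads them.
        return [search, extractor]
    # Quick factual lookup: search snippets are often sufficient.
    return [search]


print(choose_tools(has_specific_url=False, needs_full_page=False))
print(choose_tools(has_specific_url=True, needs_full_page=True))
```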

Invocation methods

Web extractor is available through three APIs. The Responses API provides the most control over tool behavior, so use it for new integrations.

API	Tool configuration	Streaming required	Notes
Responses API (recommended)	Add `web_search` and `web_extractor` to `tools`	No	Exposes intermediate tool execution status
Chat Completions API	Set `enable_search: true` and `search_options: {"search_strategy": "agent_max"}`	Yes	Non-streaming not supported
DashScope API	Set `enable_search: true` and `search_options: {"search_strategy": "agent_max"}`	Yes	Java SDK not supported

When using qwen3-max-2026-01-23, set enable_thinking to true.

Responses API
Chat Completions API
DashScope API

response = client.responses.create(
  model="qwen3-max-2026-01-23",
  input="Summarize the content at https://example.com/article",
  tools=[
    {"type": "web_search"},
    {"type": "web_extractor"}
  ],
  extra_body={
    "enable_thinking": True
  }
)

completion = client.chat.completions.create(
  model="qwen3-max-2026-01-23",
  messages=[{"role": "user", "content": "Summarize the content at https://example.com/article"}],
  extra_body={
    "enable_thinking": True,
    "enable_search": True,
    "search_options": {"search_strategy": "agent_max"}
  },
  stream=True
)

from dashscope import Generation

response = Generation.call(
  model="qwen3-max-2026-01-23",
  messages=[{"role": "user", "content": "Summarize the content at https://example.com/article"}],
  enable_search=True,
  search_options={"search_strategy": "agent_max"},
  enable_thinking=True,
  result_format="message",
  stream=True,
  incremental_output=True
)
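
Because the Chat Completions API requires streaming, the reply has to be assembled from chunks. The sketch below shows one way to do that, assuming the chunks follow the standard OpenAI SDK shape (chunk.choices[0].delta.content). The helper name collect_stream_text and the sample chunks are hypothetical stand-ins for a live stream.

```python
from types import SimpleNamespace


def collect_stream_text(chunks):
    """Concatenate the text deltas of a Chat Completions stream."""
    parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage-only chunk) have no choices.
        if not getattr(chunk, "choices", None):
            continue
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)


def _chunk(text):
    # Hypothetical stand-in for one streamed chunk.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


sample_chunks = [_chunk("Hello"), _chunk(None), _chunk(" world"), SimpleNamespace(choices=[])]
print(collect_stream_text(sample_chunks))
```

With a real request, pass the completion object from the Chat Completions example in place of sample_chunks.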

Stream web extractor events

For general streaming concepts (SSE protocol, how to enable streaming, and token usage), see Streaming output. This section covers only the event types specific to web extractor.

Web extraction can take time. Use streaming to receive reasoning steps, tool calls, and response text as they happen. The Responses API exposes intermediate execution status for each tool call, making it the best choice for streaming. When streaming with the Responses API, the following event types indicate progress through the extraction and generation pipeline:

Event type	Description
`response.reasoning_summary_text.delta`	Incremental reasoning text from the model's thinking process
`response.output_item.done`	A tool call has completed. Check `item.type` for `web_extractor_call` to get the extraction result
`response.output_text.delta`	Incremental response text
`response.completed`	The response is complete. The `usage` field contains tool invocation counts

To handle web extractor results in a stream, watch for response.output_item.done events whose item.type is web_extractor_call, then read the item's goal and output fields:

# `stream` is the iterator returned by client.responses.create(..., stream=True)
for chunk in stream:
  if chunk.type == 'response.output_item.done':
    if hasattr(chunk, 'item') and chunk.item.type == 'web_extractor_call':
      print(f"Fetched: {chunk.item.goal}")
      print(f"Content: {chunk.item.output}")
  elif chunk.type == 'response.output_text.delta':
    print(chunk.delta, end='', flush=True)
  elif chunk.type == 'response.completed':
    usage = chunk.response.usage
    if hasattr(usage, 'x_tools') and usage.x_tools:
      print(f"Web Extractor invocations: {usage.x_tools.get('web_extractor', {}).get('count', 0)}")
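
The loop above prints events as they arrive. When the results are needed later, a variant can accumulate them instead. The following sketch covers all four event types from the table, including the reasoning deltas. The function consume_events and the sample events are hypothetical; the events are simple stand-ins for what a live stream would yield.

```python
from types import SimpleNamespace


def consume_events(events):
    """Aggregate reasoning text, answer text, extractions, and tool count."""
    result = {"reasoning": "", "text": "", "extractions": [], "extractor_count": 0}
    for event in events:
        if event.type == "response.reasoning_summary_text.delta":
            result["reasoning"] += event.delta
        elif event.type == "response.output_item.done":
            item = getattr(event, "item", None)
            if item is not None and item.type == "web_extractor_call":
                result["extractions"].append({"goal": item.goal, "output": item.output})
        elif event.type == "response.output_text.delta":
            result["text"] += event.delta
        elif event.type == "response.completed":
            x_tools = getattr(event.response.usage, "x_tools", None) or {}
            result["extractor_count"] = x_tools.get("web_extractor", {}).get("count", 0)
    return result


# Hypothetical events standing in for a live Responses API stream.
sample_events = [
    SimpleNamespace(type="response.reasoning_summary_text.delta", delta="Need to read the page. "),
    SimpleNamespace(
        type="response.output_item.done",
        item=SimpleNamespace(
            type="web_extractor_call",
            goal="Read https://example.com/article",
            output="Article text",
        ),
    ),
    SimpleNamespace(type="response.output_text.delta", delta="The article "),
    SimpleNamespace(type="response.output_text.delta", delta="covers X."),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(usage=SimpleNamespace(x_tools={"web_extractor": {"count": 1}})),
    ),
]

summary = consume_events(sample_events)
print(summary["text"])
print(summary["extractor_count"])
```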

Supported models

Model family	Model IDs
Qwen-Max	`qwen3.7-max`, `qwen3.7-max-2026-06-08`, `qwen3.7-max-2026-05-20`, `qwen3-max`, `qwen3-max-2026-01-23` (thinking mode)
Qwen-Plus	`qwen3.7-plus`, `qwen3.6-plus-2026-04-02`, `qwen3.5-plus`, `qwen3.5-plus-2026-04-20`, `qwen3.5-plus-2026-02-15`
Qwen-Flash	`qwen3.5-flash`, `qwen3.5-flash-2026-02-23`
Open-source Qwen	`qwen3.6-35b-a3b`, `qwen3.5-397b-a17b`, `qwen3.5-122b-a10b`, `qwen3.5-27b`, `qwen3.5-35b-a3b`

Limitations

Web extractor retrieves publicly accessible pages only. Pages behind authentication or paywalls return empty content.
Very large pages may be truncated before being added to the context window.
Dynamic content rendered exclusively by JavaScript may not be fully captured.
The extracted content counts as input tokens, which increases both latency and cost for large pages.

Error handling

When extraction fails, the API call does not raise an error. Instead, the web_extractor_call item in the response output returns empty or partial content, and the model generates its response based on whatever context is available. Common failure scenarios:

Scenario	Behavior
URL is unreachable (404, 500, DNS failure)	Extraction returns empty content; model responds using other available context
Page load times out	Partial or empty content returned
Non-HTML content (PDF, images)	Content may not be extracted; model falls back to other tools or general knowledge

To verify whether extraction succeeded, inspect the web_extractor_call items in response.output or check usage.x_tools.web_extractor.count for the number of successful invocations.
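
One way to run that inspection is sketched below. The helper find_empty_extractions is hypothetical and the sample items are stand-ins for response.output. It flags only calls whose output is empty or whitespace; partial content cannot be detected this way and still needs a manual look.

```python
from types import SimpleNamespace


def find_empty_extractions(output_items):
    """Return the goals of web_extractor_call items that came back empty."""
    failed = []
    for item in output_items:
        if getattr(item, "type", None) != "web_extractor_call":
            continue
        output = getattr(item, "output", None)
        # Treat None, "", and whitespace-only output as a failed extraction.
        if not output or not str(output).strip():
            failed.append(getattr(item, "goal", None))
    return failed


# Hypothetical stand-in for response.output.
sample_output = [
    SimpleNamespace(type="web_extractor_call", goal="Read https://example.com/ok", output="Page text"),
    SimpleNamespace(type="web_extractor_call", goal="Read https://example.com/missing", output=""),
    SimpleNamespace(type="message"),
]

for goal in find_empty_extractions(sample_output):
    print(f"Extraction returned no content: {goal}")
```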

Billing

The prices listed below are list prices. For current promotions and discounted pricing, visit the Model Marketplace.

Web extractor costs have three components:

Component	Details
Model cost	Extracted web content is appended to the prompt, which increases input tokens. Billed at the model's standard token price. See Pricing for details.
Web search fee	$10.00 per 1,000 search invocations. Web extractor typically triggers web search internally.
Web extractor fee	Free for a limited time.
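
As a rough worked example of how these components add up, the sketch below estimates the cost of one request. The function estimate_request_cost is hypothetical, and the token prices passed in are placeholders, not actual Qwen Cloud prices; only the $10.00 per 1,000 searches figure comes from the table above. The web extractor fee is treated as zero while it remains free.

```python
SEARCH_FEE_PER_1000 = 10.00  # USD, list price from the billing table


def estimate_request_cost(input_tokens, output_tokens, search_count,
                          input_price_per_million, output_price_per_million):
    """Estimate the cost of one request in USD (illustrative only)."""
    model_cost = (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )
    search_fee = search_count * SEARCH_FEE_PER_1000 / 1000
    extractor_fee = 0.0  # free for a limited time, per the table
    return {
        "model_cost": model_cost,
        "search_fee": search_fee,
        "extractor_fee": extractor_fee,
        "total": model_cost + search_fee + extractor_fee,
    }


# Placeholder token prices (USD per million tokens); check Pricing for real values.
estimate = estimate_request_cost(
    input_tokens=50_000,
    output_tokens=1_000,
    search_count=3,
    input_price_per_million=1.0,
    output_price_per_million=4.0,
)
print(f"Estimated total: ${estimate['total']:.4f}")
```

The input token count should come from the usage reported for the request, since it already includes the extracted page content.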
