For tasks involving mathematical calculations or data analytics, add code_interpreter alongside web extractor to improve accuracy.
Getting started
Call web extractor through the Responses API to summarize a web page. The examples below use web_search and web_extractor with qwen3-max-2026-01-23 in thinking mode.
import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not configured, replace with: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
)

response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Please visit the official Qwen Cloud documentation, find the code interpreter topic and summarize it",
    tools=[
        {"type": "web_search"},
        {"type": "web_extractor"}
    ],
    extra_body={
        "enable_thinking": True
    }
)

# Uncomment to view intermediate output
# print(response.output)
print("=" * 20 + "Response" + "=" * 20)
print(response.output_text)

# Print tool invocation count
usage = response.usage
print("=" * 20 + "Tool Invocation Count" + "=" * 20)
if hasattr(usage, 'x_tools') and usage.x_tools:
    print(f"Web Extractor invocations: {usage.x_tools.get('web_extractor', {}).get('count', 0)}")

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If the environment variable is not configured, replace with: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3-max-2026-01-23",
        input: "Please visit the official Qwen Cloud documentation, find the code interpreter topic and summarize it",
        tools: [
            { type: "web_search" },
            { type: "web_extractor" }
        ],
        enable_thinking: true
    });

    console.log("====================Response====================");
    console.log(response.output_text);

    // Print tool invocation count
    console.log("====================Tool Invocation Count====================");
    if (response.usage && response.usage.x_tools) {
        console.log(`Web Extractor invocations: ${response.usage.x_tools.web_extractor?.count || 0}`);
        console.log(`Web Search invocations: ${response.usage.x_tools.web_search?.count || 0}`);
    }

    // Uncomment to view intermediate output
    // console.log(JSON.stringify(response.output[0], null, 2));
}

main();

curl -X POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-max-2026-01-23",
    "input": "Please visit the official Qwen Cloud documentation, find the code interpreter topic and summarize it",
    "tools": [
      {"type": "web_search"},
      {"type": "web_extractor"}
    ],
    "enable_thinking": true
  }'

Response structure
The response contains the model's generated text plus metadata about tool usage.
| Field | Description |
|---|---|
| output_text | The model's final text response, grounded in the extracted web content |
| output[] | Array of intermediate items, including web_extractor_call objects (each with goal and output fields showing which URL was fetched and the content extracted from it) |
| usage.x_tools.web_extractor.count | Number of web extractor invocations in this request |
| usage.x_tools.web_search.count | Number of web search invocations in this request |
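The fields in the table above can be collected into one summary. The following is a minimal sketch, not an official helper: `summarize_tool_activity` is a hypothetical name, and the stand-in response object built with `SimpleNamespace` only mimics the field layout described in the table so that the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Dict


def summarize_tool_activity(response: Any) -> Dict[str, Any]:
    """Collect the final text, extractor calls, and tool counts from a response."""
    # Keep only the intermediate items produced by the web extractor.
    extractor_calls = [
        {"goal": getattr(item, "goal", None), "output": getattr(item, "output", None)}
        for item in (getattr(response, "output", None) or [])
        if getattr(item, "type", None) == "web_extractor_call"
    ]
    # usage.x_tools is read as a dict, as in the Getting started example.
    usage = getattr(response, "usage", None)
    x_tools = getattr(usage, "x_tools", None) or {}
    return {
        "text": getattr(response, "output_text", ""),
        "extractor_calls": extractor_calls,
        "extractor_count": x_tools.get("web_extractor", {}).get("count", 0),
        "search_count": x_tools.get("web_search", {}).get("count", 0),
    }


# Stand-in response (assumption: shaped like the table above, values made up).
fake_response = SimpleNamespace(
    output_text="Summary of the page.",
    output=[
        SimpleNamespace(type="web_extractor_call", goal="https://example.com/article", output="Page text"),
        SimpleNamespace(type="message"),
    ],
    usage=SimpleNamespace(x_tools={"web_extractor": {"count": 1}, "web_search": {"count": 2}}),
)
summary = summarize_tool_activity(fake_response)
print(summary["extractor_count"], summary["search_count"])
```

With a real response, pass the object returned by `client.responses.create(...)` instead of `fake_response`.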
How it works
- Include web_extractor (and typically web_search) in the tools array of your API request, along with a prompt that references a URL or topic.
- The model determines which pages to fetch, retrieves their content, and appends it to the context as additional input tokens.
- The model generates a response grounded in the retrieved content.
Extracted web content increases input tokens and affects billing. See Billing for details.
| Scenario | Tool configuration | Why |
|---|---|---|
| Answer questions about a specific URL | web_extractor (optionally with web_search) | The model fetches and reads the full page content, not just a search snippet |
| Research a topic across the web | web_search + web_extractor | web_search finds relevant pages; web_extractor reads them in full |
| Quick factual lookup (no specific URL) | web_search alone | Search snippets are often sufficient; cheaper and faster |
Use web_extractor when you need the model to read the actual content of a page -- not just a search-result summary.
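The scenario table can be expressed as a small lookup that builds the tools array for a request. This is only an illustrative sketch: the scenario keys (`specific_url`, `research`, `quick_lookup`) and the `tools_for` function are hypothetical names, not part of the API.

```python
from typing import Dict, List

# Hypothetical scenario names mapped to the tool types from the table above.
TOOLS_BY_SCENARIO: Dict[str, List[str]] = {
    "specific_url": ["web_extractor"],            # web_search is optional here
    "research": ["web_search", "web_extractor"],  # find pages, then read them in full
    "quick_lookup": ["web_search"],               # search snippets are often enough
}


def tools_for(scenario: str, include_search: bool = False) -> List[Dict[str, str]]:
    """Build the `tools` array for a scenario, optionally adding web_search."""
    names = list(TOOLS_BY_SCENARIO[scenario])
    if include_search and "web_search" not in names:
        names.insert(0, "web_search")
    return [{"type": name} for name in names]


print(tools_for("research"))
```

The returned list can be passed directly as the `tools` argument of a Responses API request.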
Invocation methods
Web extractor supports three APIs. The Responses API provides the most control over tool behavior -- use it for new integrations.
| API | Tool configuration | Streaming required | Notes |
|---|---|---|---|
| Responses API (recommended) | Add web_search and web_extractor to tools | No | Exposes intermediate tool execution status |
| Chat Completions API | Set enable_search: true and search_options.search_strategy: "agent_max" | Yes | Non-streaming calls are not supported |
| DashScope API | Set enable_search: true and search_options.search_strategy: "agent_max" | Yes | The Java SDK is not supported |
When using qwen3-max-2026-01-23, set enable_thinking to true.
# Responses API
response = client.responses.create(
    model="qwen3-max-2026-01-23",
    input="Summarize the content at https://example.com/article",
    tools=[
        {"type": "web_search"},
        {"type": "web_extractor"}
    ],
    extra_body={
        "enable_thinking": True
    }
)

# Chat Completions API (streaming is required)
completion = client.chat.completions.create(
    model="qwen3-max-2026-01-23",
    messages=[{"role": "user", "content": "Summarize the content at https://example.com/article"}],
    extra_body={
        "enable_thinking": True,
        "enable_search": True,
        "search_options": {"search_strategy": "agent_max"}
    },
    stream=True
)

# DashScope API (streaming is required)
from dashscope import Generation

response = Generation.call(
    model="qwen3-max-2026-01-23",
    messages=[{"role": "user", "content": "Summarize the content at https://example.com/article"}],
    enable_search=True,
    search_options={"search_strategy": "agent_max"},
    enable_thinking=True,
    result_format="message",
    stream=True,
    incremental_output=True
)

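Because the Chat Completions call above must be streamed, the returned object has to be iterated to obtain any text. The sketch below shows one way to do that. It assumes the standard OpenAI chunk layout (`choices[0].delta.content`) and additionally assumes that thinking-mode text arrives in a `reasoning_content` field on the delta; treat that field name as an assumption and verify it against the chunks you actually receive. `collect_chat_stream` is a hypothetical helper name.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Tuple


def collect_chat_stream(stream: Iterable[Any]) -> Tuple[str, str]:
    """Join streamed deltas into (reasoning_text, answer_text)."""
    reasoning_parts, answer_parts = [], []
    for chunk in stream:
        choices = getattr(chunk, "choices", None)
        if not choices:  # e.g. a trailing chunk that carries only usage data
            continue
        delta = choices[0].delta
        # Assumed field name for thinking-mode output; skipped when absent.
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        content = getattr(delta, "content", None)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)


def _chunk(**delta_fields: Any) -> SimpleNamespace:
    """Build a stand-in chunk so the example runs without a network call."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**delta_fields))])


fake_stream = [
    _chunk(reasoning_content="Fetching the page. ", content=None),
    _chunk(content="The article "),
    _chunk(content="covers X."),
    SimpleNamespace(choices=[]),
]
reasoning_text, answer_text = collect_chat_stream(fake_stream)
print(answer_text)
```

In real use, pass the `completion` object from the Chat Completions example instead of `fake_stream`.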
For general streaming concepts (SSE protocol, how to enable streaming, and token usage), see Streaming output. This section covers only the event types specific to web extractor.
Web extraction can take time. Use streaming to receive reasoning steps, tool calls, and response text as they happen. The Responses API exposes intermediate execution status for each tool call, making it the best choice for streaming.
When streaming with the Responses API, the following event types indicate progress through the extraction and generation pipeline:
| Event type | Description |
|---|---|
| response.reasoning_summary_text.delta | Incremental reasoning text from the model's thinking process |
| response.output_item.done | A tool call has completed. Check item.type for web_extractor_call to get the extraction result |
| response.output_text.delta | Incremental response text |
| response.completed | The response is complete. The usage field contains tool invocation counts |
To handle a completed extraction in a stream, check for the response.output_item.done event, confirm that item.type is web_extractor_call, and read the item's goal and output fields:
# `stream` is the iterator returned by client.responses.create(..., stream=True)
for chunk in stream:
    if chunk.type == 'response.output_item.done':
        # A tool call finished; report web extractor results only
        if hasattr(chunk, 'item') and chunk.item.type == 'web_extractor_call':
            print(f"Fetched: {chunk.item.goal}")
            print(f"Content: {chunk.item.output}")
    elif chunk.type == 'response.output_text.delta':
        # Print response text as it arrives
        print(chunk.delta, end='', flush=True)
    elif chunk.type == 'response.completed':
        usage = chunk.response.usage
        if hasattr(usage, 'x_tools') and usage.x_tools:
            print(f"Web Extractor invocations: {usage.x_tools.get('web_extractor', {}).get('count', 0)}")

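The handler above iterates over a `stream` object without showing how it is created. A minimal sketch of that step follows, using the `stream=True` option of the OpenAI SDK's `responses.create`. `open_extractor_stream` is a hypothetical wrapper name; the model, tools, and `enable_thinking` setting are copied from the earlier examples.

```python
from typing import Any, Iterable


def open_extractor_stream(
    client: Any,
    prompt: str,
    model: str = "qwen3-max-2026-01-23",
) -> Iterable[Any]:
    """Start a streaming Responses API request with both web tools enabled."""
    return client.responses.create(
        model=model,
        input=prompt,
        tools=[{"type": "web_search"}, {"type": "web_extractor"}],
        extra_body={"enable_thinking": True},
        stream=True,  # yield events as they occur instead of one final response
    )
```

Usage, assuming `client` is the OpenAI client from Getting started: `stream = open_extractor_stream(client, "Summarize the content at https://example.com/article")`, followed by the event loop shown above.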
Supported models
| Model family | Model IDs |
|---|---|
| Qwen-Max | qwen3-max, qwen3-max-2026-01-23 (thinking mode) |
| Qwen-Plus | qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-02-15 |
| Qwen-Flash | qwen3.5-flash, qwen3.5-flash-2026-02-23 |
| Open-source Qwen | qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b |
Limitations
- Web extractor retrieves publicly accessible pages only. Pages behind authentication or paywalls return empty content.
- Very large pages may be truncated before being added to the context window.
- Dynamic content rendered exclusively by JavaScript may not be fully captured.
- The extracted content counts as input tokens, which increases both latency and cost for large pages.
Error handling
When extraction fails, the API does not raise an error. Instead, the web_extractor_call item in the response output contains empty or partial content, and the model generates its response from whatever context is available.
Common failure scenarios:
| Scenario | Behavior |
|---|---|
| URL is unreachable (404, 500, DNS failure) | Extraction returns empty content; model responds using other available context |
| Page load times out | Partial or empty content returned |
| Non-HTML content (PDF, images) | Content may not be extracted; model falls back to other tools or general knowledge |
To verify whether extraction succeeded, inspect the output field of each web_extractor_call item in response.output. usage.x_tools.web_extractor.count shows how many times the tool was invoked in the request.
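Since a failed extraction surfaces as empty content rather than an exception, the check has to be made explicitly. The sketch below flags web_extractor_call items whose output is missing or blank. `find_empty_extractions` is a hypothetical helper, and it cannot detect partial content, which looks like a normal non-empty result.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Optional


def find_empty_extractions(output_items: Optional[Iterable[Any]]) -> List[Optional[str]]:
    """Return the goal of every web_extractor_call item that returned no content."""
    empty_goals = []
    for item in output_items or []:
        if getattr(item, "type", None) != "web_extractor_call":
            continue
        output = getattr(item, "output", None)
        # Treat None, empty containers, and whitespace-only strings as failed.
        is_blank = not output or (isinstance(output, str) and not output.strip())
        if is_blank:
            empty_goals.append(getattr(item, "goal", None))
    return empty_goals


# Stand-in items (assumption: same goal/output fields as described earlier).
items = [
    SimpleNamespace(type="web_extractor_call", goal="https://example.com/ok", output="Page text"),
    SimpleNamespace(type="web_extractor_call", goal="https://example.com/missing", output=""),
    SimpleNamespace(type="web_extractor_call", goal="https://example.com/blank", output="  \n"),
    SimpleNamespace(type="message"),
]
failed = find_empty_extractions(items)
print(failed)
```

With a real response, call `find_empty_extractions(response.output)` and decide whether to retry, warn the user, or accept the answer.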
Billing
Web extractor costs have three components:
| Component | Details |
|---|---|
| Model cost | Extracted web content is appended to the prompt, which increases input tokens. Billed at the model's standard token price. See Pricing for details. |
| Web search fee | $10.00 per 1,000 search invocations. Web extractor typically triggers web search internally. |
| Web extractor fee | Free for a limited time. |
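As a worked example of the table above, the sketch below adds the three components for one request. Only the search fee ($10.00 per 1,000 invocations) and the currently free extractor fee come from this page. The per-million-token prices are placeholder parameters and the numbers passed in are made up for illustration, so take real prices from the Pricing page.

```python
SEARCH_FEE_PER_1000_USD = 10.00   # from the billing table
EXTRACTOR_FEE_PER_CALL_USD = 0.0  # free for a limited time, per the billing table


def estimate_request_cost(
    search_calls: int,
    extractor_calls: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> dict:
    """Break a request's estimated cost (USD) into model, search, and extractor parts."""
    model_cost = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    search_fee = search_calls * SEARCH_FEE_PER_1000_USD / 1000
    extractor_fee = extractor_calls * EXTRACTOR_FEE_PER_CALL_USD
    return {
        "model": model_cost,
        "web_search": search_fee,
        "web_extractor": extractor_fee,
        "total": model_cost + search_fee + extractor_fee,
    }


# Hypothetical request: 3 searches, 2 extractions, placeholder token prices.
cost = estimate_request_cost(
    search_calls=3,
    extractor_calls=2,
    input_tokens=20_000,   # includes the extracted page content
    output_tokens=1_000,
    input_price_per_million=1.0,   # placeholder, not a real price
    output_price_per_million=4.0,  # placeholder, not a real price
)
print(f"Estimated total: ${cost['total']:.4f}")
```

The search and extractor counts can be read from usage.x_tools on the response. Note that extracted page content is already counted in the request's input tokens.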