Connect to MCP servers

Use external tools in chat

The Model Context Protocol (MCP) lets large language models use external tools and data. Compared with function calling, MCP offers greater flexibility and ease of use. This topic describes how to connect to MCP servers through the Responses API: add the MCP server information to the tools parameter of the request.
Only MCP servers that use the SSE protocol are supported, and each request can include at most 10 MCP servers.

Getting started

This example uses the Fetch web scraping MCP service from ModelScope. The SSE endpoint and authentication information for the service are shown in the Service configuration section on the right. Before you run the code, get an API key and set it as the DASHSCOPE_API_KEY environment variable.
In the code, replace server_url with the SSE endpoint from the MCP service platform. If the service requires authentication, add the token provided by that platform to headers.
import os
from openai import OpenAI

client = OpenAI(
  # If no environment variable, use: api_key="sk-xxx" (not recommended).
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

# MCP tool configuration
# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
mcp_tool = {
  "type": "mcp",
  "server_protocol": "sse",
  "server_label": "fetch",
  "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
  "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse",
}

response = client.responses.create(
  model="qwen3.7-plus",
  input="https://news.aibase.com/zh/news, what is the AI news today?",
  tools=[mcp_tool]
)

print("[Model Response]")
print(response.output_text)
print(f"\n[Token Usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
Running the code returns a response similar to the following:
[Model Response]
Based on the documentation for the Model Context Protocol (MCP) on the Qwen Cloud help center, the supported models are:

*   Qwen Max:
    *   Qwen3.7-Max series
*   Qwen Plus:
    *   Qwen3.6-Plus series
    *   Qwen3.5-Plus series
*   Qwen Flash:
    *   Qwen3.6-Flash series
    *   Qwen3.5-Flash series
*   Qwen3.6 Open Source Series (excluding qwen3.6-27b)
*   Qwen3.5 Open Source Series

Note: The documentation specifies that MCP is only supported via the Responses API (client.responses.create) and not the standard Chat Completions API.

[Token Usage] Input: 20583, Output: 1638, Total: 22221
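
The example above omits headers. If the MCP platform requires a token, one possible way to attach it is sketched below. The helper name build_mcp_tool, the environment variable MCP_SERVER_TOKEN, and the Bearer scheme are assumptions made for illustration; check the Service configuration of your platform for the exact header it expects.

```python
import os


def build_mcp_tool(server_url, token=None):
    """Return an MCP tool dict; add an Authorization header only when a token is given."""
    tool = {
        "type": "mcp",
        "server_protocol": "sse",
        "server_label": "fetch",
        "server_description": "Fetch MCP Server that provides web scraping capabilities.",
        "server_url": server_url,
    }
    if token:
        # Assumption: the platform accepts a Bearer token in the Authorization header.
        tool["headers"] = {"Authorization": f"Bearer {token}"}
    return tool


# MCP_SERVER_TOKEN is a hypothetical variable name, not one defined by the platform.
mcp_tool = build_mcp_tool(
    "https://mcp.api-inference.modelscope.net/xxx/sse",
    token=os.getenv("MCP_SERVER_TOKEN"),
)
```

The resulting mcp_tool can be passed as tools=[mcp_tool] in the same way as in the Getting started example.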

Streaming output

MCP tool calls may involve multiple interactions with external services. Enable streaming to receive intermediate results in real time.
import os
from openai import OpenAI

client = OpenAI(
  # If no environment variable, use: api_key="sk-xxx" (not recommended).
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
mcp_tool = {
  "type": "mcp",
  "server_protocol": "sse",
  "server_label": "fetch",
  "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
  "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse",
}

stream = client.responses.create(
  model="qwen3.7-plus",
  input="https://news.aibase.com/zh/news, what is the AI news today?",
  tools=[mcp_tool],
  stream=True
)

for event in stream:
  # The model response starts
  if event.type == "response.content_part.added":
    print("[Model Response]")
  # Streaming text output
  elif event.type == "response.output_text.delta":
    print(event.delta, end="", flush=True)
  # The response is complete, output the usage
  elif event.type == "response.completed":
    usage = event.response.usage
    print(f"\n\n[Token Usage] Input: {usage.input_tokens}, Output: {usage.output_tokens}, Total: {usage.total_tokens}")
Running the code returns a response similar to the following:
[Model Response]
Based on the documentation page for MCP on Qwen Cloud, the following models are supported:

*   Qwen Max Series: Qwen3.7-Max series
*   Qwen Plus Series: Qwen3.6-Plus series and Qwen3.5-Plus series
*   Qwen Flash Series: Qwen3.6-Flash series and Qwen3.5-Flash series
*   Qwen3.6 Open Source Series (excluding qwen3.6-27b)
*   Qwen3.5 Open Source Series

Note: These models support MCP functionality only via the Responses API.

[Token Usage] Input: 20472, Output: 945, Total: 21417
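
If you need the full text after streaming finishes, rather than printing each delta, the event loop can be moved into a small function. The sketch below uses only the event types shown in the streaming example. The function name collect_stream is hypothetical, and the stand-in events and their token counts are invented so that the logic can be tried without a network call.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate streamed text deltas and return (full_text, usage)."""
    parts, usage = [], None
    for event in events:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
        elif event.type == "response.completed":
            usage = event.response.usage
        # Other event types are ignored in this sketch.
    return "".join(parts), usage


# Stand-in events for a dry run; a real stream comes from
# client.responses.create(..., stream=True).
fake_stream = [
    SimpleNamespace(type="response.content_part.added"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello, "),
    SimpleNamespace(type="response.output_text.delta", delta="world"),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(
            usage=SimpleNamespace(input_tokens=3, output_tokens=2, total_tokens=5)
        ),
    ),
]

text, usage = collect_stream(fake_stream)
```

With a real request, call collect_stream(stream) on the object returned by client.responses.create.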

Parameters

The mcp tool supports the following parameters:
| Parameter | Required | Description |
| --- | --- | --- |
| type | Yes | Set to "mcp". |
| server_protocol | Yes | The communication protocol with the MCP server. Currently, only "sse" is supported. |
| server_label | Yes | The label name of the MCP server, used to identify the service. |
| server_description | No | A description of the MCP server's features. This helps the model understand the service's capabilities and scenarios. Filling in this parameter is recommended to improve the accuracy of model calls. |
| server_url | Yes | The endpoint URL of the MCP server. |
| headers | No | The request headers to include when connecting to the MCP server, such as authentication information like Authorization. |
Example:
{
  "type": "mcp",
  "server_protocol": "sse",
  "server_label": "fetch",
  "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
  "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse"
}
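
The rules in this topic (required parameters, SSE only, at most 10 MCP servers per request) can be checked on the client before a request is sent. The following is a minimal sketch under those stated rules. The function name validate_mcp_tools and its error messages are hypothetical, and the service may apply additional checks that are not covered here.

```python
MAX_MCP_SERVERS = 10  # per-request limit stated in this topic
REQUIRED_FIELDS = ("type", "server_protocol", "server_label", "server_url")


def validate_mcp_tools(tools):
    """Raise ValueError if a list of MCP tool dicts breaks the documented rules."""
    if len(tools) > MAX_MCP_SERVERS:
        raise ValueError(
            f"at most {MAX_MCP_SERVERS} MCP servers per request, got {len(tools)}"
        )
    for index, tool in enumerate(tools):
        missing = [name for name in REQUIRED_FIELDS if not tool.get(name)]
        if missing:
            raise ValueError(f"tool {index} is missing required fields: {missing}")
        if tool["type"] != "mcp":
            raise ValueError(f"tool {index}: type must be 'mcp'")
        if tool["server_protocol"] != "sse":
            raise ValueError(f"tool {index}: only the 'sse' protocol is supported")


validate_mcp_tools([
    {
        "type": "mcp",
        "server_protocol": "sse",
        "server_label": "fetch",
        "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse",
    }
])
```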

Supported models

MCP is available through the Responses API only.
  • Qwen-Max: Qwen3.7-Max series
  • Qwen-Plus: Qwen3.6-Plus series, Qwen3.5-Plus series
  • Qwen-Flash: Qwen3.6-Flash series, Qwen3.5-Flash series
  • Qwen3.6 open-source series (excluding qwen3.6-27b)
  • Qwen3.5 open-source series

Billing

Two types of charges apply when using MCP:
  • Model inference fees: Billed based on the model's token usage.
  • MCP server fees: Subject to the billing rules of each MCP server provider.
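
As a rough illustration of how the two charges combine, the sketch below estimates the inference fee from the token counts in the Getting started sample output. The prices are placeholders and not real rates, and the calculation assumes simple per-token pricing, which may not match how a given model is billed. The function name estimate_inference_cost is hypothetical.

```python
def estimate_inference_cost(input_tokens, output_tokens,
                            input_price_per_million, output_price_per_million):
    """Estimate the inference fee from token usage and per-million-token prices."""
    return (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000


# Token counts come from the Getting started sample output.
# The prices are PLACEHOLDERS; look up the actual pricing for your model.
inference_fee = estimate_inference_cost(
    20583, 1638, input_price_per_million=1.0, output_price_per_million=2.0
)

# MCP server fees are set by each provider; 0.0 is a placeholder.
mcp_server_fee = 0.0

total = inference_fee + mcp_server_fee
print(f"Estimated total: {total:.6f}")
```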