Connect to MCP servers

The Model Context Protocol (MCP) lets large language models use external tools and data. Compared with function calling, MCP is more flexible and easier to use, because an MCP server publishes its own tool definitions instead of requiring you to define each tool in the request. This topic describes how to connect to MCP servers through the Responses API by adding the MCP server information to the tools parameter.

Only MCP servers that use the SSE protocol are supported. Each request can include at most 10 MCP servers.
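
Because of these two limits, it can help to check the tool list on the client before sending a request. The following is a minimal sketch and not part of the API: `MAX_MCP_SERVERS` and `check_mcp_tools` are hypothetical names, and the checks only mirror the limits stated above.

```python
MAX_MCP_SERVERS = 10  # per-request limit stated in this topic


def check_mcp_tools(tools):
    """Validate a list of tool dicts before passing it as `tools`.

    Hypothetical helper: raises ValueError if the list contains more
    than 10 MCP servers or an MCP server that does not use SSE.
    Tools of other types are ignored.
    """
    mcp_tools = [t for t in tools if t.get("type") == "mcp"]
    if len(mcp_tools) > MAX_MCP_SERVERS:
        raise ValueError(
            f"{len(mcp_tools)} MCP servers given; "
            f"at most {MAX_MCP_SERVERS} are allowed per request"
        )
    for tool in mcp_tools:
        if tool.get("server_protocol") != "sse":
            raise ValueError(
                f"MCP server {tool.get('server_label')!r} "
                "must use server_protocol 'sse'"
            )
    return tools
```

Calling `check_mcp_tools([mcp_tool])` before `client.responses.create(...)` surfaces a configuration mistake locally instead of as a failed request.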

Getting started

This example uses the Fetch web scraping MCP service from ModelScope. You can get the SSE endpoint and authentication information for the service from the Service configuration section on the right of the service page. You also need an API key, configured as the DASHSCOPE_API_KEY environment variable.

In the examples, replace server_url with the SSE endpoint from the MCP service platform. If the server requires authentication, add the token provided by that platform to headers.

The request is shown in Python, Node.js, and curl, in that order.

Python

import os
from openai import OpenAI

client = OpenAI(
  # If no environment variable, use: api_key="sk-xxx" (not recommended).
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

# MCP tool configuration
# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
mcp_tool = {
  "type": "mcp",
  "server_protocol": "sse",
  "server_label": "fetch",
  "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
  "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse",
}

response = client.responses.create(
  model="qwen3.7-plus",
  input="https://news.aibase.com/zh/news, what is the AI news today?",
  tools=[mcp_tool]
)

print("[Model Response]")
print(response.output_text)
print(f"\n[Token Usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
  // If no environment variable, use: apiKey: "sk-xxx" (not recommended).
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});

async function main() {
  // MCP tool configuration
  // Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
  // If authentication is required, add the token from the corresponding platform to the headers
  const mcpTool = {
    type: "mcp",
    server_protocol: "sse",
    server_label: "fetch",
    server_description: "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
    server_url: "https://mcp.api-inference.modelscope.net/xxx/sse",
  };

  const response = await openai.responses.create({
    model: "qwen3.7-plus",
    input: "https://news.aibase.com/zh/news, what is the AI news today?",
    tools: [mcpTool]
  });

  console.log("[Model Response]");
  console.log(response.output_text);
  console.log(`\n[Token Usage] Input: ${response.usage.input_tokens}, Output: ${response.usage.output_tokens}, Total: ${response.usage.total_tokens}`);
}

main();

curl

# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "qwen3.7-plus",
  "input": "https://news.aibase.com/zh/news, what is the AI news today?",
  "tools": [
    {
      "type": "mcp",
      "server_protocol": "sse",
      "server_label": "fetch",
      "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
      "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse"
    }
  ]
}'

After you run the code, a response similar to the following is returned. The content depends on the page that the MCP server fetches and on the model.

[Model Response]
Based on the documentation for the Model Context Protocol (MCP) on the Qwen Cloud help center, the supported models are:

*   Qwen Max:
    *   Qwen3.7-Max series
*   Qwen Plus:
    *   Qwen3.6-Plus series
    *   Qwen3.5-Plus series
*   Qwen Flash:
    *   Qwen3.6-Flash series
    *   Qwen3.5-Flash series
*   Qwen3.6 Open Source Series (excluding qwen3.6-27b)
*   Qwen3.5 Open Source Series

Note: The documentation specifies that MCP is only supported via the Responses API (client.responses.create) and not the standard Chat Completions API.

[Token Usage] Input: 20583, Output: 1638, Total: 22221
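
Model inference is billed by token usage, so it can be useful to accumulate the usage figures across several requests. The sketch below is illustrative only: `UsageTotals` is a hypothetical helper, and it relies only on the `input_tokens`, `output_tokens`, and `total_tokens` fields printed in the example above.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running token totals across several Responses API calls."""

    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0

    def add(self, usage):
        """Add one response's usage object to the running totals."""
        self.input_tokens += usage.input_tokens
        self.output_tokens += usage.output_tokens
        self.total_tokens += usage.total_tokens
        return self
```

For example, create `totals = UsageTotals()` once and call `totals.add(response.usage)` after each request.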

Streaming output

An MCP tool call can involve several round trips to external services, so a complete response may take some time. Set stream to true to receive the output incrementally as it is generated.

The request is shown in Python, Node.js, and curl, in that order.

Python

import os
from openai import OpenAI

client = OpenAI(
  # If no environment variable, use: api_key="sk-xxx" (not recommended).
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
mcp_tool = {
  "type": "mcp",
  "server_protocol": "sse",
  "server_label": "fetch",
  "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
  "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse",
}

stream = client.responses.create(
  model="qwen3.7-plus",
  input="https://news.aibase.com/zh/news, what is the AI news today?",
  tools=[mcp_tool],
  stream=True
)

for event in stream:
  # The model response starts
  if event.type == "response.content_part.added":
    print("[Model Response]")
  # Streaming text output
  elif event.type == "response.output_text.delta":
    print(event.delta, end="", flush=True)
  # The response is complete, output the usage
  elif event.type == "response.completed":
    usage = event.response.usage
    print(f"\n\n[Token Usage] Input: {usage.input_tokens}, Output: {usage.output_tokens}, Total: {usage.total_tokens}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
  // If no environment variable, use: apiKey: "sk-xxx" (not recommended).
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
});

async function main() {
  // Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
  // If authentication is required, add the token from the corresponding platform to the headers
  const mcpTool = {
    type: "mcp",
    server_protocol: "sse",
    server_label: "fetch",
    server_description: "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
    server_url: "https://mcp.api-inference.modelscope.net/xxx/sse",
  };

  const stream = await openai.responses.create({
    model: "qwen3.7-plus",
    input: "https://news.aibase.com/zh/news, what is the AI news today?",
    tools: [mcpTool],
    stream: true
  });

  for await (const event of stream) {
    // The model response starts
    if (event.type === "response.content_part.added") {
      console.log("[Model Response]");
    }
    // Streaming text output
    else if (event.type === "response.output_text.delta") {
      process.stdout.write(event.delta);
    }
    // The response is complete, output the usage
    else if (event.type === "response.completed") {
      const usage = event.response.usage;
      console.log(`\n\n[Token Usage] Input: ${usage.input_tokens}, Output: ${usage.output_tokens}, Total: ${usage.total_tokens}`);
    }
  }
}

main();

curl

# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "qwen3.7-plus",
  "input": "https://news.aibase.com/zh/news, what is the AI news today?",
  "tools": [
    {
      "type": "mcp",
      "server_protocol": "sse",
      "server_label": "fetch",
      "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
      "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse"
    }
  ],
  "stream": true
}'

After you run the code, a response similar to the following is returned. The content depends on the page that the MCP server fetches and on the model.

[Model Response]
Based on the documentation page for MCP on Qwen Cloud, the following models are supported:

*   Qwen Max Series: Qwen3.7-Max series
*   Qwen Plus Series: Qwen3.6-Plus series and Qwen3.5-Plus series
*   Qwen Flash Series: Qwen3.6-Flash series and Qwen3.5-Flash series
*   Qwen3.6 Open Source Series (excluding qwen3.6-27b)
*   Qwen3.5 Open Source Series

Note: These models support MCP functionality only via the Responses API.

[Token Usage] Input: 20472, Output: 945, Total: 21417
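
If you need the full text after the stream ends rather than printing each delta, the event loop from the streaming example can be factored into a small function. This is a sketch under the assumption that only the event types used in the example matter (`response.output_text.delta` and `response.completed`); `collect_stream` is a hypothetical name, and all other event types are ignored.

```python
def collect_stream(events):
    """Consume stream events and return (full_text, usage).

    `events` is any iterable of objects with a `type` attribute, such
    as the stream returned by client.responses.create(..., stream=True).
    `usage` is None if no "response.completed" event was received.
    """
    parts = []
    usage = None
    for event in events:
        if event.type == "response.output_text.delta":
            # Each delta carries the next fragment of the model's text.
            parts.append(event.delta)
        elif event.type == "response.completed":
            # The final event carries the token usage for the request.
            usage = event.response.usage
    return "".join(parts), usage
```

With the streaming example above, `text, usage = collect_stream(stream)` would replace the `for event in stream:` loop.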

Parameters

The mcp tool supports the following parameters:

Parameter	Required	Description
`type`	Yes	Set to `"mcp"`.
`server_protocol`	Yes	The protocol used to communicate with the MCP server. Currently, only `"sse"` is supported.
`server_label`	Yes	A label that identifies the MCP server.
`server_description`	No	A description of what the MCP server does. It helps the model understand the server's capabilities and when to use them. Recommended, because it improves the accuracy of the model's tool calls.
`server_url`	Yes	The endpoint URL of the MCP server.
`headers`	No	The request headers to send when connecting to the MCP server, such as an `Authorization` header for authentication.

Example:

{
  "type": "mcp",
  "server_protocol": "sse",
  "server_label": "fetch",
  "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
  "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse"
}
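
The example above omits `headers`. The following sketch builds the same configuration from the parameters in the table and adds `headers` only when given. `make_mcp_tool` is a hypothetical helper. The sketch also assumes that `headers` is a mapping of header names to values and that the server accepts a `Bearer` token, so check your MCP platform for the exact header it expects.

```python
def make_mcp_tool(server_label, server_url, server_description=None, headers=None):
    """Build an MCP tool configuration dict for the `tools` parameter."""
    tool = {
        "type": "mcp",
        "server_protocol": "sse",  # only "sse" is currently supported
        "server_label": server_label,
        "server_url": server_url,
    }
    # Optional parameters are included only when provided.
    if server_description:
        tool["server_description"] = server_description
    if headers:
        # Assumption: a plain mapping of header name to header value.
        tool["headers"] = dict(headers)
    return tool


mcp_tool = make_mcp_tool(
    "fetch",
    "https://mcp.api-inference.modelscope.net/xxx/sse",
    server_description="Fetch MCP Server that provides web scraping capabilities.",
    headers={"Authorization": "Bearer <your-token>"},  # placeholder token
)
```

The resulting `mcp_tool` can be passed as `tools=[mcp_tool]`, as in the earlier examples.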

Supported models

MCP is available through the Responses API only.

Qwen-Max: Qwen3.7-Max series
Qwen-Plus: Qwen3.6-Plus series, Qwen3.5-Plus series
Qwen-Flash: Qwen3.6-Flash series, Qwen3.5-Flash series
Qwen3.6 open-source series (excluding qwen3.6-27b)
Qwen3.5 open-source series

Billing

Two types of charges apply when using MCP:

Model inference fees: Billed based on the model's token usage.
MCP server fees: Subject to the billing rules of each MCP server provider.