
OpenAI compatibility

Migrate from OpenAI by changing three parameters: base_url, api_key, and model.

Qwen Cloud provides OpenAI-compatible APIs. If you have existing code that uses the OpenAI SDK or REST API, you can switch to Qwen models by changing three parameters: base_url, api_key, and model.

Quick migration

Python example using the OpenAI SDK:
import os
from openai import OpenAI

# Point the OpenAI SDK at the DashScope compatible-mode endpoint.
client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
Before you begin, get an API key and set it as an environment variable. If you use the OpenAI SDK, install it.

Supported APIs

API | Base URL (for SDK) | Description
Chat Completions | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Text generation, vision, function calling
Responses | https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 | Built-in tools, simplified multi-turn
Embedding | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Text embeddings
File | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | File upload and management
Batch | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Asynchronous bulk processing at 50% cost
Conversations | https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 | Auto-managed multi-turn context
The Responses API and Conversations API use a different base_url from the other four APIs. Make sure you set the correct base_url for the API you are calling.

Chat Completions

The Chat Completions API (/v1/chat/completions) is largely compatible with OpenAI's Chat API. The key differences are listed below.

Qwen-specific parameters

These parameters are not part of the OpenAI standard. In the OpenAI Python SDK, pass them via extra_body.

Parameter | Type | Description
enable_thinking | Boolean | Enable deep reasoning mode. Limited model support.
thinking_budget | Integer | Max tokens for the thinking process.
enable_search | Boolean | Enable web search. Replaces OpenAI's web_search_options.
search_options | Object | Configure search behavior (strategy, forced search, etc.).
top_k | Integer | Sampling candidate set size. Range: (0, 100].
vl_high_resolution_images | Boolean | Enable high-resolution mode for vision models.
enable_code_interpreter | Boolean | Enable code interpreter.
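As a sketch of how these parameters travel: with the OpenAI Python SDK, anything passed via extra_body is merged as top-level keys into the request body; over raw HTTP they are ordinary JSON fields. The snippet below builds the resulting JSON body by hand so the shape is visible; the budget value is illustrative.

```python
import json

# What the wire payload looks like when the SDK call includes
# extra_body={"enable_thinking": True, "thinking_budget": 4096}.
# Standard OpenAI fields and Qwen-specific fields sit side by side.
body = {
    "model": "qwen3.6-plus",
    "messages": [{"role": "user", "content": "Explain quantum tunneling briefly."}],
    # Qwen-specific keys, merged in from extra_body:
    "enable_thinking": True,   # deep reasoning mode (limited model support)
    "thinking_budget": 4096,   # cap on tokens spent on the thinking process
}
print(json.dumps(body, indent=2))
```

With the SDK, the equivalent call is client.chat.completions.create(model=..., messages=..., extra_body={"enable_thinking": True, "thinking_budget": 4096}).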

Behavioral differences

  • response_format supports json_object only (no json_schema).
  • tool_choice supports auto, none, and specific function object ({"type": "function", "function": {"name": "..."}}). The required value is not supported.
  • tools supports function type only.
  • parallel_tool_calls defaults to false (OpenAI defaults to true).
  • n supports 1-4 and is limited to specific models (qwen-plus, qwen-plus-character).
  • web_search_options is not supported. Use extra_body.enable_search and extra_body.search_options instead.
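To make the tool-calling differences concrete, here is a sketch of a request body that stays within the supported subset: tools entries of type function only, and tool_choice as a specific function object (the get_weather function is illustrative).

```python
# tool_choice accepts "auto", "none", or a specific function object;
# "required" is rejected. tools entries must have type "function".
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

body = {
    "model": "qwen3.6-plus",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
    # Force this specific function rather than letting the model decide:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    # Defaults to false here (OpenAI defaults to true), so enable explicitly
    # if you want multiple tool calls per turn:
    "parallel_tool_calls": True,
}
```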

Unsupported parameters

The following parameters are silently ignored: frequency_penalty, logit_bias, max_completion_tokens, metadata, prediction, prompt_cache_key, reasoning_effort, service_tier, store, verbosity. For the full API reference and code examples, see Chat Completions.

Responses API

The Responses API uses a different base_url: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1. Compared to Chat Completions, the Responses API offers:
  • Built-in tools: web_search, code_interpreter, web_extractor, and image_search -- no external tool setup required.
  • Simplified multi-turn: Pass previous_response_id instead of building a full message history.
  • Conversation integration: Pair with the Conversations API for automatic context management.
  • Session cache: Automatically caches context across turns to reduce latency and cost. Enable with the x-dashscope-session-cache: enable header. See Session cache.
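A minimal sketch of a Responses request that enables a built-in tool. The exact tool-configuration shape should be confirmed against the Responses reference; listing tools by type, OpenAI-style, is an assumption here.

```python
# Sketch of a Responses API request body with a built-in tool enabled.
# No external tool definitions or execution loop are needed; the
# platform runs web_search server-side.
body = {
    "model": "qwen3.6-plus",
    "input": "What changed in the latest Qwen release?",
    "tools": [{"type": "web_search"}],  # assumed shape; see the Responses reference
}
```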

Migrate from Chat Completions

To switch from Chat Completions to the Responses API:
  1. Change base_url to https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 and the endpoint path from /v1/chat/completions to /v1/responses.
  2. Read the response with output_text instead of choices[0].message.content.
  3. For multi-turn conversations, pass previous_response_id instead of manually appending messages.
For the full reference and code examples, see Responses.
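The multi-turn simplification in step 3 can be sketched as two request bodies: the second turn references the first response by id instead of resending the message history (the response id below is a placeholder).

```python
# Turn 1: an ordinary Responses request. The server's reply carries an id
# and its text is read via output_text rather than choices[0].message.content.
turn_1 = {
    "model": "qwen3.6-plus",
    "input": "My name is Ada.",
}

# Turn 2: reference the prior response instead of rebuilding the history.
turn_2 = {
    "model": "qwen3.6-plus",
    "input": "What is my name?",
    "previous_response_id": "resp_0123456789",  # placeholder id from turn 1
}
```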

Embedding

The Embedding API (/v1/embeddings) is compatible with OpenAI's Embedding API. Key differences:
  • encoding_format: Only float is supported (default and only option).
  • user: Not supported.
  • dimensions: Available values depend on the model. For example, text-embedding-v4 supports 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, and 64.
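The constraints above can be captured in a small sketch of an embedding request body, with a local check that the requested dimensions value is one text-embedding-v4 actually supports:

```python
# Embedding request within the supported subset: encoding_format must be
# "float", and dimensions must be one of the model's supported sizes.
V4_DIMENSIONS = {2048, 1536, 1024, 768, 512, 256, 128, 64}

body = {
    "model": "text-embedding-v4",
    "input": "hello world",
    "dimensions": 256,            # omit to get the default of 1024
    "encoding_format": "float",   # the only accepted value
}
assert body["dimensions"] in V4_DIMENSIONS
```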
For supported models and code examples, see Embedding.

File API

The File API (/v1/files) is compatible with OpenAI's Files API, with these differences:
  • purpose must be file-extract (for document analysis with Qwen-Long/Qwen-Doc) or batch (for batch processing). OpenAI values like fine-tune and assistants are not supported.
  • File content retrieval (GET /v1/files/{file_id}/content) is not supported.
  • List filtering: The purpose and order parameters on GET /v1/files are not supported.
  • Storage limits: 10,000 files, 100 GB total. Files never expire.
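Since the accepted purpose values are the most common migration tripwire, here is a small local guard you might keep in upload code (the helper name is our own, not part of any SDK):

```python
# Only two purpose values are accepted by the File API; OpenAI values
# such as "fine-tune" or "assistants" are rejected by the server.
ALLOWED_PURPOSES = {"file-extract", "batch"}

def check_purpose(purpose: str) -> str:
    """Fail fast, client-side, before attempting the upload."""
    if purpose not in ALLOWED_PURPOSES:
        raise ValueError(f"unsupported purpose: {purpose!r}")
    return purpose
```

With the OpenAI SDK the upload itself is unchanged, e.g. client.files.create(file=open("report.pdf", "rb"), purpose="file-extract").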
For the full reference, see File.

Batch API

The Batch API (/v1/batches) is compatible with OpenAI's Batch API, with these differences:
  • 50% cost discount compared to real-time pricing.
  • completion_window: Supports 24h to 336h (14 days). Accepts "h" (hours) and "d" (days) units with integer values. OpenAI is fixed at 24h.
  • Extra metadata: metadata.ds_name (task name) and metadata.ds_description (task description).
  • Extra list filters: ds_name, input_file_ids, status, create_after, create_before.
  • Input file limits: Up to 50,000 requests per file, 500 MB total, 6 MB per line. All requests in a file must use the same model.
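The input file is JSONL, one request per line, mirroring OpenAI's batch format. A sketch of building one line within the limits above (the custom_id is illustrative):

```python
import json

# One line of a batch input file: a custom_id plus the request to run.
# Every line in the same file must target the same model.
line = {
    "custom_id": "req-0001",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "qwen3.6-plus",
        "messages": [{"role": "user", "content": "Summarize this ticket: ..."}],
    },
}
jsonl_line = json.dumps(line)

# Stay under the 6 MB per-line limit.
assert len(jsonl_line.encode("utf-8")) <= 6 * 1024 * 1024
```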
For the full workflow guide, see Batch API.

Conversations API

The Conversations API is a Qwen-specific feature with no direct OpenAI equivalent. It automatically manages multi-turn context across devices and sessions. It uses the same base_url as the Responses API: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1. Use it with the Responses API to inject historical context without manual message synchronization. For the full reference, see Conversations.