Migrate from OpenAI by changing three parameters: `base_url`, `api_key`, and `model`.
Qwen Cloud provides OpenAI-compatible APIs. If you have existing code that uses the OpenAI SDK or REST API, you can switch to Qwen models by changing three parameters: `base_url`, `api_key`, and `model`.
Quick migration
Before you begin, get an API key and set it as an environment variable. If you use the OpenAI SDK, install it.
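As a minimal sketch using only the standard library (the `DASHSCOPE_API_KEY` variable name and the `qwen-plus` model are illustrative), the whole migration is visible in where the request is sent and which key signs it -- the request body itself is unchanged from an OpenAI-style call:

```python
import json
import os
import urllib.request

# The only things that change from an OpenAI setup:
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # was https://api.openai.com/v1
API_KEY = os.getenv("DASHSCOPE_API_KEY", "")                         # was OPENAI_API_KEY
MODEL = "qwen-plus"                                                  # was e.g. "gpt-4o"

def chat_request(messages: list) -> urllib.request.Request:
    """Build a standard Chat Completions request; the body shape is unchanged."""
    body = json.dumps({"model": MODEL, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = chat_request([{"role": "user", "content": "Hello"}])
print(req.full_url)
# To send: urllib.request.urlopen(req) -- the response JSON has the familiar
# choices[0].message.content shape.
```

With the OpenAI SDK, the same switch is just the `base_url`, `api_key`, and `model` arguments; nothing else in the calling code changes.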
Supported APIs
| API | Base URL (for SDK) | Description |
|---|---|---|
| Chat Completions | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Text generation, vision, function calling |
| Responses | https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 | Built-in tools, simplified multi-turn |
| Embedding | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Text embeddings |
| File | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | File upload and management |
| Batch | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Asynchronous bulk processing at 50% cost |
| Conversations | https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 | Auto-managed multi-turn context |
The Responses API and Conversations API use a different `base_url` from the other four APIs. Make sure you set the correct `base_url` for the API you are calling.

Chat Completions
The Chat Completions API (/v1/chat/completions) is largely compatible with OpenAI's Chat API. The key differences are listed below.
Qwen-specific parameters
These parameters are not part of the OpenAI standard. In the OpenAI Python SDK, pass them via `extra_body`.
| Parameter | Type | Description |
|---|---|---|
| `enable_thinking` | Boolean | Enable deep reasoning mode. Limited model support. |
| `thinking_budget` | Integer | Max tokens for the thinking process. |
| `enable_search` | Boolean | Enable web search. Replaces OpenAI's `web_search_options`. |
| `search_options` | Object | Configure search behavior (strategy, forced search, etc.). |
| `top_k` | Integer | Sampling candidate set size. Range: (0, 100]. |
| `vl_high_resolution_images` | Boolean | Enable high-resolution mode for vision models. |
| `enable_code_interpreter` | Boolean | Enable code interpreter. |
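For example, a request body using some of these parameters might look like the following (values are illustrative). In raw REST calls the fields sit at the top level of the JSON body; with the OpenAI Python SDK, collect them in a dict and pass it as `extra_body`:

```python
import json

# Standard OpenAI-shaped request body...
body = {
    "model": "qwen-plus",
    "messages": [{"role": "user", "content": "Solve 24*17 step by step."}],
}

# ...plus Qwen-specific fields merged in at the top level.
# With the OpenAI Python SDK, pass this dict unchanged as
# client.chat.completions.create(..., extra_body=qwen_extras).
qwen_extras = {
    "enable_thinking": True,   # deep reasoning mode (limited model support)
    "thinking_budget": 4096,   # cap on thinking tokens
    "enable_search": True,     # replaces OpenAI's web_search_options
}
body.update(qwen_extras)

print(json.dumps(body, indent=2))
```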
Behavioral differences
- `response_format` supports `json_object` only (no `json_schema`).
- `tool_choice` supports `auto`, `none`, and a specific function object (`{"type": "function", "function": {"name": "..."}}`). The `required` value is not supported.
- `tools` supports the `function` type only.
- `parallel_tool_calls` defaults to `false` (OpenAI defaults to `true`).
- `n` supports 1-4 and is limited to specific models (qwen-plus, qwen-plus-character).
- `web_search_options` is not supported. Use `extra_body.enable_search` and `extra_body.search_options` instead.
Unsupported parameters
The following parameters are silently ignored: `frequency_penalty`, `logit_bias`, `max_completion_tokens`, `metadata`, `prediction`, `prompt_cache_key`, `reasoning_effort`, `service_tier`, `store`, `verbosity`.
For the full API reference and code examples, see Chat Completions.
Responses API
The Responses API uses a different `base_url`: `https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1`.
Compared to Chat Completions, the Responses API offers:
- Built-in tools: `web_search`, `code_interpreter`, `web_extractor`, and `image_search` -- no external tool setup required.
- Simplified multi-turn: Pass `previous_response_id` instead of building a full message history.
- Conversation integration: Pair with the Conversations API for automatic context management.
- Session cache: Automatically caches context across turns to reduce latency and cost. Enable with the `x-dashscope-session-cache: enable` header. See Session cache.
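A hypothetical request combining a built-in tool with the session cache could be shaped as below. The tool-declaration form (`{"type": "web_search"}`) follows OpenAI's Responses built-in-tool convention and is an assumption here; check the Qwen reference for the exact schema:

```python
import json

# Responses API body: a built-in tool needs only its type name --
# no function schema or external tool execution loop.
body = {
    "model": "qwen-plus",
    "input": "What changed in the latest Qwen release?",
    "tools": [{"type": "web_search"}],  # assumed shape, per OpenAI convention
}

# Opt in to the session cache via a request header.
headers = {"x-dashscope-session-cache": "enable"}

print(json.dumps(body))
```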
Migrate from Chat Completions
To switch from Chat Completions to the Responses API:
- Change `base_url` to `https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1` and the endpoint path from `/v1/chat/completions` to `/v1/responses`.
- Read the response with `output_text` instead of `choices[0].message.content`.
- For multi-turn conversations, pass `previous_response_id` instead of manually appending messages.
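The steps above can be sketched as a request builder (stdlib only; the `input` field name follows OpenAI's Responses API and the response id value is illustrative):

```python
import json
import os
import urllib.request

RESPONSES_BASE = "https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"

def responses_request(model: str, user_input: str,
                      previous_response_id: str = "") -> urllib.request.Request:
    """Build a Responses API request; multi-turn context rides on previous_response_id."""
    body = {"model": model, "input": user_input}
    if previous_response_id:
        # Replaces manually re-sending the full message history.
        body["previous_response_id"] = previous_response_id
    return urllib.request.Request(
        f"{RESPONSES_BASE}/responses",          # was {base}/chat/completions
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {os.getenv('DASHSCOPE_API_KEY', '')}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = responses_request("qwen-plus", "And in French?",
                        previous_response_id="resp_123")  # illustrative id
print(req.full_url)
# Read the reply from output_text in the response JSON,
# not choices[0].message.content.
```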
Embedding
The Embedding API (/v1/embeddings) is compatible with OpenAI's Embedding API. Key differences:
- `encoding_format`: Only `float` is supported (the default and only option).
- `user`: Not supported.
- `dimensions`: Available values depend on the model. For example, `text-embedding-v4` supports 2048, 1536, 1024 (default), 768, 512, 256, 128, and 64.
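Because the allowed `dimensions` values are model-specific, it can help to validate them before sending. A small sketch for `text-embedding-v4` (the helper name is hypothetical):

```python
# Supported output sizes for text-embedding-v4 (1024 is the default).
V4_DIMENSIONS = {2048, 1536, 1024, 768, 512, 256, 128, 64}

def embedding_body(texts: list, dimensions: int = 1024) -> dict:
    """Request body for POST /v1/embeddings with a model-specific size check."""
    if dimensions not in V4_DIMENSIONS:
        raise ValueError(f"text-embedding-v4 does not support dimensions={dimensions}")
    # encoding_format is always float and user is unsupported, so neither is sent.
    return {"model": "text-embedding-v4", "input": texts, "dimensions": dimensions}

print(embedding_body(["Qwen migration guide"], dimensions=512))
```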
File API
The File API (/v1/files) is compatible with OpenAI's Files API, with these differences:
- `purpose` must be `file-extract` (for document analysis with Qwen-Long/Qwen-Doc) or `batch` (for batch processing). OpenAI values like `fine-tune` and `assistants` are not supported.
- File content retrieval (`GET /v1/files/{file_id}/content`) is not supported.
- List filtering: The `purpose` and `order` parameters on `GET /v1/files` are not supported.
- Storage limits: 10,000 files, 100 GB total. Files never expire.
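The `purpose` restriction is the main porting hazard, since OpenAI code often hard-codes `fine-tune` or `assistants`. A sketch of the constraint (the helper name is hypothetical; the actual upload is a standard multipart POST to `/v1/files`):

```python
# Only these purposes are accepted by Qwen's File API.
ALLOWED_PURPOSES = {"file-extract", "batch"}

def file_upload_fields(filename: str, purpose: str) -> dict:
    """Form fields for POST /v1/files (multipart/form-data upload)."""
    if purpose not in ALLOWED_PURPOSES:
        raise ValueError(
            f"purpose must be one of {sorted(ALLOWED_PURPOSES)}, got {purpose!r}"
        )
    return {"file": filename, "purpose": purpose}

print(file_upload_fields("report.pdf", "file-extract"))
```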
Batch API
The Batch API (/v1/batches) is compatible with OpenAI's Batch API, with these differences:
- 50% cost discount compared to real-time pricing.
- `completion_window`: Supports 24h to 336h (14 days). Accepts "h" (hours) and "d" (days) units with integer values. OpenAI is fixed at 24h.
- Extra metadata: `metadata.ds_name` (task name) and `metadata.ds_description` (task description).
- Extra list filters: `ds_name`, `input_file_ids`, `status`, `create_after`, `create_before`.
- Input file limits: Up to 50,000 requests per file, 500 MB total, 6 MB per line. All requests in a file must use the same model.
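The window and metadata rules can be sketched as a request-body builder (the helper and its values are illustrative; `input_file_id` and `endpoint` follow the OpenAI Batch API shape the endpoint is compatible with):

```python
import json
import re

def batch_body(input_file_id: str, completion_window: str = "24h",
               name: str = "") -> dict:
    """Body for POST /v1/batches; window accepts integer h or d, 24h to 336h."""
    m = re.fullmatch(r"(\d+)([hd])", completion_window)
    if not m:
        raise ValueError("completion_window must look like '24h' or '7d'")
    hours = int(m.group(1)) * (24 if m.group(2) == "d" else 1)
    if not 24 <= hours <= 336:
        raise ValueError("completion_window must be between 24h and 336h (14 days)")
    body = {
        "input_file_id": input_file_id,        # from a File API upload with purpose=batch
        "endpoint": "/v1/chat/completions",
        "completion_window": completion_window,
    }
    if name:
        body["metadata"] = {"ds_name": name}   # Qwen-specific task name
    return body

print(json.dumps(batch_body("file-abc", "7d", name="nightly-summaries")))
```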
Conversations API
The Conversations API is a Qwen-specific feature with no direct OpenAI equivalent. It automatically manages multi-turn context across devices and sessions. It uses the same base_url as the Responses API: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1.
Use it with the Responses API to inject historical context without manual message synchronization.
For the full reference, see Conversations.