
OpenAI compatibility

Migrate from OpenAI by changing three parameters: base_url, api_key, and model.

Qwen Cloud provides OpenAI-compatible APIs. If you have existing code that uses the OpenAI SDK or REST API, you can switch to Qwen models by changing three parameters: base_url, api_key, and model.

Quick migration

Python example using the OpenAI SDK:
import os
from openai import OpenAI

# Point the OpenAI SDK at the DashScope compatible-mode endpoint.
client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
Before you begin, get an API key and set it as an environment variable. If you use the OpenAI SDK, install it.

Supported APIs

API | Base URL (for SDK) | Description
Chat Completions | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Text generation, vision, function calling
Responses | https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 | Built-in tools, simplified multi-turn
Embedding | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Text embeddings
File | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | File upload and management
Batch | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | Asynchronous bulk processing at 50% cost
Conversations | https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 | Auto-managed multi-turn context
The Responses API and Conversations API use a different base_url from the other four APIs. Make sure you set the correct base_url for the API you are calling.

Chat Completions

The Chat Completions API (/v1/chat/completions) is largely compatible with OpenAI's Chat API. The key differences are listed below.

Qwen-specific parameters

These parameters are not part of the OpenAI standard. In the OpenAI Python SDK, pass them via extra_body.

Parameter | Type | Description
enable_thinking | Boolean | Enable deep reasoning mode. Limited model support.
thinking_budget | Integer | Max tokens for the thinking process.
enable_search | Boolean | Enable web search. Replaces OpenAI's web_search_options.
search_options | Object | Configure search behavior (strategy, forced search, etc.).
top_k | Integer | Sampling candidate set size. Range: (0, 100].
vl_high_resolution_images | Boolean | Enable high-resolution mode for vision models.
enable_code_interpreter | Boolean | Enable code interpreter.
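As a sketch of how these parameters travel: with the OpenAI Python SDK, anything passed via extra_body is merged as top-level keys into the request body; over raw HTTP they are ordinary JSON fields. The snippet below builds the resulting JSON body by hand so the shape is visible; the budget value is illustrative.

```python
import json

# What the wire payload looks like when the SDK call includes
# extra_body={"enable_thinking": True, "thinking_budget": 4096}.
# Standard OpenAI fields and Qwen-specific fields sit side by side.
body = {
    "model": "qwen3.6-plus",
    "messages": [{"role": "user", "content": "Explain quantum tunneling briefly."}],
    # Qwen-specific keys, merged in from extra_body:
    "enable_thinking": True,   # deep reasoning mode (limited model support)
    "thinking_budget": 4096,   # cap on tokens spent on the thinking process
}
print(json.dumps(body, indent=2))
```

With the SDK, the equivalent call is client.chat.completions.create(model=..., messages=..., extra_body={"enable_thinking": True, "thinking_budget": 4096}).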

Behavioral differences

  • response_format supports json_object only (no json_schema).
  • tool_choice supports auto, none, and specific function object ({"type": "function", "function": {"name": "..."}}). The required value is not supported.
  • tools supports function type only.
  • parallel_tool_calls defaults to false (OpenAI defaults to true).
  • n supports 1-4 and is limited to specific models (qwen-plus, qwen-plus-character).
  • web_search_options is not supported. Use extra_body.enable_search and extra_body.search_options instead.
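To make the tool-calling differences concrete, here is a sketch of a request body that stays within the supported subset: tools entries of type function only, and tool_choice as a specific function object (the get_weather function is illustrative).

```python
# tool_choice accepts "auto", "none", or a specific function object;
# "required" is rejected. tools entries must have type "function".
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

body = {
    "model": "qwen3.6-plus",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
    # Force this specific function rather than letting the model decide:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
    # Defaults to false here (OpenAI defaults to true), so enable explicitly
    # if you want multiple tool calls per turn:
    "parallel_tool_calls": True,
}
```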

Unsupported parameters

The following parameters are silently ignored: frequency_penalty, logit_bias, max_completion_tokens, metadata, prediction, prompt_cache_key, reasoning_effort, service_tier, store, verbosity. For the full API reference and code examples, see Chat Completions.

Responses API

The Responses API uses a different base_url: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1. Compared to Chat Completions, the Responses API offers:
  • Built-in tools: web_search, code_interpreter, web_extractor, and image_search -- no external tool setup required.
  • Simplified multi-turn: Pass previous_response_id instead of building a full message history.
  • Conversation integration: Pair with the Conversations API for automatic context management.
  • Session cache: Automatically caches context across turns to reduce latency and cost. Enable with the x-dashscope-session-cache: enable header. See Session cache.
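A minimal sketch of a Responses request that enables a built-in tool. The exact tool-configuration shape should be confirmed against the Responses reference; listing tools by type, OpenAI-style, is an assumption here.

```python
# Sketch of a Responses API request body with a built-in tool enabled.
# No external tool definitions or execution loop are needed; the
# platform runs web_search server-side.
body = {
    "model": "qwen3.6-plus",
    "input": "What changed in the latest Qwen release?",
    "tools": [{"type": "web_search"}],  # assumed shape; see the Responses reference
}
```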

Migrate from Chat Completions

To switch from Chat Completions to the Responses API:
  1. Change base_url to https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1 and the endpoint path from /v1/chat/completions to /v1/responses.
  2. Read the response with output_text instead of choices[0].message.content.
  3. For multi-turn conversations, pass previous_response_id instead of manually appending messages.
For the full reference and code examples, see Responses.
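The multi-turn simplification in step 3 can be sketched as two request bodies: the second turn references the first response by id instead of resending the message history (the response id below is a placeholder).

```python
# Turn 1: an ordinary Responses request. The server's reply carries an id
# and its text is read via output_text rather than choices[0].message.content.
turn_1 = {
    "model": "qwen3.6-plus",
    "input": "My name is Ada.",
}

# Turn 2: reference the prior response instead of rebuilding the history.
turn_2 = {
    "model": "qwen3.6-plus",
    "input": "What is my name?",
    "previous_response_id": "resp_0123456789",  # placeholder id from turn 1
}
```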

Embedding

The Embedding API (/v1/embeddings) is compatible with OpenAI's Embedding API. Key differences:
  • encoding_format: Only float is supported (default and only option).
  • user: Not supported.
  • dimensions: Available values depend on the model. For example, text-embedding-v4 supports 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, and 64.
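The constraints above can be captured in a small sketch of an embedding request body, with a local check that the requested dimensions value is one text-embedding-v4 actually supports:

```python
# Embedding request within the supported subset: encoding_format must be
# "float", and dimensions must be one of the model's supported sizes.
V4_DIMENSIONS = {2048, 1536, 1024, 768, 512, 256, 128, 64}

body = {
    "model": "text-embedding-v4",
    "input": "hello world",
    "dimensions": 256,            # omit to get the default of 1024
    "encoding_format": "float",   # the only accepted value
}
assert body["dimensions"] in V4_DIMENSIONS
```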
For supported models and code examples, see Embedding.

File API

The File API (/v1/files) is compatible with OpenAI's Files API, with these differences:
  • purpose must be file-extract (for document analysis with Qwen-Long/Qwen-Doc) or batch (for batch processing). OpenAI values like fine-tune and assistants are not supported.
  • File content retrieval (GET /v1/files/{file_id}/content) is not supported.
  • List filtering: The purpose and order parameters on GET /v1/files are not supported.
  • Storage limits: 10,000 files, 100 GB total. Files never expire.
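Since the accepted purpose values are the most common migration tripwire, here is a small local guard you might keep in upload code (the helper name is our own, not part of any SDK):

```python
# Only two purpose values are accepted by the File API; OpenAI values
# such as "fine-tune" or "assistants" are rejected by the server.
ALLOWED_PURPOSES = {"file-extract", "batch"}

def check_purpose(purpose: str) -> str:
    """Fail fast, client-side, before attempting the upload."""
    if purpose not in ALLOWED_PURPOSES:
        raise ValueError(f"unsupported purpose: {purpose!r}")
    return purpose
```

With the OpenAI SDK the upload itself is unchanged, e.g. client.files.create(file=open("report.pdf", "rb"), purpose="file-extract").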
For the full reference, see File.

Batch API

The Batch API (/v1/batches) is compatible with OpenAI's Batch API, with these differences:
  • 50% cost discount compared to real-time pricing.
  • completion_window: Supports 24h to 336h (14 days). Accepts "h" (hours) and "d" (days) units with integer values. OpenAI is fixed at 24h.
  • Extra metadata: metadata.ds_name (task name) and metadata.ds_description (task description).
  • Extra list filters: ds_name, input_file_ids, status, create_after, create_before.
  • Input file limits: Up to 50,000 requests per file, 500 MB total, 6 MB per line. All requests in a file must use the same model.
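The input file is JSONL, one request per line, mirroring OpenAI's batch format. A sketch of building one line within the limits above (the custom_id is illustrative):

```python
import json

# One line of a batch input file: a custom_id plus the request to run.
# Every line in the same file must target the same model.
line = {
    "custom_id": "req-0001",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "qwen3.6-plus",
        "messages": [{"role": "user", "content": "Summarize this ticket: ..."}],
    },
}
jsonl_line = json.dumps(line)

# Stay under the 6 MB per-line limit.
assert len(jsonl_line.encode("utf-8")) <= 6 * 1024 * 1024
```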
For the full workflow guide, see Batch API.

Conversations API

The Conversations API is a Qwen-specific feature with no direct OpenAI equivalent. It automatically manages multi-turn context across devices and sessions. It uses the same base_url as the Responses API: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1. Use it with the Responses API to inject historical context without manual message synchronization. For the full reference, see Conversations.