DeepSeek

Call DeepSeek models through the OpenAI-compatible API or DashScope SDK on Qwen Cloud.

The models deepseek-v3, deepseek-v3.1, deepseek-v3.2, deepseek-v3.2-exp, deepseek-r1, deepseek-r1-0528, and deepseek-r1-distill-qwen-7b/14b/32b will be deprecated on July 9, 2026. Migrate to qwen3.7-plus, qwen3.7-max, or qwen3.6-flash.
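If you want to flag affected model IDs in your own configuration before the cutoff, a small check along the following lines may help. This is only a sketch: the helper names needs_migration and days_left are hypothetical, and the 7b/14b/32b shorthand is assumed to stand for three separate model IDs.

```python
from datetime import date

# Model IDs copied from the deprecation notice. The "7b/14b/32b" shorthand is
# expanded on the ASSUMPTION that it names three separate model IDs.
DEPRECATED_MODELS = frozenset({
    "deepseek-v3",
    "deepseek-v3.1",
    "deepseek-v3.2",
    "deepseek-v3.2-exp",
    "deepseek-r1",
    "deepseek-r1-0528",
    "deepseek-r1-distill-qwen-7b",
    "deepseek-r1-distill-qwen-14b",
    "deepseek-r1-distill-qwen-32b",
})

# Date stated in the notice.
DEPRECATION_DATE = date(2026, 7, 9)


def needs_migration(model: str) -> bool:
    """Return True if the model ID appears in the deprecation notice."""
    return model.strip().lower() in DEPRECATED_MODELS


def days_left(today: date) -> int:
    """Days remaining until the stated deprecation date (negative once past)."""
    return (DEPRECATION_DATE - today).days
```

The notice does not pair each deprecated model with a specific replacement, so the sketch only flags model IDs and leaves the choice of target model to you.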

Quick start

deepseek-v4-pro is the latest model in the DeepSeek series and delivers top-tier performance across coding, math, and general tasks. Use the enable_thinking parameter to switch between thinking and non-thinking modes. The following example calls deepseek-v4-pro in thinking mode.
Before you begin, get an API key and set it as an environment variable (the example reads it from DASHSCOPE_API_KEY). If you call the model through an SDK, install the OpenAI or DashScope SDK.
The enable_thinking parameter is not part of the standard OpenAI API. In the OpenAI Python SDK, pass it through extra_body. In the Node.js SDK, pass it as a top-level parameter. The reasoning_effort parameter is a standard OpenAI parameter that you can pass directly as a top-level parameter.
Example code (Python, OpenAI SDK):
from openai import OpenAI
import os

client = OpenAI(
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Who are you?"}]
completion = client.chat.completions.create(
  model="deepseek-v4-pro",
  messages=messages,
  extra_body={"enable_thinking": True},
  stream=True,
  stream_options={"include_usage": True},
)

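# Accumulate the reasoning trace and the final answer separately.
reasoning_content = ""
answer_content = ""
is_answering = False  # becomes True once the first answer token arrives
print("\n" + "=" * 20 + "Thinking process" + "=" * 20 + "\n")

for chunk in completion:
  # With include_usage enabled, the chunk without choices carries token usage.
  if not chunk.choices:
    print("\n" + "=" * 20 + "Token usage" + "=" * 20 + "\n")
    print(chunk.usage)
    continue

  delta = chunk.choices[0].delta

  # Reasoning tokens arrive in delta.reasoning_content while the model is thinking.
  if hasattr(delta, "reasoning_content") and delta.reasoning_content is not None:
    if not is_answering:
      print(delta.reasoning_content, end="", flush=True)
    reasoning_content += delta.reasoning_content

  # Answer tokens arrive in delta.content; print the separator once, before the first one.
  if hasattr(delta, "content") and delta.content:
    if not is_answering:
      print("\n" + "=" * 20 + "Full response" + "=" * 20 + "\n")
      is_answering = True
    print(delta.content, end="", flush=True)
    answer_content += delta.content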

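Because enable_thinking travels in extra_body while reasoning_effort is a top-level argument, it can help to assemble the request arguments in one place. The following is a minimal sketch for the OpenAI Python SDK; build_chat_kwargs is a hypothetical helper, not part of any SDK, and passing enable_thinking=False for non-thinking mode is an assumption based on the parameter description above.

```python
from typing import Any, Dict, List, Optional


def build_chat_kwargs(
    model: str,
    messages: List[Dict[str, str]],
    enable_thinking: Optional[bool] = None,
    reasoning_effort: Optional[str] = None,
    stream: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for client.chat.completions.create().

    Hypothetical helper: it only places each parameter where the OpenAI
    Python SDK expects it and does not call the API.
    """
    kwargs: Dict[str, Any] = {"model": model, "messages": messages}

    # enable_thinking is not a standard OpenAI parameter, so the Python SDK
    # needs it inside extra_body.
    if enable_thinking is not None:
        kwargs["extra_body"] = {"enable_thinking": enable_thinking}

    # reasoning_effort is a standard parameter and stays at the top level.
    if reasoning_effort is not None:
        kwargs["reasoning_effort"] = reasoning_effort

    # Mirror the streaming options used in the example above.
    if stream:
        kwargs["stream"] = True
        kwargs["stream_options"] = {"include_usage": True}

    return kwargs
```

With a client configured as in the example, the result could be passed as client.chat.completions.create(**build_chat_kwargs("deepseek-v4-pro", messages, enable_thinking=False)).
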
Reasoning effort

deepseek-v4-pro and deepseek-v4-flash have thinking mode enabled by default. You can use the reasoning_effort parameter to control reasoning intensity. Valid values: high and max. The default value is high.
If you set this parameter to low or medium, it is mapped to high. If you set it to xhigh, it is mapped to max.
Example code (Python, OpenAI SDK):
from openai import OpenAI
import os

client = OpenAI(
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
  model="deepseek-v4-pro",
  messages=[{"role": "user", "content": "Which is larger, 9.9 or 9.11?"}],
  reasoning_effort="high",
)
print(completion.choices[0].message.content)

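The mapping rules for reasoning_effort described above can be mirrored on the client side, for example to log the effort level a request will actually run with. This is a sketch of those rules only: effective_reasoning_effort is a hypothetical helper, and raising an error for values outside the documented set is an assumption, since the behavior for other values is not stated here.

```python
from typing import Optional

# Documented mapping: low and medium map to high, xhigh maps to max.
_EFFORT_MAP = {
    "low": "high",
    "medium": "high",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def effective_reasoning_effort(value: Optional[str] = None) -> str:
    """Return the effort level the request is expected to run with."""
    if value is None:
        return "high"  # documented default
    try:
        return _EFFORT_MAP[value]
    except KeyError:
        # ASSUMPTION: values outside the documented set are treated as errors here.
        raise ValueError(f"unsupported reasoning_effort: {value!r}") from None
```

For instance, a request sent with reasoning_effort="medium" would be expected to run at the high level, the same as the default.
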
Other features

Model | Multi-turn | Function calling | Web search | Context cache | Structured output
deepseek-v4-pro
deepseek-v4-flash
deepseek-v3.2

Parameter defaults

Model | temperature | top_p | repetition_penalty | presence_penalty | max_tokens | thinking_budget
deepseek-v4-pro | 1.0 | 1.0 | - | - | 393,216 shared | 393,216 shared
deepseek-v4-flash | 1.0 | 1.0 | - | - | 393,216 shared | 393,216 shared
deepseek-v3.2 | 1.0 | 0.95 | - | - | 65,536 | 32,768
  • A hyphen (-) indicates that the parameter is not supported.
  • The deepseek-r1, deepseek-r1-0528, and distilled models do not support overriding their default parameter values.
  • For parameter descriptions, see the OpenAI-compatible Chat API.
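The defaults table can also be kept as data in client code, for example to drop parameters a model does not support before sending a request. The sketch below transcribes the table above; supported_params is a hypothetical helper, None stands for the hyphen (not supported), and whether the service rejects or ignores unsupported parameters is not stated here, so filtering them out is a precaution rather than a requirement.

```python
from typing import Any, Dict, Optional

# Transcribed from the parameter defaults table. None marks a hyphen (not supported).
# For the v4 models the table labels max_tokens and thinking_budget as "shared".
PARAMETER_DEFAULTS: Dict[str, Dict[str, Optional[float]]] = {
    "deepseek-v4-pro": {
        "temperature": 1.0, "top_p": 1.0,
        "repetition_penalty": None, "presence_penalty": None,
        "max_tokens": 393216, "thinking_budget": 393216,
    },
    "deepseek-v4-flash": {
        "temperature": 1.0, "top_p": 1.0,
        "repetition_penalty": None, "presence_penalty": None,
        "max_tokens": 393216, "thinking_budget": 393216,
    },
    "deepseek-v3.2": {
        "temperature": 1.0, "top_p": 0.95,
        "repetition_penalty": None, "presence_penalty": None,
        "max_tokens": 65536, "thinking_budget": 32768,
    },
}


def supported_params(model: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params without the entries the table marks as unsupported."""
    defaults = PARAMETER_DEFAULTS[model]  # KeyError for models not in the table
    unsupported = {name for name, value in defaults.items() if value is None}
    return {name: value for name, value in params.items() if name not in unsupported}
```

Parameters that do not appear in the table at all pass through unchanged, so the helper only removes what the table explicitly marks with a hyphen.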