Anthropic Messages API

Call Qwen models using Anthropic SDKs

POST
/apps/anthropic/v1/messages
import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/apps/anthropic",
)

message = client.messages.create(
    model="qwen3.6-plus",
    max_tokens=1024,
    system="You are a helpful assistant",
    messages=[
        {
            "role": "user",
            "content": "Who are you?"
        }
    ],
    thinking={"type": "disabled"},
)

print(message.content[0].text)
{
  "id": "msg_e2898f19-fc0e-4cb3-bd9b-5b7dc4ea3bc9",
  "type": "message",
  "role": "assistant",
  "model": "qwen3.6-plus",
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this question...",
      "signature": ""
    },
    {
      "type": "text",
      "text": "Hello! I am Qwen..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 22,
    "output_tokens": 223,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}

Authorizations

x-api-key
string
header
required

Qwen Cloud API key, passed in the x-api-key header. The Authorization: Bearer header is also supported.
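
The two header styles can be sketched as follows. This is a minimal illustration that only builds the headers and sends nothing; the helper name `build_auth_headers` and its `use_bearer` flag are hypothetical, and the placeholder key is not a real credential.

```python
import os


def build_auth_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Return HTTP headers carrying the API key in one of the two supported styles."""
    headers = {"content-type": "application/json"}
    if use_bearer:
        # Alternative style: standard Authorization header.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Style used by the Anthropic SDKs.
        headers["x-api-key"] = api_key
    return headers


# Read the key from the same environment variable as the request example.
api_key = os.getenv("DASHSCOPE_API_KEY", "sk-placeholder")
print(sorted(build_auth_headers(api_key)))
print(sorted(build_auth_headers(api_key, use_bearer=True)))
```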

Body

application/json
model
string
required

Model name. Supported models:

Qwen Max: qwen3.6-max-preview, qwen3-max, qwen3-max-2026-01-23, qwen3-max-preview

Qwen Plus: qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-04-20, qwen3.5-plus-2026-02-15, qwen-plus, qwen-plus-latest, qwen-plus-2025-09-11

Qwen Flash: qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen-flash, qwen-flash-2025-07-28

Qwen Turbo: qwen-turbo, qwen-turbo-latest

Qwen Coder: qwen3-coder-next, qwen3-coder-plus, qwen3-coder-plus-2025-09-23, qwen3-coder-flash

Qwen VL: qwen3-vl-plus, qwen3-vl-flash, qwen-vl-max, qwen-vl-plus

Qwen Open-source: qwen3.6-27b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

Third-party models: deepseek-v4-pro, deepseek-v4-flash, deepseek-v3.2

max_tokens
integer
required

Maximum number of tokens to generate.

messages
object[]
required

Message array, alternating between user and assistant turns.
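
A client-side check of the alternation rule might look like the sketch below. The function name `validate_messages` is hypothetical, and the server's own validation may be stricter or looser than this; the check only covers what the description states (user and assistant turns that alternate).

```python
def validate_messages(messages: list) -> None:
    """Raise ValueError if roles are not user/assistant or do not alternate."""
    previous_role = None
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ("user", "assistant"):
            # The system prompt is a top-level parameter, not a message role.
            raise ValueError(f"message {index}: unsupported role {role!r}")
        if role == previous_role:
            raise ValueError(f"message {index}: two consecutive {role!r} turns")
        previous_role = role


conversation = [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am Qwen."},
    {"role": "user", "content": "What can you do?"},
]
validate_messages(conversation)  # passes silently
```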

system
string

System prompt to set the model's role or behavior. Passed as a top-level parameter; the messages array does not accept the system role. A string is equivalent to a single type="text" content block. To use context caching, pass an array of content blocks with cache_control.
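
The string form and the content-block form can be compared with a small sketch. The helper `system_blocks` is hypothetical, and the `{"type": "ephemeral"}` value for cache_control follows the Anthropic convention; whether this endpoint accepts exactly that value is an assumption.

```python
def system_blocks(text: str, cache: bool = False) -> list:
    """Wrap a system prompt string in the equivalent content-block array."""
    block = {"type": "text", "text": text}
    if cache:
        # Assumption: the Anthropic-style ephemeral cache marker is accepted.
        block["cache_control"] = {"type": "ephemeral"}
    return [block]


request_kwargs = {
    "model": "qwen3.6-plus",
    "max_tokens": 1024,
    "system": system_blocks("You are a helpful assistant", cache=True),
    "messages": [{"role": "user", "content": "Who are you?"}],
}
print(request_kwargs["system"])
```

With an Anthropic client configured as in the request example, these arguments would be passed as `client.messages.create(**request_kwargs)`.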

stream
boolean
default: false

Enable streaming output.

temperature
number

Controls the diversity of generated text, range [0, 2). Higher values produce more random output. This range differs from Anthropic's native [0.0, 1.0], so verify this parameter when migrating from Anthropic.

top_p
number

Probability threshold for nucleus sampling. Both temperature and top_p control diversity, so set only one of them per request.
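
The two sampling constraints (the [0, 2) temperature range and setting only one of temperature and top_p) can be enforced before a request is sent. The sketch below is one way to do that; the function name `sampling_params` is hypothetical and no range is checked for top_p because the description does not give one.

```python
def sampling_params(temperature=None, top_p=None) -> dict:
    """Return sampling keyword arguments, enforcing the documented constraints."""
    if temperature is not None and top_p is not None:
        raise ValueError("set only one of temperature and top_p")
    params = {}
    if temperature is not None:
        # Documented range is [0, 2): 0 is allowed, 2 is not.
        if not 0 <= temperature < 2:
            raise ValueError("temperature must be in [0, 2)")
        params["temperature"] = temperature
    if top_p is not None:
        params["top_p"] = top_p
    return params


print(sampling_params(temperature=1.5))
print(sampling_params(top_p=0.9))
```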

top_k
integer

Size of the sampling candidate set during generation.

stop_sequences
string[]

Text sequences that stop generation. The model stops before emitting the matched sequence, so the sequence is not included in the response. Even when a stop sequence is hit, stop_reason is still end_turn.

thinking
object

Thinking configuration. When enabled, the model reasons before generating a reply to improve accuracy. The response will include thinking type content blocks.

enum<string>

Controls reasoning intensity. Default is max. Supported models: deepseek-v4-pro, deepseek-v4-flash. Values low or medium are mapped to high; xhigh is mapped to max.

Available options: high, max
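
The mapping described above can be written out as a lookup table to predict which intensity is actually applied. This is a client-side sketch only: the names `EFFORT_ALIASES` and `effective_effort` are hypothetical, and rejecting unknown values is an assumption about reasonable behavior, not documented server behavior.

```python
# Requested value -> value the service applies, per the description above.
EFFORT_ALIASES = {
    "low": "high",
    "medium": "high",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def effective_effort(requested=None) -> str:
    """Return the reasoning intensity that results from a requested value."""
    if requested is None:
        return "max"  # documented default
    if requested not in EFFORT_ALIASES:
        # Assumption: unknown values are treated as an error on the client.
        raise ValueError(f"unknown effort value: {requested!r}")
    return EFFORT_ALIASES[requested]


for value in (None, "low", "medium", "high", "xhigh", "max"):
    print(value, "->", effective_effort(value))
```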
tools
object[]

Tool definition array for function calling.

tool_choice
object

Tool choice strategy. {"type": "auto"}: model decides whether to call tools (default). {"type": "any"}: force calling any tool. {"type": "none"}: disable tool calling. {"type": "tool", "name": "tool_name"}: force calling a specific tool.
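
The four strategies can be produced by a small helper, shown here together with one tool definition. The helper `make_tool_choice` and the `get_weather` tool are hypothetical, and the tool definition uses the Anthropic layout (name, description, input_schema); that this endpoint expects exactly that layout is an assumption.

```python
def make_tool_choice(mode: str = "auto", name: str = "") -> dict:
    """Build a tool_choice object for one of the four documented strategies."""
    if mode == "tool":
        if not name:
            raise ValueError("mode 'tool' requires a tool name")
        return {"type": "tool", "name": name}
    if mode in ("auto", "any", "none"):
        return {"type": mode}
    raise ValueError(f"unknown tool_choice mode: {mode!r}")


# Hypothetical tool definition in the Anthropic style.
get_weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

tool_kwargs = {
    "tools": [get_weather_tool],
    "tool_choice": make_tool_choice("tool", "get_weather"),
}
print(tool_kwargs["tool_choice"])
```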

Response

200-application/json
id
string

Unique message identifier.

type
enum<string>

Always message.

Available options: message
role
enum<string>

Always assistant.

Available options: assistant
model
string

Model name used.

content
object[]

Content array. Element types can be text, thinking (returned when thinking is enabled), or tool_use (tool call).
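
Because a thinking block can precede the text block, reading only the first element of content is fragile once thinking is enabled. The sketch below groups blocks by type instead. It operates on the parsed JSON response (plain dicts); the helper name `split_content` is hypothetical, and SDK response objects expose the same fields as attributes rather than dict keys.

```python
def split_content(content: list) -> dict:
    """Group response content blocks by their type field."""
    result = {"text": "", "thinking": "", "tool_calls": []}
    for block in content:
        kind = block.get("type")
        if kind == "text":
            result["text"] += block.get("text", "")
        elif kind == "thinking":
            result["thinking"] += block.get("thinking", "")
        elif kind == "tool_use":
            result["tool_calls"].append(block)
    return result


# Content array taken from the sample response at the top of this page.
sample_content = [
    {"type": "thinking", "thinking": "Let me analyze this question...", "signature": ""},
    {"type": "text", "text": "Hello! I am Qwen..."},
]
print(split_content(sample_content)["text"])  # Hello! I am Qwen...
```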

stop_reason
enum<string>

Stop reason: end_turn (normal completion), max_tokens (token limit reached), or tool_use (the model requested a tool call).

Available options: end_turn, max_tokens, tool_use
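
A caller typically branches on stop_reason. The sketch below shows one possible dispatch over a parsed JSON response. The function `next_step` and the handler table are hypothetical, and the tool_use block fields (id, name, input) and the tool_result reply format follow the Anthropic convention, which this endpoint is assumed to share.

```python
def next_step(response: dict, tool_handlers: dict):
    """Return (action, follow_up_message) for a parsed response."""
    reason = response.get("stop_reason")
    if reason == "tool_use":
        results = []
        for block in response["content"]:
            if block.get("type") != "tool_use":
                continue
            # Run the local function registered under the requested tool name.
            handler = tool_handlers[block["name"]]
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(handler(**block["input"])),
            })
        # Tool results go back to the model as the next user turn.
        return "send_tool_results", {"role": "user", "content": results}
    if reason == "max_tokens":
        # Output was cut off; the caller may retry with a larger max_tokens.
        return "truncated", None
    return "done", None


print(next_step({"stop_reason": "end_turn", "content": []}, {}))
```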
stop_sequence
string | null

Always null.

usage
object

Token usage statistics. When streaming, the usage object in the message_start event contains only input_tokens and output_tokens; all four fields appear in the message_delta event.
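
Since the complete usage only arrives with message_delta, a streaming client can merge the usage from both events. The sketch below works on already-parsed event dicts; the helper `final_usage` is hypothetical, the sample numbers are illustrative, and the event layout (usage nested under message in message_start, at the top level in message_delta) follows the Anthropic streaming format, which is assumed here.

```python
def final_usage(events) -> dict:
    """Merge usage fields from message_start and message_delta events."""
    usage = {}
    for event in events:
        kind = event.get("type")
        if kind == "message_start":
            # Partial usage: input_tokens and output_tokens only.
            usage.update(event.get("message", {}).get("usage", {}))
        elif kind == "message_delta":
            # Later values overwrite earlier ones and add the cache fields.
            usage.update(event.get("usage", {}))
    return usage


# Illustrative events, not captured from a real stream.
sample_events = [
    {"type": "message_start",
     "message": {"usage": {"input_tokens": 22, "output_tokens": 0}}},
    {"type": "content_block_delta",
     "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "message_delta",
     "usage": {"input_tokens": 22, "output_tokens": 223,
               "cache_creation_input_tokens": 0, "cache_read_input_tokens": 0}},
]
print(final_usage(sample_events))
```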