Text generation models

Using OpenClaw or Claude Code?

qwen3.6-plus — strongest reasoning, full tool support, 1M context for large codebases. Token Plan also includes glm-5, MiniMax-M2.5, deepseek-v4-pro, deepseek-v4-flash, and deepseek-v3.2.

For other applications

Chatbots, content generation, summarization, document processing — start with qwen3.6-plus for strongest accuracy, 1M context, and the full feature set. Once your use case works well, try qwen3.6-flash to reduce cost — near-flagship quality with the same context and features.

Context window

1M tokens is roughly 750,000 words or 10 novels.

Long documents or large codebases → qwen3.6-plus / qwen3.6-flash (1M)
Standard tasks → 128k–256k is plenty

Thinking mode

Step-by-step reasoning for multi-step math, debugging, architecture planning, or legal cross-referencing. Toggle with enable_thinking. All Qwen3+ models support it — most are hybrid, so you can switch per request.

Function calling + built-in tools

Let the model take actions: check weather, query a database, book a meeting.

Function calling (you define tools, model calls them): all general-purpose models
Built-in tools (web search, code execution — no setup): qwen3.6-plus, qwen3.6-flash, qwen3.5-plus, qwen3.5-flash, qwen3-max only

Structured output

Get valid JSON back — e.g., extract names and dates from text. Qwen3.6, Qwen3.5, Qwen3, Qwen3-Coder, Qwen2.5, and legacy (Plus/Max/Flash/Turbo) — non-thinking mode.

Batch

Thousands of requests, latency not critical. Lowers cost per request. Legacy models only: qwen-plus, qwen-flash, qwen-turbo.

Recommended models

Model	Context	Thinking	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen3.6-plus`	1M	✓	✓	✓	✓	—	✓	—	✓
`qwen3.6-flash`	1M	✓	✓	✓	✓	—	✓	—	✓
`qwen3.6-max-preview`	256k	✓	✓	—	✓	—	✓	—	—
`deepseek-v4-pro`	1M	✓	✓	—	—	—	✓	—	—
`deepseek-v4-flash`	1M	✓	✓	—	—	—	✓	—	—
`deepseek-v3.2`	128k	✓	✓	—	—	—	—	—	—
`kimi-k2.5` †	256k	✓	✓	—	—	—	—	—	—
`glm-5` ‡	198k	✓	✓	—	✓	—	—	—	—
`MiniMax-M2.5` ‡	192k	✓	✓	—	—	—	—	—	—

† Coding Plan only — not available for pay-as-you-go or Token Plan. ‡ Available through Token Plan and Coding Plan, not pay-as-you-go.

All models

Qwen3.6

Model ID	Context	Max Output	Thinking Budget	Function calling	Built-in tools	Structured output	Batch	Token Plan	Explicit cache	Implicit cache	Session cache
`qwen3.6-max-preview`	256k	64k	128k	✓	—	✓	—	—	✓	—	—
`qwen3.6-flash`	1M	64k	80k	✓	✓	✓	—	—	✓	—	✓
`qwen3.6-flash-2026-04-16`	1M	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.6-35b-a3b`	256k	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.6-27b`	256k	64k	80k	✓	—	✓	—	—	—	—	—
`qwen3.6-plus`	1M	64k	80k	✓	✓	✓	—	✓	✓	—	✓
`qwen3.6-plus-2026-04-02`	1M	64k	80k	✓	✓	✓	—	—	—	—	—

Qwen3.5

Model ID	Context	Max Output	Thinking Budget	Function calling	Built-in tools	Structured output	Batch	Coding Plan	Explicit cache	Implicit cache	Session cache
`qwen3.5-plus`	1M	64k	80k	✓	✓	✓	—	✓	✓	—	✓
`qwen3.5-plus-2026-04-20`	1M	64k	80k	✓	✓	✓	—	—	✓	—	—
`qwen3.5-plus-2026-02-15`	1M	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.5-flash`	1M	64k	80k	✓	✓	✓	—	—	✓	—	✓
`qwen3.5-flash-2026-02-23`	1M	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.5-397b-a17b`	256k	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.5-122b-a10b`	256k	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.5-27b`	256k	64k	80k	✓	✓	✓	—	—	—	—	—
`qwen3.5-35b-a3b`	256k	64k	80k	✓	✓	✓	—	—	—	—	—

Specialized

Translation

Model ID	Context	Max Output	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen-mt-plus`	16k	8k	—	—	—	—	—	—	—
`qwen-mt-turbo`	16k	8k	—	—	—	—	—	—	—
`qwen-mt-flash`	16k	8k	—	—	—	—	—	—	—
`qwen-mt-lite`	16k	8k	—	—	—	—	—	—	—

Character roleplay

Model ID	Context	Max Output	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen-plus-character`	32k	4k	—	—	—	—	—	✓	—
`qwen-plus-character-ja`	8k	4k	—	—	—	—	—	✓	—
`qwen-flash-character`	8k	4k	—	—	—	—	—	✓	—

Third-party

Non-Qwen models available through the same API.

deepseek-v4-pro and deepseek-v4-flash are available through pay-as-you-go. deepseek-v3.2 is available through pay-as-you-go, Token Plan, and Coding Plan. glm-5 and MiniMax-M2.5 are available through Token Plan and Coding Plan. kimi-k2.5 and glm-4.7 are available exclusively through Coding Plan.

Model ID	Context	Max Output	Thinking Budget	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`deepseek-v4-pro`	1M	384k *	*	✓	—	—	—	✓	—	—
`deepseek-v4-flash`	1M	384k *	*	✓	—	—	—	✓	—	—
`deepseek-v3.2`	128k	64k	32k	✓	—	—	—	—	—	—
`kimi-k2.5`	256k	96k	80k	✓	—	—	—	—	—	—
`glm-5`	198k	16k	32k	✓	—	✓	—	—	—	—
`glm-4.7`	198k	16k	32k	✓	—	✓	—	—	—	—
`MiniMax-M2.5`	192k	32k	32k *	✓	—	—	—	—	—	—

* DeepSeek V4 models share a 384k total budget across output and thinking. MiniMax-M2.5 shares a single 32k limit for both CoT and final output.

Legacy

Previous generation models. We recommend Qwen3.6 for new projects.

Qwen3

Model ID	Context	Max Output	Thinking Budget	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen3-max`	256k	64k	80k	✓	✓	✓	—	✓	✓	✓
`qwen3-max-2026-01-23`	256k	64k	80k	✓	✓	✓	—	—	—	—
`qwen3-max-preview`	256k	64k	80k	✓	✓	✓	—	—	✓	—
`qwen3-max-2025-09-23`	256k	64k	—	✓	✓	✓	—	—	—	—
`qwen3-235b-a22b`	128k	16k	38k	✓	—	✓	—	—	—	—
`qwen3-235b-a22b-thinking-2507`	128k	32k	80k	✓	—	—	—	—	—	—
`qwen3-235b-a22b-instruct-2507`	128k	32k	—	✓	—	✓	—	—	—	—
`qwen3-next-80b-a3b-thinking`	128k	32k	80k	✓	—	—	—	—	—	—
`qwen3-next-80b-a3b-instruct`	128k	32k	—	✓	—	✓	—	—	—	—
`qwen3-32b`	128k	16k	38k	✓	—	✓	—	—	—	—
`qwen3-30b-a3b`	128k	16k	38k	✓	—	✓	—	—	—	—
`qwen3-30b-a3b-thinking-2507`	128k	32k	80k	✓	—	—	—	—	—	—
`qwen3-30b-a3b-instruct-2507`	128k	32k	—	✓	—	✓	—	—	—	—
`qwen3-14b`	128k	8k	38k	✓	—	✓	—	—	—	—
`qwen3-8b`	128k	8k	38k	✓	—	✓	—	—	—	—
`qwen3-4b`	128k	8k	38k	✓	—	✓	—	—	—	—
`qwen3-1.7b`	32k	8k	30k	✓	—	✓	—	—	—	—
`qwen3-0.6b`	32k	8k	30k	✓	—	✓	—	—	—	—

Qwen3-Coder

Model ID	Context	Max Output	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen3-coder-plus`	1M	64k	✓	—	✓	—	✓	✓	✓
`qwen3-coder-plus-2025-09-23`	1M	64k	✓	—	✓	—	—	—	—
`qwen3-coder-plus-2025-07-22`	1M	64k	✓	—	✓	—	—	—	—
`qwen3-coder-flash`	1M	64k	✓	—	✓	—	✓	✓	✓
`qwen3-coder-flash-2025-07-28`	1M	64k	✓	—	✓	—	—	—	—
`qwen3-coder-next`	256k	64k	✓	—	✓	—	—	—	—
`qwen3-coder-480b-a35b-instruct`	256k	64k	✓	—	✓	—	—	—	—
`qwen3-coder-30b-a3b-instruct`	256k	64k	✓	—	✓	—	—	—	—

Qwen2.5 (open source)

Model ID	Context	Max Output	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen2.5-omni-7b`	32k	8k	✓	—	✓	—	—	—	—
`qwen2.5-vl-72b-instruct`	128k	8k	✓	—	✓	—	—	—	—
`qwen2.5-vl-32b-instruct`	128k	8k	✓	—	✓	—	—	—	—
`qwen2.5-vl-7b-instruct`	128k	8k	✓	—	✓	—	—	—	—
`qwen2.5-vl-3b-instruct`	128k	8k	✓	—	✓	—	—	—	—
`qwen2.5-72b-instruct`	32k	8k	✓	—	✓	—	—	—	—
`qwen2.5-32b-instruct`	32k	8k	✓	—	✓	—	—	—	—
`qwen2.5-14b-instruct`	32k	8k	✓	—	✓	—	—	—	—
`qwen2.5-14b-instruct-1m`	1M	8k	✓	—	✓	—	—	—	—
`qwen2.5-7b-instruct`	32k	8k	✓	—	✓	—	—	—	—
`qwen2.5-7b-instruct-1m`	1M	8k	✓	—	✓	—	—	—	—

Legacy (qwen-plus/max/flash/turbo)

Model ID	Context	Max Output	Thinking Budget	Function calling	Built-in tools	Structured output	Batch	Explicit cache	Implicit cache	Session cache
`qwen-plus`	1M	32k	80k	✓	—	✓	✓	✓	✓	✓
`qwen-plus-latest`	1M	32k	80k	✓	—	✓	✓	—	—	—
`qwen-plus-2025-12-01`	1M	32k	80k	✓	—	✓	✓	—	—	—
`qwen-plus-2025-09-11`	1M	32k	80k	✓	—	✓	✓	—	—	—
`qwen-plus-2025-07-28`	1M	32k	80k	✓	—	✓	✓	—	—	—
`qwen-plus-2025-07-14`	128k	16k	80k	✓	—	✓	✓	—	—	—
`qwen-plus-2025-04-28`	128k	16k	80k	✓	—	✓	✓	—	—	—
`qwen-plus-2025-01-25`	128k	8k	—	✓	—	✓	✓	—	—	—
`qwen-max`	32k	8k	—	✓	—	✓	✓	—	✓	—
`qwen-max-latest`	32k	8k	—	✓	—	✓	✓	—	—	—
`qwen-max-2025-01-25`	32k	8k	—	✓	—	✓	✓	—	—	—
`qwen-flash`	1M	32k	80k	✓	—	✓	✓	✓	✓	✓
`qwen-flash-2025-07-28`	1M	32k	80k	✓	—	✓	✓	—	—	—
`qwen-turbo`	128k	16k	38k	✓	—	✓	✓	—	✓	—
`qwen-turbo-latest`	128k	16k	38k	✓	—	✓	✓	—	—	—
`qwen-turbo-2025-04-28`	128k	16k	38k	✓	—	✓	✓	—	—	—
`qwen-turbo-2024-11-01`	1M	8k	—	✓	—	✓	✓	—	—	—
`qwq-plus`	128k	8k	32k	—	—	—	—	—	—	—
`qvq-max`	128k	8k	80k	—	—	—	—	—	—	—
`qvq-max-latest`	128k	8k	80k	—	—	—	—	—	—	—
`qvq-max-2025-03-25`	128k	8k	80k	—	—	—	—	—	—	—
`qwen-omni-turbo`	32k	2k	80k	—	—	—	—	—	—	—
`qwen-omni-turbo-latest`	32k	2k	80k	—	—	—	—	—	—	—
`qwen-omni-turbo-2025-03-26`	32k	2k	80k	—	—	—	—	—	—	—

Text generation models

Using OpenClaw or Claude Code?

For other applications

Context window

Thinking mode

Function calling + built-in tools

Structured output

Batch

Recommended models

All models

Translation

Character roleplay

Qwen3

Qwen3-Coder

Qwen2.5 (open source)

Legacy (qwen-plus/max/flash/turbo)

Learn more

Model selection guide

Try free

​Using OpenClaw or Claude Code?

​For other applications

​Context window

​Thinking mode

​Function calling + built-in tools

​Structured output

​Batch

​Recommended models

​All models

Translation

Character roleplay

Qwen3

Qwen3-Coder

Qwen2.5 (open source)

Legacy (qwen-plus/max/flash/turbo)

​Learn more

Model selection guide

Try free

Using OpenClaw or Claude Code?

For other applications

Context window

Thinking mode

Function calling + built-in tools

Structured output

Batch

Recommended models

All models

Learn more