Skip to main content
Text generation

Text generation models

Choose a model for AI agents, chatbots, document processing, and more.

Using OpenClaw or Claude Code?

qwen3.6-plus — strongest reasoning, full tool support, 1M context for large codebases. Token Plan also includes glm-5, MiniMax-M2.5, deepseek-v4-pro, deepseek-v4-flash, and deepseek-v3.2.

For other applications

Chatbots, content generation, summarization, document processing — start with qwen3.6-plus for strongest accuracy, 1M context, and the full feature set. Once your use case works well, try qwen3.6-flash to reduce cost — near-flagship quality with the same context and features.

Context window

1M tokens is roughly 750,000 words or 10 novels.
  • Long documents or large codebases → qwen3.6-plus / qwen3.6-flash (1M)
  • Standard tasks → 128k–256k is plenty

Thinking mode

Step-by-step reasoning for multi-step math, debugging, architecture planning, or legal cross-referencing. Toggle with enable_thinking. All Qwen3+ models support it — most are hybrid, so you can switch per request.

Function calling + built-in tools

Let the model take actions: check weather, query a database, book a meeting.
  • Function calling (you define tools, model calls them): all general-purpose models
  • Built-in tools (web search, code execution — no setup): qwen3.6-plus, qwen3.6-flash, qwen3.5-plus, qwen3.5-flash, qwen3-max only

Structured output

Get valid JSON back — e.g., extract names and dates from text. Qwen3.6, Qwen3.5, Qwen3, Qwen3-Coder, Qwen2.5, and legacy (Plus/Max/Flash/Turbo) — non-thinking mode.

Batch

Thousands of requests, latency not critical. Lowers cost per request. Legacy models only: qwen-plus, qwen-flash, qwen-turbo.
ModelContextThinkingFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen3.6-plus1M
qwen3.6-flash1M
qwen3.6-max-preview256k
deepseek-v4-pro1M
deepseek-v4-flash1M
deepseek-v3.2128k
kimi-k2.5256k
glm-5198k
MiniMax-M2.5192k
Coding Plan only — not available for pay-as-you-go or Token Plan. ‡ Available through Token Plan and Coding Plan, not pay-as-you-go.

All models

Model IDContextMax OutputThinking BudgetFunction callingBuilt-in toolsStructured outputBatchToken PlanExplicit cacheImplicit cacheSession cache
qwen3.6-max-preview256k64k128k
qwen3.6-flash1M64k80k
qwen3.6-flash-2026-04-161M64k80k
qwen3.6-35b-a3b256k64k80k
qwen3.6-27b256k64k80k
qwen3.6-plus1M64k80k
qwen3.6-plus-2026-04-021M64k80k
Model IDContextMax OutputThinking BudgetFunction callingBuilt-in toolsStructured outputBatchCoding PlanExplicit cacheImplicit cacheSession cache
qwen3.5-plus1M64k80k
qwen3.5-plus-2026-04-201M64k80k
qwen3.5-plus-2026-02-151M64k80k
qwen3.5-flash1M64k80k
qwen3.5-flash-2026-02-231M64k80k
qwen3.5-397b-a17b256k64k80k
qwen3.5-122b-a10b256k64k80k
qwen3.5-27b256k64k80k
qwen3.5-35b-a3b256k64k80k

Translation

Model IDContextMax OutputFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen-mt-plus16k8k
qwen-mt-turbo16k8k
qwen-mt-flash16k8k
qwen-mt-lite16k8k

Character roleplay

Model IDContextMax OutputFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen-plus-character32k4k
qwen-plus-character-ja8k4k
qwen-flash-character8k4k
Non-Qwen models available through the same API.
deepseek-v4-pro and deepseek-v4-flash are available through pay-as-you-go. deepseek-v3.2 is available through pay-as-you-go, Token Plan, and Coding Plan. glm-5 and MiniMax-M2.5 are available through Token Plan and Coding Plan. kimi-k2.5 and glm-4.7 are available exclusively through Coding Plan.
Model IDContextMax OutputThinking BudgetFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
deepseek-v4-pro1M384k **
deepseek-v4-flash1M384k **
deepseek-v3.2128k64k32k
kimi-k2.5256k96k80k
glm-5198k16k32k
glm-4.7198k16k32k
MiniMax-M2.5192k32k32k *
* DeepSeek V4 models share a 384k total budget across output and thinking. MiniMax-M2.5 shares a single 32k limit for both CoT and final output.
Previous generation models. We recommend Qwen3.6 for new projects.

Qwen3

Model IDContextMax OutputThinking BudgetFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen3-max256k64k80k
qwen3-max-2026-01-23256k64k80k
qwen3-max-preview256k64k80k
qwen3-max-2025-09-23256k64k
qwen3-235b-a22b128k16k38k
qwen3-235b-a22b-thinking-2507128k32k80k
qwen3-235b-a22b-instruct-2507128k32k
qwen3-next-80b-a3b-thinking128k32k80k
qwen3-next-80b-a3b-instruct128k32k
qwen3-32b128k16k38k
qwen3-30b-a3b128k16k38k
qwen3-30b-a3b-thinking-2507128k32k80k
qwen3-30b-a3b-instruct-2507128k32k
qwen3-14b128k8k38k
qwen3-8b128k8k38k
qwen3-4b128k8k38k
qwen3-1.7b32k8k30k
qwen3-0.6b32k8k30k

Qwen3-Coder

Model IDContextMax OutputFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen3-coder-plus1M64k
qwen3-coder-plus-2025-09-231M64k
qwen3-coder-plus-2025-07-221M64k
qwen3-coder-flash1M64k
qwen3-coder-flash-2025-07-281M64k
qwen3-coder-next256k64k
qwen3-coder-480b-a35b-instruct256k64k
qwen3-coder-30b-a3b-instruct256k64k

Qwen2.5 (open source)

Model IDContextMax OutputFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen2.5-omni-7b32k8k
qwen2.5-vl-72b-instruct128k8k
qwen2.5-vl-32b-instruct128k8k
qwen2.5-vl-7b-instruct128k8k
qwen2.5-vl-3b-instruct128k8k
qwen2.5-72b-instruct32k8k
qwen2.5-32b-instruct32k8k
qwen2.5-14b-instruct32k8k
qwen2.5-14b-instruct-1m1M8k
qwen2.5-7b-instruct32k8k
qwen2.5-7b-instruct-1m1M8k

Legacy (qwen-plus/max/flash/turbo)

Model IDContextMax OutputThinking BudgetFunction callingBuilt-in toolsStructured outputBatchExplicit cacheImplicit cacheSession cache
qwen-plus1M32k80k
qwen-plus-latest1M32k80k
qwen-plus-2025-12-011M32k80k
qwen-plus-2025-09-111M32k80k
qwen-plus-2025-07-281M32k80k
qwen-plus-2025-07-14128k16k80k
qwen-plus-2025-04-28128k16k80k
qwen-plus-2025-01-25128k8k
qwen-max32k8k
qwen-max-latest32k8k
qwen-max-2025-01-2532k8k
qwen-flash1M32k80k
qwen-flash-2025-07-281M32k80k
qwen-turbo128k16k38k
qwen-turbo-latest128k16k38k
qwen-turbo-2025-04-28128k16k38k
qwen-turbo-2024-11-011M8k
qwq-plus128k8k32k
qvq-max128k8k80k
qvq-max-latest128k8k80k
qvq-max-2025-03-25128k8k80k
qwen-omni-turbo32k2k80k
qwen-omni-turbo-latest32k2k80k
qwen-omni-turbo-2025-03-2632k2k80k

Learn more

Text generation models - Qwen Cloud