
Text generation models

Choose a model for AI agents, chatbots, document processing, and more.

Using OpenClaw or Claude Code?

qwen3.6-plus — strongest reasoning, full tool support, 1M context for large codebases. Coding Plan users can also choose kimi-k2.5, glm-5, or MiniMax-M2.5, all fine-tuned for agent workflows.

For other applications

Chatbots, content generation, summarization, document processing — start with qwen3.6-plus for strongest accuracy, 1M context, and the full feature set. Once your use case works well, try qwen3.5-flash to reduce cost — near-flagship quality with the same context and features.

Context window

1M tokens is roughly 750,000 words or 10 novels.
  • Long documents or large codebases → qwen3.6-plus / qwen3.5-flash (1M)
  • Standard tasks → 128k–256k is plenty
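The "1M tokens ≈ 750,000 words" rule of thumb can be turned into a quick fit check. A minimal sketch, assuming the common heuristic that one token is about 0.75 English words (an approximation, not a real tokenizer count):

```python
# Rough token estimate: 1 token ~ 0.75 English words (heuristic, not a tokenizer).
WORDS_PER_TOKEN = 0.75

def estimated_tokens(word_count: int) -> int:
    """Approximate token count from a word count."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_in_context(word_count: int, context_tokens: int) -> bool:
    """True if text of this length should fit in the model's context window."""
    return estimated_tokens(word_count) <= context_tokens

# 750,000 words comes out to ~1M tokens, matching the rule of thumb above.
print(estimated_tokens(750_000))            # 1000000
print(fits_in_context(750_000, 1_000_000))  # True
print(fits_in_context(750_000, 256_000))    # False
```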

Thinking mode

Step-by-step reasoning for multi-step math, debugging, architecture planning, or legal cross-referencing. Toggle it with enable_thinking. All Qwen3 and later models support it — most are hybrid, so you can switch it on or off per request.
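Because hybrid models accept the toggle per request, you can reason through hard tasks and skip the overhead on easy ones. A minimal sketch, assuming an OpenAI-compatible chat-completions request body (the `enable_thinking` flag is the toggle named above; the surrounding payload shape is an assumption):

```python
import json

def build_request(prompt: str, think: bool) -> dict:
    """Build a chat request, toggling thinking mode per call.

    Assumes an OpenAI-compatible body; only `enable_thinking` is from this doc.
    """
    return {
        "model": "qwen3.6-plus",
        "messages": [{"role": "user", "content": prompt}],
        "enable_thinking": think,  # step-by-step reasoning on/off
    }

# Reason through the hard task...
hard = build_request("Plan a migration from a monolith to services.", think=True)
# ...and skip the overhead for the easy one.
easy = build_request("Say hello.", think=False)
print(json.dumps(hard, indent=2))
```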

Function calling + built-in tools

Let the model take actions: check weather, query a database, book a meeting.
  • Function calling (you define tools, model calls them): all general-purpose models
  • Built-in tools (web search, code execution — no setup): qwen3.6-plus, qwen3.5-plus, qwen3.5-flash, qwen3-max only
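With function calling you describe your tools in a JSON schema, the model decides when to call them, and your code executes the call. A minimal sketch using the weather example above (the `get_weather` name, its parameters, and the dispatch helper are illustrative, not a real built-in tool):

```python
# OpenAI-style tool definition: the model sees this schema and may emit a
# call to it; `get_weather` and its parameters are hypothetical.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Berlin'"},
            },
            "required": ["city"],
        },
    },
}

def dispatch(tool_call: dict) -> str:
    """Route a model-issued tool call to local code (stub implementation)."""
    if tool_call["name"] == "get_weather":
        args = tool_call["arguments"]
        return f"Weather in {args['city']}: sunny"  # stubbed result
    raise ValueError(f"unknown tool {tool_call['name']}")

# Simulate the model asking for the tool, then run it locally.
print(dispatch({"name": "get_weather", "arguments": {"city": "Berlin"}}))
```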

Structured output

Get valid JSON back — e.g., extract names and dates from text. Supported by Qwen3.5, Qwen3, Qwen3-Coder, Qwen2.5, and the legacy Plus/Max/Flash/Turbo models, in non-thinking mode only.
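On the consuming side, a structured-output reply can be parsed and validated directly. A minimal sketch (the `reply` string stands in for a real API response; the expected keys are from the names-and-dates example above):

```python
import json

# Stand-in for a model reply produced with structured output enabled
# (non-thinking mode, per the doc).
reply = '{"names": ["Ada Lovelace", "Charles Babbage"], "dates": ["1815-12-10"]}'

def parse_extraction(raw: str) -> dict:
    """Parse and minimally validate the model's JSON reply."""
    data = json.loads(raw)  # raises ValueError on invalid JSON
    if not {"names", "dates"} <= data.keys():
        raise ValueError("missing expected keys")
    return data

result = parse_extraction(reply)
print(result["names"])  # ['Ada Lovelace', 'Charles Babbage']
```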

Batch

Submit thousands of requests when latency isn't critical, at a lower cost per request. Legacy models only: qwen-plus, qwen-flash, qwen-turbo.
Model            Context
qwen3.6-plus     1M
qwen3.5-flash    1M
qwen3-max        256k
deepseek-v3.2    128k
kimi-k2.5 †      256k
glm-5 †          198k
MiniMax-M2.5 †   192k

† Coding Plan only — not available for pay-as-you-go API access. See Coding Plan.

All models

Model ID                  Context   Max Output   Thinking Budget
qwen3.6-plus              1M        64k          80k
qwen3.6-plus-2026-04-02   1M        64k          80k
Model ID                   Context   Max Output   Thinking Budget
qwen3.5-plus               1M        64k          80k
qwen3.5-plus-2026-02-15    1M        64k          80k
qwen3.5-flash              1M        64k          80k
qwen3.5-flash-2026-02-23   1M        64k          80k
qwen3.5-397b-a17b          256k      64k          80k
qwen3.5-122b-a10b          256k      64k          80k
qwen3.5-27b                256k      64k          80k
qwen3.5-35b-a3b            256k      64k          80k
Model ID                        Context   Max Output   Thinking Budget
qwen3-max                       256k      64k          80k
qwen3-max-2026-01-23            256k      64k          80k
qwen3-max-preview               256k      64k          80k
qwen3-max-2025-09-23            256k      64k          -
qwen3-235b-a22b                 128k      16k          38k
qwen3-235b-a22b-thinking-2507   128k      32k          80k
qwen3-235b-a22b-instruct-2507   128k      32k          -
qwen3-next-80b-a3b-thinking     128k      32k          80k
qwen3-next-80b-a3b-instruct     128k      32k          -
qwen3-32b                       128k      16k          38k
qwen3-30b-a3b                   128k      16k          38k
qwen3-30b-a3b-thinking-2507     128k      32k          80k
qwen3-30b-a3b-instruct-2507     128k      32k          -
qwen3-14b                       128k      8k           38k
qwen3-8b                        128k      8k           38k
qwen3-4b                        128k      8k           38k
qwen3-1.7b                      32k       8k           30k
qwen3-0.6b                      32k       8k           30k
Model ID                         Context   Max Output
qwen3-coder-plus                 1M        64k
qwen3-coder-plus-2025-09-23      1M        64k
qwen3-coder-plus-2025-07-22      1M        64k
qwen3-coder-flash                1M        64k
qwen3-coder-flash-2025-07-28     1M        64k
qwen3-coder-next                 256k      64k
qwen3-coder-480b-a35b-instruct   256k      64k
qwen3-coder-30b-a3b-instruct     256k      64k
Model ID                  Context   Max Output
qwen2.5-omni-7b           32k       8k
qwen2.5-vl-72b-instruct   128k      8k
qwen2.5-vl-32b-instruct   128k      8k
qwen2.5-vl-7b-instruct    128k      8k
qwen2.5-vl-3b-instruct    128k      8k
qwen2.5-72b-instruct      32k       8k
qwen2.5-32b-instruct      32k       8k
qwen2.5-14b-instruct      32k       8k
qwen2.5-14b-instruct-1m   1M        8k
qwen2.5-7b-instruct       32k       8k
qwen2.5-7b-instruct-1m    1M        8k
Non-Qwen models available through the same API.
kimi-k2.5, glm-5, glm-4.7, and MiniMax-M2.5 are available exclusively through Coding Plan and do not support pay-as-you-go API access.
Model ID        Context   Max Output   Thinking Budget
deepseek-v3.2   128k      64k          32k
kimi-k2.5       256k      96k          80k
glm-5           198k      16k          32k
glm-4.7         198k      16k          32k
MiniMax-M2.5    192k      32k          32k *
* MiniMax-M2.5 shares a single 32k limit for both CoT and final output.
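A shared limit means heavy reasoning eats into answer length. A small sketch of the arithmetic, assuming 32k means 32,768 tokens (an assumption; the doc does not give the exact figure):

```python
# MiniMax-M2.5: one shared budget covers both chain-of-thought and the
# final answer. 32_768 is an assumed value for "32k".
SHARED_LIMIT = 32_768

def remaining_output_budget(cot_tokens: int) -> int:
    """Tokens left for the final answer after chain-of-thought spend."""
    return max(SHARED_LIMIT - cot_tokens, 0)

print(remaining_output_budget(20_000))  # 12768
print(remaining_output_budget(40_000))  # 0 (reasoning alone exhausted the budget)
```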
Model ID        Context   Max Output
qwen-mt-plus    16k       8k
qwen-mt-turbo   16k       8k
qwen-mt-flash   16k       8k
qwen-mt-lite    16k       8k
Model ID                 Context   Max Output
qwen-plus-character      32k       4k
qwen-plus-character-ja   8k        4k
qwen-flash-character     8k        4k
Previous generation models. We recommend Qwen3.5 or Qwen3 for new projects.
Model ID                     Context   Max Output   Thinking Budget
qwen-plus                    1M        32k          80k
qwen-plus-latest             1M        32k          80k
qwen-plus-2025-12-01         1M        32k          80k
qwen-plus-2025-09-11         1M        32k          80k
qwen-plus-2025-07-28         1M        32k          80k
qwen-plus-2025-07-14         128k      16k          80k
qwen-plus-2025-04-28         128k      16k          80k
qwen-plus-2025-01-25         128k      8k           -
qwen-max                     32k       8k           -
qwen-max-latest              32k       8k           -
qwen-max-2025-01-25          32k       8k           -
qwen-flash                   1M        32k          80k
qwen-flash-2025-07-28        1M        32k          80k
qwen-turbo                   128k      16k          38k
qwen-turbo-latest            128k      16k          38k
qwen-turbo-2025-04-28        128k      16k          38k
qwen-turbo-2024-11-01        1M        8k           -
qwq-plus                     128k      8k           32k
qvq-max                      128k      8k           80k
qvq-max-latest               128k      8k           80k
qvq-max-2025-03-25           128k      8k           80k
qwen-omni-turbo              32k       2k           80k
qwen-omni-turbo-latest       32k       2k           80k
qwen-omni-turbo-2025-03-26   32k       2k           80k
