Pay-as-you-go pricing for API usage
New users get a free quota to try models at no cost. See Free quota for details.
The prices listed below are list prices. For current promotions and discounted pricing, visit the Model Marketplace.
Text generation
Billed per million tokens. Models with long-context support use tiered pricing — the more input tokens in a single request, the higher the per-token rate.
| Model | Input tokens per request | Input price (per 1M tokens) | Output price (per 1M tokens) |
|---|---|---|---|
| qwen3.7-max | 0 – 991K | $2.50 | $7.50 |
| qwen3.6-max-preview | ≤ 128K | $1.30 | $7.80 |
| qwen3.6-max-preview | 128K – 256K | $2.00 | $12.00 |
| qwen3.7-plus | ≤ 256K | $0.40 | $1.60 |
| qwen3.7-plus | 256K – 1M | $1.20 | $4.80 |
| qwen3.6-flash | ≤ 256K | $0.25 | $1.50 |
| qwen3.6-flash | 256K – 1M | $1.00 | $4.00 |
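
To make the tiered billing concrete, here is a minimal sketch of a cost estimator for a single request. It rests on assumptions that the table does not spell out: that "K" means 1,000 tokens, and that the tier selected by the request's input token count sets the rate for every input and output token in that request. The names `TIERS` and `estimate_cost` are hypothetical helpers, not part of any SDK.

```python
# Tier tables copied from the pricing table above:
# (max input tokens per request, input USD per 1M tokens, output USD per 1M tokens).
# Assumption: "256K" means 256,000 tokens and "1M" means 1,000,000 tokens.
TIERS = {
    "qwen3.7-plus": [(256_000, 0.40, 1.60), (1_000_000, 1.20, 4.80)],
    "qwen3.6-flash": [(256_000, 0.25, 1.50), (1_000_000, 1.00, 4.00)],
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD list price of one request.

    Assumption: the tier is picked from the input token count and its
    rates apply to the whole request (not marginally per token band).
    """
    for max_input, input_rate, output_rate in TIERS[model]:
        if input_tokens <= max_input:
            total = input_tokens * input_rate + output_tokens * output_rate
            return total / 1_000_000
    raise ValueError(f"{input_tokens} input tokens exceeds the largest tier for {model}")


# A short prompt stays in the first tier; a long one moves to the second.
print(f"{estimate_cost('qwen3.7-plus', 100_000, 2_000):.4f}")  # 0.0432
print(f"{estimate_cost('qwen3.7-plus', 300_000, 2_000):.4f}")  # 0.3696
```

Under these assumptions, crossing a tier boundary raises the rate for the entire request, so splitting very long inputs may be worth comparing against a single long-context call.
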
Images & videos
Understanding
Vision understanding is billed per token. Qwen text generation models (qwen3.7-plus, etc.) support vision input at the same token price listed above. Dedicated vision models have separate pricing:
| Model | Input tokens per request | Input price (per 1M tokens) | Output price (per 1M tokens) |
|---|---|---|---|
| qwen3-vl-plus | ≤ 32K | $0.20 | $1.60 |
| qwen3-vl-plus | 32K – 128K | $0.30 | $2.40 |
| qwen3-vl-plus | 128K – 256K | $0.60 | $4.80 |
| qwen3-vl-flash | ≤ 32K | $0.05 | $0.40 |
| qwen3-vl-flash | 32K – 128K | $0.075 | $0.60 |
| qwen3-vl-flash | 128K – 256K | $0.12 | $0.96 |

Image inputs are converted to tokens at the following rates:
| Model family | Image conversion | Example (1024×1024) |
|---|---|---|
| Qwen (qwen3.7-plus, etc.) | 1 token per 32×32 pixels | ≈ 1,024 tokens |
| Qwen-VL (qwen3-vl, etc.) | 1 token per 32×32 pixels | ≈ 1,024 tokens |
| Qwen3.5-Omni | 1 token per 32×32 pixels | ≈ 1,024 tokens |
| Qwen3-Omni-Flash | 1 token per 32×32 pixels | ≈ 1,024 tokens |
Generation
Image generation is billed per image (resolution-independent). Video generation is billed per second of output video.
Image generation
| Model | Price per image |
|---|---|
| qwen-image-2.0-pro | $0.075 |
| qwen-image-2.0 | $0.035 |
| qwen-image-edit | $0.045 |
| wan2.6-t2i | $0.03 |
| wan2.6-image | $0.03 |
| z-image-turbo | $0.015 (prompt rewrite off) / $0.03 (on) |

Video generation
| Model | Price per second |
|---|---|
| wan2.6-t2v | $0.10 |
| wan2.6-i2v | $0.10 |
| wan2.6-i2v-flash | $0.05 |
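
As a rough illustration of per-image and per-second billing, the sketch below totals a small generation job. The prices are copied from the tables above; the helper names `image_cost` and `video_cost` are hypothetical, and the sketch assumes video is billed linearly per second of output with no minimum or rounding.

```python
# List prices copied from the tables above (USD).
IMAGE_PRICE = {
    "qwen-image-2.0-pro": 0.075,
    "qwen-image-2.0": 0.035,
    "qwen-image-edit": 0.045,
    "wan2.6-t2i": 0.03,
    "wan2.6-image": 0.03,
}
VIDEO_PRICE_PER_SECOND = {
    "wan2.6-t2v": 0.10,
    "wan2.6-i2v": 0.10,
    "wan2.6-i2v-flash": 0.05,
}


def image_cost(model: str, images: int) -> float:
    """Cost of generating `images` images; resolution does not change the price."""
    return IMAGE_PRICE[model] * images


def video_cost(model: str, seconds: float) -> float:
    """Cost of `seconds` of output video (assumed linear, no rounding)."""
    return VIDEO_PRICE_PER_SECOND[model] * seconds


total = image_cost("qwen-image-2.0", 20) + video_cost("wan2.6-i2v-flash", 15)
print(f"{total:.2f}")  # 1.45
```
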
Audio & speech
Text to speech
Billed per 10,000 characters of input text.
| Model | Price per 10K chars |
|---|---|
| cosyvoice-v3-plus | $0.26 |
| cosyvoice-v3-flash | $0.13 |
| qwen3-tts-flash | $0.10 |
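
A minimal sketch of the per-10,000-character billing follows. It assumes the charge is pro-rated linearly rather than rounded up to whole 10K blocks, and it takes the character count as given, since how characters are counted (for example for non-Latin scripts) is not stated here. `tts_cost` is a hypothetical helper.

```python
# Price per 10,000 input characters, copied from the table above (USD).
TTS_PRICE_PER_10K = {
    "cosyvoice-v3-plus": 0.26,
    "cosyvoice-v3-flash": 0.13,
    "qwen3-tts-flash": 0.10,
}


def tts_cost(model: str, characters: int) -> float:
    """Estimated cost of synthesizing `characters` characters (assumed pro-rated)."""
    return TTS_PRICE_PER_10K[model] * characters / 10_000


print(f"{tts_cost('cosyvoice-v3-flash', 25_000):.4f}")  # 0.3250
```
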
Speech to text
Billed per second of audio input.
| Model | Price per second |
|---|---|
| fun-asr | $0.000035 |
| fun-asr-realtime | $0.00009 |
| qwen3-asr-flash | $0.000035 |
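
Per-second prices are easier to compare when scaled to a realistic duration. The sketch below estimates the cost of transcribing one hour of audio with each model; `stt_cost` is a hypothetical helper, and the calculation assumes linear per-second billing with no minimum charge.

```python
# Price per second of audio input, copied from the table above (USD).
STT_PRICE_PER_SECOND = {
    "fun-asr": 0.000035,
    "fun-asr-realtime": 0.00009,
    "qwen3-asr-flash": 0.000035,
}


def stt_cost(model: str, seconds: float) -> float:
    """Estimated cost of transcribing `seconds` of audio (assumed linear)."""
    return STT_PRICE_PER_SECOND[model] * seconds


one_hour = 60 * 60
for model in STT_PRICE_PER_SECOND:
    print(f"{model}: {stt_cost(model, one_hour):.3f}")
# fun-asr: 0.126
# fun-asr-realtime: 0.324
# qwen3-asr-flash: 0.126
```
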
Speech to speech
Qwen-Omni is a multimodal model that handles text, audio, and image/video in a single call. Each input type is converted to tokens at the rates in the first table below, and the tokens are billed at the per-modality prices in the second.
| Input type | Conversion rate |
|---|---|
| Text | Standard tokenizer |
| Audio input | ≈ 7 tokens/sec (Qwen3.5-Omni), 12.5 tokens/sec (Qwen3-Omni-Flash), 25 tokens/sec (Qwen-Omni-Turbo) |
| Audio output | ≈ 12.5 tokens/sec (Qwen3.5-Omni and Qwen3-Omni-Flash) |
| Image/Video | See Understanding section above |

Prices are in USD per million tokens:
| Model | Text/Image/Video input | Audio input | Text output | Text + Audio output |
|---|---|---|---|---|
| qwen3.5-omni-plus | $1.4 | $11 | $8.3 | $44 |
| qwen3.5-omni-flash | $0.4 | $3 | $2.2 | $11.9 |
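
The two tables combine in two steps: convert seconds of audio to tokens, then apply the audio-input price. The sketch below shows this for audio input only. It assumes the prices are per million tokens, that the qwen3.5-omni-plus and qwen3.5-omni-flash models use the Qwen3.5-Omni conversion rate of about 7 tokens per second, and that the result is an estimate because the conversion rate is approximate. `audio_input_cost` is a hypothetical helper.

```python
# Approximate conversion rate from the table above (tokens per second of audio input).
# Assumption: both qwen3.5-omni models use the Qwen3.5-Omni rate.
QWEN35_OMNI_AUDIO_TOKENS_PER_SEC = 7

# Audio-input prices from the table above. Assumed unit: USD per 1M tokens.
AUDIO_INPUT_PRICE_PER_1M = {
    "qwen3.5-omni-plus": 11.0,
    "qwen3.5-omni-flash": 3.0,
}


def audio_input_cost(model: str, seconds: float) -> float:
    """Estimate the cost of `seconds` of audio input for a Qwen3.5-Omni model."""
    tokens = seconds * QWEN35_OMNI_AUDIO_TOKENS_PER_SEC
    return tokens * AUDIO_INPUT_PRICE_PER_1M[model] / 1_000_000


# One minute of audio is roughly 420 tokens.
print(f"{audio_input_cost('qwen3.5-omni-flash', 60):.5f}")  # 0.00126
```

The actual token counts reported for a request should take precedence over this estimate.
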
Embedding & reranking
Billed per million input tokens (output is not charged). Multimodal embedding models may charge different rates for image vs text input. Image/video token conversion for embedding models is handled internally — check the usage field in the API response for actual token counts.
| Model | Modality | Price per 1M tokens |
|---|---|---|
| text-embedding-v4 | Text | $0.07 |
| tongyi-embedding-vision-plus | All | $0.09 |
| tongyi-embedding-vision-flash | Image/Video | $0.03 |
| tongyi-embedding-vision-flash | Text | $0.09 |
| qwen3-rerank | Text | $0.10 |
Built-in tools
Some built-in tools incur per-call fees in addition to model token costs.
| Tool | Fee | Notes |
|---|---|---|
| Web Search | $10 / 1K calls | |
| Web Extractor | FREE | Limited time |
| Code Interpreter | FREE | Limited time |
| Image Search | $8 / 1K calls | Text-to-image and image-to-image |
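
Because tool fees are quoted per 1,000 calls, a quick conversion helps when budgeting an agent workload. The sketch below assumes fees are pro-rated per call rather than charged in blocks of 1,000; `tool_fee` is a hypothetical helper, and the free tools are listed at zero only for as long as the limited-time offer applies.

```python
# Fees per 1,000 calls, copied from the table above (USD).
TOOL_FEE_PER_1K_CALLS = {
    "Web Search": 10.0,
    "Web Extractor": 0.0,     # free for a limited time
    "Code Interpreter": 0.0,  # free for a limited time
    "Image Search": 8.0,
}


def tool_fee(tool: str, calls: int) -> float:
    """Estimated tool fee for `calls` invocations, excluding model token costs."""
    return TOOL_FEE_PER_1K_CALLS[tool] * calls / 1_000


print(f"{tool_fee('Web Search', 2_500):.2f}")  # 25.00
print(f"{tool_fee('Image Search', 500):.2f}")  # 4.00
```
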
Free quota
New users get a free quota upon sign-up, typically valid for 90 days. The quota applies to real-time API calls only. Learn more →
Save on costs
- Batch API — 50% off for async workloads. Learn more →
- Context caching — Reuse long prompts at reduced cost. Learn more →
- Model selection — Match model tier to task complexity. Compare models →
Batch and cache discounts cannot be combined on the same request.
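
The batch discount and the no-stacking rule can be sketched as follows. This assumes the 50% discount applies to the full list price of a batch request; the context-caching discount rate is not given on this page, so it is deliberately left out. `batch_cost` and `check_discounts` are hypothetical helpers.

```python
BATCH_DISCOUNT = 0.50  # 50% off for async workloads, from the list above


def check_discounts(use_batch: bool, use_cache: bool) -> None:
    """Reject a request plan that tries to combine both discounts."""
    if use_batch and use_cache:
        raise ValueError("batch and cache discounts cannot be combined on the same request")


def batch_cost(list_cost: float) -> float:
    """List price after the batch discount (assumed to cover the whole request)."""
    return list_cost * (1 - BATCH_DISCOUNT)


check_discounts(use_batch=True, use_cache=False)  # passes silently
print(f"{batch_cost(12.00):.2f}")  # 6.00
```
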
Learn more
- Model Marketplace — Complete pricing for all models
- Free quota — Eligibility and activation
- Cost optimization — Advanced strategies
- Token Plan — Credits-based pricing for AI coding tools
- Coding Plan — Fixed monthly pricing for AI coding tools
- Billing FAQ — Common questions
- Bill management — View usage and invoices