Choose a model for speech synthesis, voice cloning, and voice design.
Two questions narrow the field: do you need a custom voice or will a built-in voice work, and do you need real-time streaming?
Built-in or custom voice?
Built-in voices
Pick a voice from the library and start synthesizing immediately.
- CosyVoice: rich voice library, high quality, no setup beyond picking a voice
- Qwen3-TTS: low-latency streaming; add `-instruct` for natural-language control over speed, emotion, and style
Custom voice
Need a voice that doesn't exist in the library?
- Voice Cloning — reproduce a specific person's voice from audio samples. Use when you have a target voice to match.
- Voice Design — create a new voice from a text description (e.g., "a warm, low-pitched female voice"). Use when you want a brand voice without audio samples.
Controlling how the voice sounds
Three approaches, ranked by flexibility:
- Instruction control (`qwen3-tts-instruct-flash`, `qwen3-tts-instruct-flash-realtime`): Describe the desired delivery in natural language. Control speed, emotion, and style per request. Most flexible.
- Voice design (`qwen3-tts-vd-*`): Generate a custom voice from a text description. Good for creating a brand voice without audio samples.
- Voice cloning (`qwen3-tts-vc-*`): Reproduce an existing voice from audio samples. Best when you need to match a specific person's voice.
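The name patterns in the list above can be checked in code, for example to label a configured model ID with the control approach it supports. The following is a minimal sketch: the function name `control_approach`, the label strings, and the fallback label are illustrative, and the pattern `qwen3-tts-instruct-flash*` is an assumption that generalizes the two instruction-control model names listed above.

```python
from fnmatch import fnmatchcase

# Patterns follow the list above. The instruct pattern is an assumption that
# covers both listed instruction-control model names with one wildcard.
CONTROL_PATTERNS = [
    ("qwen3-tts-instruct-flash*", "instruction control"),
    ("qwen3-tts-vd-*", "voice design"),
    ("qwen3-tts-vc-*", "voice cloning"),
]


def control_approach(model_id: str) -> str:
    """Return the control approach implied by a model ID (illustrative helper)."""
    for pattern, label in CONTROL_PATTERNS:
        # fnmatchcase compares case-sensitively on every platform.
        if fnmatchcase(model_id, pattern):
            return label
    # Hypothetical fallback label for models that match none of the patterns.
    return "built-in voice only"


print(control_approach("qwen3-tts-instruct-flash-realtime"))
print(control_approach("qwen3-tts-vd-realtime-2026-01-15"))
```

Matching on the name keeps the check independent of any API call; it only reflects the naming convention described here and does not verify that a model is available.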
Recommended models
| Model | Family | Streaming | Custom voice | Instruction control |
|---|---|---|---|---|
| cosyvoice-v3-plus | CosyVoice | ✓ | — | — |
| qwen3-tts-flash | Qwen3-TTS | ✓ | — | — |
| qwen3-tts-flash-realtime | Qwen3-TTS | ✓ | — | — |
| qwen3-tts-instruct-flash | Qwen3-TTS | ✓ | — | ✓ |
| qwen3-tts-vc-realtime-2026-01-15 | Voice Cloning | ✓ | ✓ | — |
| qwen3-tts-vd-realtime-2026-01-15 | Voice Design | ✓ | ✓ | — |
All models
CosyVoice
| Model | Streaming | Custom voice | Instruction control |
|---|---|---|---|
| cosyvoice-v3-plus | ✓ | — | — |
| cosyvoice-v3-flash | ✓ | — | — |
Qwen3-TTS
| Model | Streaming | Custom voice | Instruction control |
|---|---|---|---|
| qwen3-tts-flash | ✓ | — | — |
| qwen3-tts-flash-realtime | ✓ | — | — |
| qwen3-tts-instruct-flash | ✓ | — | ✓ |
| qwen3-tts-instruct-flash-realtime | ✓ | — | ✓ |
Voice Cloning & Design
| Model | Streaming | Custom voice | Instruction control |
|---|---|---|---|
| qwen3-tts-vc-2026-01-22 | ✗ | ✓ | — |
| qwen3-tts-vc-realtime-2026-01-15 | ✓ | ✓ | — |
| qwen3-tts-vd-2026-01-26 | ✗ | ✓ | — |
| qwen3-tts-vd-realtime-2026-01-15 | ✓ | ✓ | — |
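The capability columns in the CosyVoice, Qwen3-TTS, and Voice Cloning & Design tables can be treated as data and filtered by the two questions from the introduction (custom voice, streaming), plus instruction control. This is a rough sketch rather than an official client: the `Model` type and the `find_models` helper are hypothetical names, and the rows are copied by hand from the tables in this section, reading ✓ as `True` and both ✗ and the dash as `False`.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    streaming: bool
    custom_voice: bool
    instruction_control: bool


# Rows copied from the three tables in this section.
MODELS = [
    Model("cosyvoice-v3-plus", True, False, False),
    Model("cosyvoice-v3-flash", True, False, False),
    Model("qwen3-tts-flash", True, False, False),
    Model("qwen3-tts-flash-realtime", True, False, False),
    Model("qwen3-tts-instruct-flash", True, False, True),
    Model("qwen3-tts-instruct-flash-realtime", True, False, True),
    Model("qwen3-tts-vc-2026-01-22", False, True, False),
    Model("qwen3-tts-vc-realtime-2026-01-15", True, True, False),
    Model("qwen3-tts-vd-2026-01-26", False, True, False),
    Model("qwen3-tts-vd-realtime-2026-01-15", True, True, False),
]


def find_models(
    *,
    streaming: Optional[bool] = None,
    custom_voice: Optional[bool] = None,
    instruction_control: Optional[bool] = None,
) -> list[str]:
    """Return model names matching every requirement that is not None."""
    wanted = {
        "streaming": streaming,
        "custom_voice": custom_voice,
        "instruction_control": instruction_control,
    }
    return [
        m.name
        for m in MODELS
        if all(getattr(m, key) == value
               for key, value in wanted.items() if value is not None)
    ]


# Example: models that stream and support a custom voice.
print(find_models(streaming=True, custom_voice=True))
```

Leaving an argument as `None` means "no preference", so the same helper answers either question on its own or both together.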
Legacy
Previous generation models. We recommend the latest versions above for new projects.
| Model | Family | Streaming | Custom voice | Instruction control |
|---|---|---|---|---|
| qwen3-tts-flash-2025-11-27 | Qwen3-TTS | ✓ | — | — |
| qwen3-tts-flash-2025-09-18 | Qwen3-TTS | ✓ | — | — |
| qwen3-tts-flash-realtime-2025-11-27 | Qwen3-TTS | ✓ | — | — |
| qwen3-tts-flash-realtime-2025-09-18 | Qwen3-TTS | ✓ | — | — |
| qwen3-tts-instruct-flash-2026-01-26 | Qwen3-TTS | ✓ | — | ✓ |
| qwen3-tts-instruct-flash-realtime-2026-01-22 | Qwen3-TTS | ✓ | — | ✓ |
| qwen3-tts-vc-realtime-2025-11-27 | Voice Cloning | ✓ | ✓ | — |
| qwen3-tts-vd-realtime-2025-12-16 | Voice Design | ✓ | ✓ | — |
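When a project still references a legacy model ID, one way to find a candidate replacement is to strip the trailing date and look for a current model with the same base name. The sketch below assumes that a `-YYYY-MM-DD` suffix marks a dated snapshot and that a current model sharing the base name is the intended successor; this page does not state that mapping explicitly, so treat the result as a suggestion to verify. The helper names `base_name` and `suggest_current` are hypothetical.

```python
import re
from typing import Optional

# Current model names, copied from the "All models" tables on this page.
CURRENT_MODELS = [
    "cosyvoice-v3-plus",
    "cosyvoice-v3-flash",
    "qwen3-tts-flash",
    "qwen3-tts-flash-realtime",
    "qwen3-tts-instruct-flash",
    "qwen3-tts-instruct-flash-realtime",
    "qwen3-tts-vc-2026-01-22",
    "qwen3-tts-vc-realtime-2026-01-15",
    "qwen3-tts-vd-2026-01-26",
    "qwen3-tts-vd-realtime-2026-01-15",
]

# Assumption: a trailing -YYYY-MM-DD marks a dated snapshot of a model.
_DATE_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def base_name(model_id: str) -> str:
    """Drop a trailing date suffix, if present."""
    return _DATE_SUFFIX.sub("", model_id)


def suggest_current(legacy_id: str) -> Optional[str]:
    """Return a current model with the same base name, or None if there is none."""
    base = base_name(legacy_id)
    for name in CURRENT_MODELS:
        if base_name(name) == base:
            return name
    return None


print(suggest_current("qwen3-tts-vc-realtime-2025-11-27"))
print(suggest_current("qwen3-tts-flash-2025-09-18"))
```

Returning `None` for unknown names keeps the helper from guessing when no current model shares the base name.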
Learn more
Text-to-speech guide
Learn how to use TTS models via API.
Real-time streaming guide
Use real-time TTS models via WebSocket.
CosyVoice voices
Browse CosyVoice voices and samples.
Qwen-TTS voices
Browse Qwen-TTS voices for non-streaming models.
Qwen-TTS-Realtime voices
Browse Qwen-TTS-Realtime voices for streaming models.
Voice cloning
Clone a voice from audio samples.