Voice design

Create custom voices from text descriptions for use with Qwen TTS models.

After you create a voice, pass the returned voice name to Qwen TTS or Realtime streaming TTS for synthesis.
Note: The target_model you specify when creating a voice must match the model you use for synthesis. If the two differ, synthesis fails.

How it works

  1. Write a voice description (voice_prompt) and preview text (preview_text).
  2. Send a Create voice request with your target_model.
  3. The API returns a voice name and Base64-encoded preview audio. Decode the Base64 string to get the audio file (WAV format).
  4. Listen to the preview. If satisfied, use the voice name for synthesis. Otherwise, create a new voice.
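
The steps above can be sketched in Python as follows. This is a minimal illustration, not the official client: the flat request layout and the helper names (build_create_voice_request, decode_preview_audio, save_preview, wav_duration_seconds) are assumptions made for this example. The endpoint, authentication, and the name of the response field that carries the preview audio are not given in this section, so the network call itself is left out.

```python
import base64
import io
import wave


def build_create_voice_request(voice_prompt, preview_text, target_model):
    """Collect the fields named in steps 1 and 2.

    The flat dictionary layout is an assumption; the real request body
    may nest these fields differently.
    """
    return {
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
        "target_model": target_model,
    }


def decode_preview_audio(preview_b64):
    """Step 3: turn the Base64 string into raw WAV bytes."""
    return base64.b64decode(preview_b64)


def save_preview(preview_b64, path):
    """Write the decoded preview to disk so it can be played back (step 4)."""
    with open(path, "wb") as f:
        f.write(decode_preview_audio(preview_b64))


def wav_duration_seconds(wav_bytes):
    """Sanity check: parse the WAV header and report the clip length."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Once you have extracted the Base64 string from the response, calling save_preview(preview_b64, "preview.wav") produces a file you can listen to before deciding whether to keep the voice.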

Supported models

Voice design uses two models: a design model and a target synthesis model.
| Model | Value | Use with |
| --- | --- | --- |
| Voice design model | qwen-voice-design | All voice design operations (fixed value) |
| Real-time synthesis target | qwen3-tts-vd-realtime-2026-01-15 | Realtime streaming TTS |
| Real-time synthesis target (earlier version) | qwen3-tts-vd-realtime-2025-12-16 | Realtime streaming TTS |
| Non-real-time synthesis target | qwen3-tts-vd-2026-01-26 | Qwen TTS |
The synthesis target models (qwen3-tts-vd-*) support only custom-designed voices. They do not support system voices (Chelsie, Serena, Ethan, Cherry).
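
A client-side pre-flight check can catch a model mismatch before a synthesis request is sent. The sketch below uses the model values from this section; the function name check_synthesis_call is hypothetical, and the system voice set contains only the four names mentioned here, which may not be the complete list.

```python
VOICE_DESIGN_MODEL = "qwen-voice-design"  # fixed value for voice design operations

# Synthesis targets from this section, mapped to the service they are used with.
TARGET_MODELS = {
    "qwen3-tts-vd-realtime-2026-01-15": "Realtime streaming TTS",
    "qwen3-tts-vd-realtime-2025-12-16": "Realtime streaming TTS",
    "qwen3-tts-vd-2026-01-26": "Qwen TTS",
}

# Only the system voices named in this section; treat as possibly incomplete.
SYSTEM_VOICES = {"Chelsie", "Serena", "Ethan", "Cherry"}


def check_synthesis_call(target_model, synthesis_model, voice):
    """Return a list of problems; an empty list means the call looks consistent."""
    problems = []
    if target_model not in TARGET_MODELS:
        problems.append(f"unknown target_model: {target_model}")
    if synthesis_model != target_model:
        problems.append("synthesis model does not match the voice's target_model")
    if voice in SYSTEM_VOICES:
        problems.append(f"system voice {voice} is not supported by qwen3-tts-vd-* models")
    return problems
```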

Supported languages

| Code | Language |
| --- | --- |
| zh | Chinese |
| en | English |
| de | German |
| it | Italian |
| pt | Portuguese |
| es | Spanish |
| ja | Japanese |
| ko | Korean |
| fr | French |
| ru | Russian |
voice_prompt supports Chinese and English only. The language parameter must match the preview_text language.
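
The two language rules can be checked before sending a request. In this sketch the caller states which language each text is written in, because the code does not attempt language detection; the function name check_languages and its parameters are hypothetical.

```python
# Language codes from the table in this section.
PREVIEW_LANGUAGES = {
    "zh": "Chinese", "en": "English", "de": "German", "it": "Italian",
    "pt": "Portuguese", "es": "Spanish", "ja": "Japanese", "ko": "Korean",
    "fr": "French", "ru": "Russian",
}

# voice_prompt may only be written in Chinese or English.
PROMPT_LANGUAGES = {"zh", "en"}


def check_languages(language, preview_text_language, prompt_language):
    """Return a list of rule violations; an empty list means all rules hold."""
    problems = []
    if language not in PREVIEW_LANGUAGES:
        problems.append(f"unsupported language code: {language}")
    if language != preview_text_language:
        problems.append("language parameter does not match the preview_text language")
    if prompt_language not in PROMPT_LANGUAGES:
        problems.append("voice_prompt must be written in Chinese or English")
    return problems
```

For example, a German preview text with an English description passes (check_languages("de", "de", "en") returns an empty list), while a German description does not.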

Write effective voice descriptions

A voice description (voice_prompt) tells the model what voice to generate. Combine gender, age, tone, and use case to define a distinctive voice.

Constraints

  • Max length: 2,048 characters.
  • Languages: Chinese and English only.
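
The length limit is easy to enforce locally. This sketch assumes that "characters" corresponds to Python's len() on a str (Unicode code points); the service may count differently, so treat the check as a conservative guard rather than an exact mirror of server-side validation. The function name is hypothetical.

```python
MAX_VOICE_PROMPT_CHARS = 2048  # limit stated in this section


def check_voice_prompt_length(voice_prompt):
    """Return the prompt length, or raise ValueError if it is empty or too long."""
    length = len(voice_prompt)  # assumption: one code point counts as one character
    if length == 0:
        raise ValueError("voice_prompt must not be empty")
    if length > MAX_VOICE_PROMPT_CHARS:
        raise ValueError(
            f"voice_prompt has {length} characters; the limit is {MAX_VOICE_PROMPT_CHARS}"
        )
    return length
```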

Description dimensions

| Dimension | Examples |
| --- | --- |
| Gender | Male, female, neutral |
| Age | Child (5-12), teenager (13-18), young adult (19-35), middle-aged (36-55), elderly (55+) |
| Pitch | High, medium, low, high-pitched, low-pitched |
| Pace | Fast, medium, slow, fast-paced, slow-paced |
| Emotion | Cheerful, calm, gentle, serious, lively, composed, soothing |
| Characteristics | Magnetic, crisp, hoarse, mellow, sweet, rich, powerful |
| Use case | News broadcast, ad voice-over, audiobook, animation character, voice assistant, documentary narration |
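
One way to make sure a description covers several dimensions is to assemble it from named parts. The helper below is a hypothetical convenience, not part of the API: it builds an English sentence from the dimensions in the table, and the result is then used as the voice_prompt value.

```python
def _join_with_and(items):
    """Join phrases as 'a', 'a and b', or 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def compose_voice_prompt(gender, age, emotion=None, pitch=None, pace=None,
                         characteristics=None, use_case=None):
    """Build a one-sentence voice description from the chosen dimensions."""
    adjectives = [word for word in (emotion, age) if word]
    head = ", ".join(adjectives) + f" {gender} voice"
    article = "An" if head[0].lower() in "aeiou" else "A"

    traits = []
    if pace:
        traits.append(f"a {pace} pace")
    if pitch:
        traits.append(f"a {pitch} pitch")
    if characteristics:
        traits.append(f"a {characteristics} tone")

    sentence = f"{article} {head}"
    if traits:
        sentence += " with " + _join_with_and(traits)
    if use_case:
        sentence += f", suitable for {use_case}"
    return sentence + "."


prompt = compose_voice_prompt(
    gender="male", age="middle-aged", emotion="calm", pace="slow",
    characteristics="deep, magnetic", use_case="news or documentary narration",
)
# prompt == "A calm, middle-aged male voice with a slow pace and a deep,
#            magnetic tone, suitable for news or documentary narration."
```

Keeping each dimension as a separate argument makes it obvious when a description relies on gender alone, which is usually too broad.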

Tips

  1. Be specific. Use concrete qualities like "deep," "crisp," or "fast-paced." Avoid vague terms like "nice" or "normal."
  2. Use multiple dimensions. Combine gender, age, emotion, and use case. "Female voice" alone is too broad.
  3. Be objective. Focus on physical and perceptual features. Write "high-pitched and energetic" instead of "my favorite voice."
  4. Be original. Describe voice qualities directly. Celebrity imitation is not supported and involves copyright risks.
  5. Be concise. Every word should serve a purpose. Avoid stacking synonyms and empty intensifiers such as "very, very."

Examples

Good descriptions:
  • "A young, lively female voice with a fast pace and noticeable upward inflection, suitable for fashion product introductions."
  • "A calm, middle-aged male voice with a slow pace and deep, magnetic tone, suitable for news or documentary narration."
  • "A cute child's voice, around 8 years old, with a slightly childish tone, suitable for animation character voice-overs."
Ineffective descriptions:
| Description | Issue | Improvement |
| --- | --- | --- |
| "A nice voice" | Too vague | "A young female voice with a clear vocal line and gentle tone." |
| "A voice like a certain celebrity" | Celebrity imitation not supported | "A mature, magnetic male voice with a calm pace." |
| "A very, very, very nice female voice" | Redundant repetition | "A female voice, 20-24 years old, with a light tone and sweet quality." |

Error codes

If a call fails, see Error messages. Common voice design errors:
| HTTP status | Error code | Cause | Resolution |
| --- | --- | --- | --- |
| 400 | BadRequest.VoiceNotFound | The specified voice does not exist (in voice design or synthesis operations) | Verify the voice name with List voices or Query a voice. If the voice does not exist, create a new voice with Create voice. |
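
In client code, the error above can be mapped to the documented resolution. The sketch assumes you have already extracted the HTTP status and the error code string from the failed response; how they appear in the response body is not described in this section, and the function name next_action is hypothetical.

```python
VOICE_NOT_FOUND = "BadRequest.VoiceNotFound"


def next_action(http_status, error_code=None):
    """Suggest what to do after a voice design or synthesis call."""
    if http_status == 400 and error_code == VOICE_NOT_FOUND:
        return ("Verify the voice name with List voices or Query a voice; "
                "if it does not exist, create a new voice with Create voice.")
    if http_status >= 400:
        return "See Error messages for this error code."
    return "No action needed."
```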

Next steps