Voice design

Create a voice

Create a custom voice from a text description and return preview audio.

POST /services/audio/tts/customization
cURL
curl --request POST \
  --url 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
  --header 'Authorization: Bearer <YOUR_API_KEY>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "qwen-voice-design",
  "input": {
    "action": "create",
    "target_model": "qwen3-tts-vd-realtime-2026-01-15",
    "voice_prompt": "<string>",
    "preview_text": "<string>",
    "preferred_name": "<string>",
    "language": "zh"
  },
  "parameters": {
    "sample_rate": 8000,
    "response_format": "pcm"
  }
}
'

Example response (200)
{
  "output": {
    "voice": "qwen-tts-vd-announcer-voice-20251201102800-a1b2",
    "preview_audio": {
      "data": "{base64_encoded_audio}",
      "sample_rate": 24000,
      "response_format": "wav"
    },
    "target_model": "qwen3-tts-vd-realtime-2026-01-15"
  },
  "usage": {
    "count": 1
  },
  "request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
model is the voice design model and is always qwen-voice-design. target_model is the synthesis model that drives the created voice. The target_model set here must match the model used in subsequent synthesis calls; if the two differ, those calls fail.
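The relationship between model and target_model can be sketched in Python. This is a minimal illustration rather than an official client: the URL, field names, and the qwen-voice-design value come from the cURL example above, while the helper names (build_create_request, check_target_model, create_voice) are hypothetical. The parameters object is passed only when supplied, on the assumption that it is optional.

```python
import json
import urllib.request

# Endpoint and design model name copied from the cURL example.
URL = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
DESIGN_MODEL = "qwen-voice-design"


def build_create_request(target_model, voice_prompt, preview_text,
                         preferred_name, language="zh", parameters=None):
    """Build the JSON body for a 'create' call (hypothetical helper)."""
    body = {
        "model": DESIGN_MODEL,  # always the design model
        "input": {
            "action": "create",
            "target_model": target_model,  # synthesis model that will drive the voice
            "voice_prompt": voice_prompt,
            "preview_text": preview_text,
            "preferred_name": preferred_name,
            "language": language,
        },
    }
    if parameters is not None:
        # Assumption: "parameters" may be omitted when no overrides are needed.
        body["parameters"] = parameters
    return body


def check_target_model(create_response, synthesis_model):
    """Return the voice name, or raise if the synthesis model does not match."""
    created_for = create_response["output"]["target_model"]
    if created_for != synthesis_model:
        raise ValueError(
            f"voice was created for {created_for!r}, not {synthesis_model!r}"
        )
    return create_response["output"]["voice"]


def create_voice(body, api_key):
    """POST the body with a Bearer token and return the parsed JSON response."""
    request = urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling check_target_model before each synthesis request turns a model mismatch into an early local error instead of a failed API call.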

Authorizations

Authorization
string, header, required

DashScope API key, sent as a Bearer token (Authorization: Bearer <YOUR_API_KEY>). Get one from the API key page.

Body

application/json

model
enum<string>, required

Voice design model. Fixed to qwen-voice-design.

Allowed value: qwen-voice-design

input
object, required

parameters
object

Response

200 - application/json

output
object

usage
object

request_id
string

Request ID for troubleshooting.

Example: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
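As a rough sketch of consuming the response, the snippet below pulls the voice name and decodes the base64 preview audio. It assumes the response has the shape shown in the example response (output.voice, output.preview_audio.data, sample_rate, response_format, and request_id); the function name extract_preview is hypothetical.

```python
import base64


def extract_preview(response):
    """Decode the preview audio and collect the fields needed later.

    Assumes the response layout from the example: the audio is a base64
    string under output.preview_audio.data.
    """
    output = response["output"]
    preview = output["preview_audio"]
    return {
        "voice": output["voice"],                # pass this name to synthesis calls
        "target_model": output["target_model"],  # synthesis model the voice is bound to
        "audio": base64.b64decode(preview["data"]),
        "sample_rate": preview["sample_rate"],
        "format": preview["response_format"],
        "request_id": response.get("request_id"),  # keep for troubleshooting
    }


def save_preview(preview, path):
    """Write the decoded audio bytes to disk."""
    with open(path, "wb") as handle:
        handle.write(preview["audio"])
```

Logging request_id alongside the saved file makes it easier to reference a specific call when troubleshooting.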