Clone a voice from 10-20 seconds of audio. The API returns a voice identifier instantly -- no training required.
For an overview of how voice cloning works, model selection, and end-to-end examples, see the Voice cloning guide.
Prerequisites
API reference
All three operations (create, list, and delete) use the same base URL and headers. The action field in the request body selects the operation.
Base URL
POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
Common request headers
| Header | Type | Required | Description |
| --- | --- | --- | --- |
| Authorization | string | Yes | Bearer $DASHSCOPE_API_KEY |
| Content-Type | string | Yes | application/json |
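Because every operation posts to the same URL with the same two headers, the shared parts can be factored into small helpers. The following is a minimal sketch, not an official client: `build_headers` and `build_request` are hypothetical names introduced here for illustration, and nothing is sent over the network.

```python
BASE_URL = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
ENROLLMENT_MODEL = "qwen-voice-enrollment"  # fixed value for all three operations


def build_headers(api_key):
    """Return the two headers that every request needs."""
    if not api_key:
        raise ValueError("API key is empty; set DASHSCOPE_API_KEY first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def build_request(action, **input_fields):
    """Return the JSON body for a create, list, or delete action (hypothetical helper)."""
    if action not in ("create", "list", "delete"):
        raise ValueError(f"unsupported action: {action}")
    return {"model": ENROLLMENT_MODEL, "input": {"action": action, **input_fields}}


# Build (but do not send) a list request body.
body = build_request("list", page_size=10, page_index=0)
print(body)
```

In a real call, the key would typically come from `os.getenv("DASHSCOPE_API_KEY")`, and the body and headers would be passed to an HTTP client together with `BASE_URL`.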
Create voice
Upload audio to create a cloned voice.
Request body
The model parameter is always qwen-voice-enrollment. The target_model must match the speech synthesis model you use -- otherwise synthesis fails.
{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "create",
    "target_model": "qwen3-tts-vc-realtime-2026-01-15",
    "preferred_name": "guanyu",
    "audio": {
      "data": "https://xxx.wav"
    },
    "text": "Optional. Text matching the audio content.",
    "language": "Optional. Language code, e.g. zh."
  }
}
Request parameters
| Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| model | string | - | Yes | Voice cloning model. Fixed as qwen-voice-enrollment. |
| action | string | - | Yes | Operation type. Fixed as create. |
| target_model | string | - | Yes | Speech synthesis model for the cloned voice. Supported: qwen3-tts-vc-realtime-2026-01-15, qwen3-tts-vc-realtime-2025-11-27, qwen3-tts-vc-2026-01-22. Must match the model in your synthesis calls. |
| preferred_name | string | - | Yes | Voice name (up to 16 characters: digits, letters, underscores). Appears in the generated voice name. Example: guanyu produces qwen-tts-vc-guanyu-voice-20250812105009984-838b. |
| audio.data | string | - | Yes | Audio for cloning. Two formats: Data URL -- data:<mediatype>;base64,<data> (<mediatype> = audio/wav, audio/mpeg, or audio/mp4). Keep encoded data under 10 MB. Audio URL -- publicly accessible URL (no auth required). |
| text | string | - | No | Text matching the audio content. The server validates the match and returns Audio.PreprocessError if the text differs significantly from the audio. |
| language | string | - | No | Audio language. Supported: zh, en, de, it, pt, es, ja, ko, fr, ru. Must match the audio if specified. |
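The constraints on target_model, preferred_name, and language can be checked locally before sending a request. The sketch below mirrors the parameter descriptions; `check_create_input` is a hypothetical helper, and the pattern assumes that "letters" means ASCII letters, which the service may interpret more broadly. The server remains the authority on what is accepted.

```python
import re

SUPPORTED_TARGET_MODELS = {
    "qwen3-tts-vc-realtime-2026-01-15",
    "qwen3-tts-vc-realtime-2025-11-27",
    "qwen3-tts-vc-2026-01-22",
}
SUPPORTED_LANGUAGES = {"zh", "en", "de", "it", "pt", "es", "ja", "ko", "fr", "ru"}
# Assumption: 1 to 16 ASCII letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,16}")


def check_create_input(target_model, preferred_name, language=None):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if target_model not in SUPPORTED_TARGET_MODELS:
        problems.append(f"unsupported target_model: {target_model}")
    if not NAME_PATTERN.fullmatch(preferred_name):
        problems.append("preferred_name must be 1-16 letters, digits, or underscores")
    if language is not None and language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems


print(check_create_input("qwen3-tts-vc-realtime-2026-01-15", "guanyu", "zh"))
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.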
Base64 encoding examples
Python:
import base64
import pathlib

# Replace input.mp3 with your audio file path
file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:audio/mpeg;base64,{base64_str}"
Java:
import java.nio.file.*;
import java.util.Base64;

public class Main {
    // Read the file, Base64-encode it, and prepend the Data URL prefix
    public static String toDataUrl(String filePath) throws Exception {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        String encoded = Base64.getEncoder().encodeToString(bytes);
        return "data:audio/mpeg;base64," + encoded;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toDataUrl("input.mp3"));
    }
}
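The encoding examples hard-code audio/mpeg and do not check the size limit. A slightly more defensive variant might pick the media type from the file extension and reject oversized payloads. This is a sketch under stated assumptions: the extension-to-media-type mapping and the reading of "10 MB" as 10 * 1024 * 1024 characters of encoded data are guesses, and `to_data_url` is a hypothetical helper name.

```python
import base64
import pathlib

# Assumed mapping from file extension to the three media types the API lists.
MEDIA_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".m4a": "audio/mp4",
    ".mp4": "audio/mp4",
}
MAX_ENCODED_CHARS = 10 * 1024 * 1024  # assumption: "10 MB" measured on the encoded string


def to_data_url(path):
    """Encode an audio file as a Data URL, or raise ValueError if it cannot be used."""
    path = pathlib.Path(path)
    media_type = MEDIA_TYPES.get(path.suffix.lower())
    if media_type is None:
        raise ValueError(f"unsupported audio extension: {path.suffix}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    if len(encoded) >= MAX_ENCODED_CHARS:
        raise ValueError("encoded audio must stay under 10 MB")
    return f"data:{media_type};base64,{encoded}"
```

Base64 inflates data by roughly one third, so a raw file of about 7.5 MB already approaches the encoded limit.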
Response
{
  "output": {
    "voice": "yourVoice",
    "target_model": "qwen3-tts-vc-realtime-2026-01-15"
  },
  "usage": {
    "count": 1
  },
  "request_id": "yourRequestId"
}
| Parameter | Type | Description |
| --- | --- | --- |
| voice | string | Generated voice name. Pass this as the voice parameter in synthesis calls. |
| target_model | string | Speech synthesis model bound to this voice. |
| request_id | string | Unique request identifier. |
| count | integer | Billed voice creation operations. Always 1 for create requests. Cost: count x $0.01. |
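To make the billing rule and the model-binding rule concrete, the sketch below extracts the fields of a create response, computes the cost as count x $0.01, and checks that the voice is bound to the model you intend to synthesize with. `summarize_create_response` and `check_synthesis_model` are hypothetical helper names, and the input is the example response body rather than a live API result.

```python
from decimal import Decimal

PRICE_PER_CREATE_USD = Decimal("0.01")  # per billed creation, as stated for count


def summarize_create_response(body):
    """Pull out the documented response fields and compute the cost in USD."""
    return {
        "voice": body["output"]["voice"],
        "target_model": body["output"]["target_model"],
        "request_id": body["request_id"],
        "cost_usd": PRICE_PER_CREATE_USD * body["usage"]["count"],
    }


def check_synthesis_model(summary, synthesis_model):
    """Raise ValueError if the voice is bound to a different synthesis model."""
    if summary["target_model"] != synthesis_model:
        raise ValueError(
            f"voice {summary['voice']} is bound to {summary['target_model']}, "
            f"not {synthesis_model}"
        )


example = {
    "output": {"voice": "yourVoice", "target_model": "qwen3-tts-vc-realtime-2026-01-15"},
    "usage": {"count": 1},
    "request_id": "yourRequestId",
}
summary = summarize_create_response(example)
check_synthesis_model(summary, "qwen3-tts-vc-realtime-2026-01-15")
print(summary["voice"], summary["cost_usd"])
```

`Decimal` is used so that summing many small charges does not accumulate floating-point error.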
Sample code
cURL:
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-voice-enrollment",
    "input": {
      "action": "create",
      "target_model": "qwen3-tts-vc-realtime-2026-01-15",
      "preferred_name": "guanyu",
      "audio": {
        "data": "https://xxx.wav"
      }
    }
  }'
Python:
import base64
import os
import pathlib

import requests

target_model = "qwen3-tts-vc-realtime-2026-01-15"
preferred_name = "guanyu"
audio_mime_type = "audio/mpeg"

# Encode the local audio file as a Data URL
file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:{audio_mime_type};base64,{base64_str}"

api_key = os.getenv("DASHSCOPE_API_KEY")
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
payload = {
    "model": "qwen-voice-enrollment",  # Do not change this value
    "input": {
        "action": "create",
        "target_model": target_model,
        "preferred_name": preferred_name,
        "audio": {
            "data": data_uri
        }
    }
}
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

# Send POST request
resp = requests.post(url, json=payload, headers=headers)
if resp.status_code == 200:
    data = resp.json()
    voice = data["output"]["voice"]
    print(f"Generated voice parameter: {voice}")
else:
    print("Request failed:", resp.status_code, resp.text)
Java:
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.*;
import java.util.Base64;

public class Main {
    private static final String TARGET_MODEL = "qwen3-tts-vc-realtime-2026-01-15";
    private static final String PREFERRED_NAME = "guanyu";
    private static final String AUDIO_FILE = "input.mp3";
    private static final String AUDIO_MIME_TYPE = "audio/mpeg";

    // Encode a local audio file as a Data URL
    public static String toDataUrl(String filePath) throws Exception {
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        String encoded = Base64.getEncoder().encodeToString(bytes);
        return "data:" + AUDIO_MIME_TYPE + ";base64," + encoded;
    }

    public static void main(String[] args) {
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
        try {
            String jsonPayload =
                "{"
                + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
                + "\"input\": {"
                + "\"action\": \"create\","
                + "\"target_model\": \"" + TARGET_MODEL + "\","
                + "\"preferred_name\": \"" + PREFERRED_NAME + "\","
                + "\"audio\": {"
                + "\"data\": \"" + toDataUrl(AUDIO_FILE) + "\""
                + "}"
                + "}"
                + "}";

            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);
            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            // Read the success body or the error body, depending on the status
            int status = con.getResponseCode();
            InputStream is = (status >= 200 && status < 300)
                ? con.getInputStream()
                : con.getErrorStream();
            StringBuilder response = new StringBuilder();
            try (BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"))) {
                String line;
                while ((line = br.readLine()) != null) {
                    response.append(line);
                }
            }
            System.out.println("HTTP status code: " + status);
            System.out.println("Response content: " + response.toString());

            if (status == 200) {
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                String voice = jsonObj.getAsJsonObject("output").get("voice").getAsString();
                System.out.println("Generated voice parameter: " + voice);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
List voices
List your cloned voices with pagination.
Request body
The model parameter is always qwen-voice-enrollment. Do not modify this value.
{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "list",
    "page_size": 2,
    "page_index": 0
  }
}
Request parameters
| Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| model | string | - | Yes | Voice cloning model. Fixed as qwen-voice-enrollment. |
| action | string | - | Yes | Operation type. Fixed as list. |
| page_index | integer | 0 | No | Page number, starting from 0. Range: 0 -- 1000000. |
| page_size | integer | 10 | No | Results per page. Range: 0 -- 1000000. |
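Retrieving every voice means walking page_index upward from 0. The sketch below shows one way to do that; it assumes that a page with fewer than page_size entries marks the end of the list, since the response does not document a total count. `iter_voices` is a hypothetical helper, and `fetch_page` stands in for a function that would send the list request and return the voice_list array. Here it is replaced by an in-memory fake so the example runs offline.

```python
def iter_voices(fetch_page, page_size=10, max_pages=1000):
    """Yield voices page by page until a short (or empty) page is returned."""
    if page_size < 1:
        # The API allows page_size 0, but this loop could never advance with it.
        raise ValueError("page_size must be at least 1 for iteration")
    for page_index in range(max_pages):
        voices = fetch_page(page_index, page_size)
        yield from voices
        if len(voices) < page_size:
            break  # assumption: a short page means there is nothing further


# Offline stand-in for the real request: 25 fake voices served in pages.
FAKE_VOICES = [{"voice": f"voice{i}"} for i in range(25)]


def fake_fetch_page(page_index, page_size):
    start = page_index * page_size
    return FAKE_VOICES[start:start + page_size]


all_voices = list(iter_voices(fake_fetch_page, page_size=10))
print(len(all_voices))
```

The `max_pages` cap is a safety net so that an unexpected server response cannot cause an unbounded loop.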
Response
{
  "output": {
    "voice_list": [
      {
        "voice": "yourVoice1",
        "gmt_create": "2025-08-11 17:59:32",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15"
      },
      {
        "voice": "yourVoice2",
        "gmt_create": "2025-08-11 17:38:10",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15"
      }
    ]
  },
  "usage": {
    "count": 0
  },
  "request_id": "yourRequestId"
}
| Parameter | Type | Description |
| --- | --- | --- |
| voice | string | Voice name. Pass this as the voice parameter in synthesis calls. |
| gmt_create | string | Voice creation timestamp. |
| target_model | string | Speech synthesis model bound to this voice. |
| request_id | string | Unique request identifier. |
| count | integer | Always 0. Listing voices is free. |
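The gmt_create value arrives as a string, so sorting or filtering by creation time requires parsing it first. The sketch below assumes the format seen in the example response (YYYY-MM-DD HH:MM:SS) and leaves the timestamps timezone-naive, because the timezone is not stated here. `parse_gmt_create` and `newest_first` are hypothetical helper names.

```python
from datetime import datetime

GMT_CREATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # inferred from the example response


def parse_gmt_create(value):
    """Parse a gmt_create string into a naive datetime."""
    return datetime.strptime(value, GMT_CREATE_FORMAT)


def newest_first(voice_list):
    """Return the voices sorted by creation time, most recent first."""
    return sorted(voice_list, key=lambda v: parse_gmt_create(v["gmt_create"]), reverse=True)


voice_list = [
    {"voice": "yourVoice2", "gmt_create": "2025-08-11 17:38:10",
     "target_model": "qwen3-tts-vc-realtime-2026-01-15"},
    {"voice": "yourVoice1", "gmt_create": "2025-08-11 17:59:32",
     "target_model": "qwen3-tts-vc-realtime-2026-01-15"},
]
for item in newest_first(voice_list):
    print(item["voice"], item["gmt_create"])
```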
Sample code
cURL:
curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen-voice-enrollment",
    "input": {
      "action": "list",
      "page_size": 10,
      "page_index": 0
    }
  }'
Python:
import os

import requests

api_key = os.getenv("DASHSCOPE_API_KEY")
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
payload = {
    "model": "qwen-voice-enrollment",  # Do not change this value
    "input": {
        "action": "list",
        "page_size": 10,
        "page_index": 0
    }
}
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print("HTTP status code:", response.status_code)
if response.status_code == 200:
    data = response.json()
    voice_list = data["output"]["voice_list"]
    print("List of voices found:")
    for item in voice_list:
        print(f"- Voice: {item['voice']} Creation time: {item['gmt_create']} Model: {item['target_model']}")
else:
    print("Request failed:", response.text)
Java:
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
    public static void main(String[] args) {
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
        String jsonPayload =
            "{"
            + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
            + "\"input\": {"
            + "\"action\": \"list\","
            + "\"page_size\": 10,"
            + "\"page_index\": 0"
            + "}"
            + "}";
        try {
            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);
            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            // Read the success body or the error body, depending on the status
            int status = con.getResponseCode();
            BufferedReader br = new BufferedReader(new InputStreamReader(
                status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
            br.close();
            System.out.println("HTTP status code: " + status);
            System.out.println("Response JSON: " + response.toString());

            if (status == 200) {
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                JsonArray voiceList = jsonObj.getAsJsonObject("output").getAsJsonArray("voice_list");
                System.out.println("\nList of voices found:");
                for (int i = 0; i < voiceList.size(); i++) {
                    JsonObject voiceItem = voiceList.get(i).getAsJsonObject();
                    String voice = voiceItem.get("voice").getAsString();
                    String gmtCreate = voiceItem.get("gmt_create").getAsString();
                    String targetModel = voiceItem.get("target_model").getAsString();
                    System.out.printf("- Voice: %s Creation time: %s Model: %s\n",
                        voice, gmtCreate, targetModel);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Delete a voice
Delete a voice to free up quota.
Request body
The model parameter is always qwen-voice-enrollment. Do not modify this value.
{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "delete",
    "voice": "yourVoice"
  }
}
Request parameters
| Parameter | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| model | string | - | Yes | Voice cloning model. Fixed as qwen-voice-enrollment. |
| action | string | - | Yes | Operation type. Fixed as delete. |
| voice | string | - | Yes | The voice to delete. |
Response
{
  "usage": {
    "count": 0
  },
  "request_id": "yourRequestId"
}
| Parameter | Type | Description |
| --- | --- | --- |
| request_id | string | Unique request identifier. |
| count | integer | Always 0. Deleting voices is free. |
Sample code
cURL:
curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "qwen-voice-enrollment",
    "input": {
      "action": "delete",
      "voice": "yourVoice"
    }
  }'
Python:
import os

import requests

api_key = os.getenv("DASHSCOPE_API_KEY")
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
voice_to_delete = "yourVoice"  # Voice to delete (replace with actual value)
payload = {
    "model": "qwen-voice-enrollment",  # Do not change this value
    "input": {
        "action": "delete",
        "voice": voice_to_delete
    }
}
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print("HTTP status code:", response.status_code)
if response.status_code == 200:
    data = response.json()
    request_id = data["request_id"]
    print("Deletion successful")
    print(f"Request ID: {request_id}")
else:
    print("Request failed:", response.text)
Java:
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
    public static void main(String[] args) {
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
        String voiceToDelete = "yourVoice"; // Voice to delete (replace with actual value)
        String jsonPayload =
            "{"
            + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
            + "\"input\": {"
            + "\"action\": \"delete\","
            + "\"voice\": \"" + voiceToDelete + "\""
            + "}"
            + "}";
        try {
            HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Authorization", "Bearer " + apiKey);
            con.setRequestProperty("Content-Type", "application/json");
            con.setDoOutput(true);
            try (OutputStream os = con.getOutputStream()) {
                os.write(jsonPayload.getBytes("UTF-8"));
            }

            // Read the success body or the error body, depending on the status
            int status = con.getResponseCode();
            BufferedReader br = new BufferedReader(new InputStreamReader(
                status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
            br.close();
            System.out.println("HTTP status code: " + status);
            System.out.println("Response JSON: " + response.toString());

            if (status == 200) {
                Gson gson = new Gson();
                JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
                String requestId = jsonObj.get("request_id").getAsString();
                System.out.println("Deletion successful");
                System.out.println("Request ID: " + requestId);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Speech synthesis
To use a cloned voice for synthesis, see the end-to-end examples in the Voice cloning guide or the speech synthesis documentation.
Voice quota and retention
Account limit: 1,000 voices per account. Call List voices to check your count.
Automatic cleanup: Voices unused for over one year are automatically deleted.
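Once the voices have been listed, checking headroom against the 1,000-voice limit and choosing candidates for deletion is a small calculation. The sketch below is illustrative only: `remaining_quota` and `oldest_voices` are hypothetical helpers, the timestamp format is inferred from the example list response, and creation time is used as the ordering because the list response does not expose when a voice was last used. Which voices are actually safe to delete is a decision for the caller.

```python
from datetime import datetime

ACCOUNT_LIMIT = 1000  # voices per account


def remaining_quota(voice_count):
    """Return how many more voices can be created, never below zero."""
    return max(ACCOUNT_LIMIT - voice_count, 0)


def oldest_voices(voice_list, n):
    """Return the names of the n earliest-created voices as deletion candidates."""
    ordered = sorted(
        voice_list,
        key=lambda v: datetime.strptime(v["gmt_create"], "%Y-%m-%d %H:%M:%S"),
    )
    return [v["voice"] for v in ordered[:n]]


voices = [
    {"voice": "yourVoice1", "gmt_create": "2025-08-11 17:59:32"},
    {"voice": "yourVoice2", "gmt_create": "2025-08-11 17:38:10"},
]
print(remaining_quota(len(voices)))
print(oldest_voices(voices, 1))
```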
Copyright and legality
You are responsible for the ownership and legal rights to any voice you provide. Read the Terms of Service before using this API.