Voice cloning | Qwen Cloud

Clone a voice from 10-20 seconds of audio. The API returns a voice identifier immediately -- no training is required. For an overview of how voice cloning works, model selection, and end-to-end examples, see the Voice cloning guide.

Prerequisites

An API key configured as the DASHSCOPE_API_KEY environment variable
The latest DashScope SDK
An audio file that meets the audio requirements

API reference

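All three operations (create, list, delete) are sent as POST requests to the same URL with the same headers. The action field in the request body selects the operation.

Base URL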

POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization

Common request headers

Header	Type	Required	Description
Authorization	string	Yes	`Bearer $DASHSCOPE_API_KEY`
Content-Type	string	Yes	`application/json`
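
Because all three operations share one URL, one pair of headers, and the same outer request shape, the common parts can be built once. The following is a minimal sketch, not an official client: the helper names `build_headers` and `build_payload` are hypothetical, and only the URL, header names, and the `model` / `input.action` fields come from this reference.

```python
# URL and header names are taken from the reference above.
CUSTOMIZATION_URL = (
    "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"
)


def build_headers(api_key: str) -> dict:
    """Return the two headers that every voice cloning request needs."""
    if not api_key:
        raise ValueError("API key is empty; set DASHSCOPE_API_KEY first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def build_payload(action: str, **input_fields) -> dict:
    """Wrap action-specific fields in the shared request envelope."""
    return {
        "model": "qwen-voice-enrollment",  # fixed for create, list, and delete
        "input": {"action": action, **input_fields},
    }


# Example: the body of a list request (nothing is sent here).
print(build_payload("list", page_size=10, page_index=0))
```

The resulting dictionaries can be passed to `requests.post(CUSTOMIZATION_URL, json=payload, headers=headers)`, as the sample code in the sections below does.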

Create voice

Upload audio to create a cloned voice.

Request body

The model parameter is always qwen-voice-enrollment. The target_model must match the speech synthesis model you use -- otherwise synthesis fails.

{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "create",
    "target_model": "qwen3-tts-vc-realtime-2026-01-15",
    "preferred_name": "guanyu",
    "audio": {
      "data": "https://xxx.wav"
    },
    "text": "Optional. Text matching the audio content.",
    "language": "Optional. Language code, e.g. zh."
  }
}

Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice cloning model. Fixed as `qwen-voice-enrollment`.
action	string	-	Yes	Operation type. Fixed as `create`.
target_model	string	-	Yes	Speech synthesis model for the cloned voice. Supported: `qwen3-tts-vc-realtime-2026-01-15`, `qwen3-tts-vc-realtime-2025-11-27`, `qwen3-tts-vc-2026-01-22`. Must match the model in your synthesis calls.
preferred_name	string	-	Yes	Voice name (up to 16 characters: digits, letters, underscores). Appears in the generated voice name. Example: `guanyu` produces `qwen-tts-vc-guanyu-voice-20250812105009984-838b`.
audio.data	string	-	Yes	Audio for cloning. Two formats are accepted. Data URL: `data:<mediatype>;base64,<data>`, where `<mediatype>` is `audio/wav`, `audio/mpeg`, or `audio/mp4`; keep the encoded data under 10 MB. Audio URL: a publicly accessible URL that requires no authentication.
text	string	-	No	Text matching the audio content. The server validates the match and returns `Audio.PreprocessError` if the text differs significantly from the audio.
language	string	-	No	Audio language. Supported: `zh`, `en`, `de`, `it`, `pt`, `es`, `ja`, `ko`, `fr`, `ru`. Must match the audio if specified.
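
Several of the constraints in the table can be checked locally before a billed create request is sent. The sketch below is illustrative only: `validate_create_input` is a hypothetical helper, and it assumes that "letters" in the `preferred_name` rule means ASCII letters, which the reference does not state.

```python
import re

# Values copied from the request parameters table above.
SUPPORTED_TARGET_MODELS = {
    "qwen3-tts-vc-realtime-2026-01-15",
    "qwen3-tts-vc-realtime-2025-11-27",
    "qwen3-tts-vc-2026-01-22",
}
SUPPORTED_LANGUAGES = {"zh", "en", "de", "it", "pt", "es", "ja", "ko", "fr", "ru"}

# Assumption: "digits, letters, underscores" means ASCII only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,16}")


def validate_create_input(target_model, preferred_name, language=None):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if target_model not in SUPPORTED_TARGET_MODELS:
        errors.append(f"unsupported target_model: {target_model}")
    if not NAME_PATTERN.fullmatch(preferred_name):
        errors.append("preferred_name must be 1-16 digits, letters, or underscores")
    if language is not None and language not in SUPPORTED_LANGUAGES:
        errors.append(f"unsupported language: {language}")
    return errors


print(validate_create_input("qwen3-tts-vc-2026-01-22", "guanyu", "zh"))
```

Local checks like these cannot replace server-side validation. For example, whether `text` matches the audio can only be verified by the service.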

Base64 encoding examples

Python:

import base64, pathlib

# Replace input.mp3 with your audio file path
file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:audio/mpeg;base64,{base64_str}"

Java:

import java.nio.file.*;
import java.util.Base64;

public class Main {
  public static String toDataUrl(String filePath) throws Exception {
    byte[] bytes = Files.readAllBytes(Paths.get(filePath));
    String encoded = Base64.getEncoder().encodeToString(bytes);
    return "data:audio/mpeg;base64," + encoded;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(toDataUrl("input.mp3"));
  }
}
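
The encoding examples above hard-code `audio/mpeg` and do not check the size of the result. A slightly more defensive variant might pick the media type from the file extension and reject oversized input. This is a sketch under stated assumptions: the extension-to-media-type mapping and the reading of "10 MB" as 10,000,000 characters of encoded data are not specified by the reference, and both helper names are hypothetical.

```python
import base64
import pathlib

ALLOWED_MEDIA_TYPES = {"audio/wav", "audio/mpeg", "audio/mp4"}  # from the table

# Assumption: common extensions for the three media types.
MEDIA_TYPE_BY_SUFFIX = {".wav": "audio/wav", ".mp3": "audio/mpeg", ".m4a": "audio/mp4"}

# Assumption: "under 10 MB" is read conservatively as 10,000,000 characters.
MAX_ENCODED_LENGTH = 10_000_000


def bytes_to_data_url(data: bytes, media_type: str) -> str:
    """Encode raw audio bytes as a data URL for the audio.data field."""
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")
    encoded = base64.b64encode(data).decode("ascii")
    if len(encoded) >= MAX_ENCODED_LENGTH:
        raise ValueError("encoded audio is too large; use a shorter clip or a URL")
    return f"data:{media_type};base64,{encoded}"


def file_to_data_url(path: str) -> str:
    """Read a file and infer its media type from the extension."""
    file_path = pathlib.Path(path)
    suffix = file_path.suffix.lower()
    if suffix not in MEDIA_TYPE_BY_SUFFIX:
        raise ValueError(f"cannot infer media type from extension: {suffix!r}")
    return bytes_to_data_url(file_path.read_bytes(), MEDIA_TYPE_BY_SUFFIX[suffix])
```

Base64 expands data by roughly one third, so a file of about 7.5 MB already approaches the limit once encoded.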

Response

Response example

{
  "output": {
    "voice": "yourVoice",
    "target_model": "qwen3-tts-vc-realtime-2026-01-15"
  },
  "usage": {
    "count": 1
  },
  "request_id": "yourRequestId"
}

Parameter	Type	Description
voice	string	Generated voice name. Pass this as the `voice` parameter in synthesis calls.
target_model	string	Speech synthesis model bound to this voice.
request_id	string	Unique request identifier.
count	integer	Billed voice creation operations. Always `1` for create requests. Cost: count x $0.01.
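
As an illustration of how the response fields fit together, the sketch below extracts the values a caller typically needs and applies the cost formula from the table (count x $0.01). The function name `summarize_create_response` is hypothetical, and `Decimal` is used only to avoid floating-point rounding in the cost.

```python
from decimal import Decimal

PRICE_PER_CREATION_USD = Decimal("0.01")  # from the response table above


def summarize_create_response(body: dict) -> dict:
    """Pull the useful fields out of a parsed create-voice response."""
    output = body["output"]
    count = body["usage"]["count"]
    return {
        "voice": output["voice"],  # pass this as `voice` in synthesis calls
        "target_model": output["target_model"],
        "request_id": body["request_id"],
        "cost_usd": PRICE_PER_CREATION_USD * count,
    }


# The response example from this section, as a Python dict.
example = {
    "output": {
        "voice": "yourVoice",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15",
    },
    "usage": {"count": 1},
    "request_id": "yourRequestId",
}
print(summarize_create_response(example))
```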

Sample code

cURL
Python
Java

curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "create",
    "target_model": "qwen3-tts-vc-realtime-2026-01-15",
    "preferred_name": "guanyu",
    "audio": {
      "data": "https://xxx.wav"
    }
  }
}'

import os
import requests
import base64, pathlib

target_model = "qwen3-tts-vc-realtime-2026-01-15"
preferred_name = "guanyu"
audio_mime_type = "audio/mpeg"

file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:{audio_mime_type};base64,{base64_str}"

api_key = os.getenv("DASHSCOPE_API_KEY")
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

payload = {
  "model": "qwen-voice-enrollment", # Do not change this value
  "input": {
    "action": "create",
    "target_model": target_model,
    "preferred_name": preferred_name,
    "audio": {
      "data": data_uri
    }
  }
}

headers = {
  "Authorization": f"Bearer {api_key}",
  "Content-Type": "application/json"
}

# Send POST request
resp = requests.post(url, json=payload, headers=headers)

if resp.status_code == 200:
  data = resp.json()
  voice = data["output"]["voice"]
  print(f"Generated voice parameter: {voice}")
else:
  print("Request failed:", resp.status_code, resp.text)

import com.google.gson.Gson;
import com.google.gson.JsonObject;

import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.*;
import java.util.Base64;

public class Main {
  private static final String TARGET_MODEL = "qwen3-tts-vc-realtime-2026-01-15";
  private static final String PREFERRED_NAME = "guanyu";
  private static final String AUDIO_FILE = "input.mp3";
  private static final String AUDIO_MIME_TYPE = "audio/mpeg";

  public static String toDataUrl(String filePath) throws Exception {
    byte[] bytes = Files.readAllBytes(Paths.get(filePath));
    String encoded = Base64.getEncoder().encodeToString(bytes);
    return "data:" + AUDIO_MIME_TYPE + ";base64," + encoded;
  }

  public static void main(String[] args) {
    String apiKey = System.getenv("DASHSCOPE_API_KEY");
    String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";

    try {
      String jsonPayload =
          "{"
              + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
              + "\"input\": {"
              +     "\"action\": \"create\","
              +     "\"target_model\": \"" + TARGET_MODEL + "\","
              +     "\"preferred_name\": \"" + PREFERRED_NAME + "\","
              +     "\"audio\": {"
              +         "\"data\": \"" + toDataUrl(AUDIO_FILE) + "\""
              +     "}"
              + "}"
              + "}";

      HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
      con.setRequestMethod("POST");
      con.setRequestProperty("Authorization", "Bearer " + apiKey);
      con.setRequestProperty("Content-Type", "application/json");
      con.setDoOutput(true);

      try (OutputStream os = con.getOutputStream()) {
        os.write(jsonPayload.getBytes("UTF-8"));
      }

      int status = con.getResponseCode();
      InputStream is = (status >= 200 && status < 300)
          ? con.getInputStream()
          : con.getErrorStream();

      StringBuilder response = new StringBuilder();
      try (BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"))) {
        String line;
        while ((line = br.readLine()) != null) {
          response.append(line);
        }
      }

      System.out.println("HTTP status code: " + status);
      System.out.println("Response content: " + response.toString());

      if (status == 200) {
        Gson gson = new Gson();
        JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
        String voice = jsonObj.getAsJsonObject("output").get("voice").getAsString();
        System.out.println("Generated voice parameter: " + voice);
      }

    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

List voices

List your cloned voices with pagination.

Request body

The model parameter is always qwen-voice-enrollment. Do not modify this value.

{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "list",
    "page_size": 2,
    "page_index": 0
  }
}

Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice cloning model. Fixed as `qwen-voice-enrollment`.
action	string	-	Yes	Operation type. Fixed as `list`.
page_index	integer	0	No	Page number, starting from 0. Range: 0 -- 1000000.
page_size	integer	10	No	Results per page. Range: 0 -- 1000000.
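
To walk through every voice, `page_index` can be incremented until the results run out. The sketch below separates the paging logic from the HTTP call so that it can be exercised without network access. It assumes that a page shorter than `page_size` marks the end of the list, since this reference does not document a total-count field. `fetch_page` is a hypothetical callable that would send a list request and return `output.voice_list`.

```python
from typing import Callable, Iterator, List


def iter_voices(
    fetch_page: Callable[[int, int], List[dict]], page_size: int = 10
) -> Iterator[dict]:
    """Yield every voice by requesting pages 0, 1, 2, ... in order."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1 to make progress")
    page_index = 0
    while True:
        voices = fetch_page(page_index, page_size)
        yield from voices
        # Assumption: a short (or empty) page means there is nothing left.
        if len(voices) < page_size:
            break
        page_index += 1


# Stand-in for the real API call, for demonstration only.
_FAKE_VOICES = [{"voice": f"voice{i}"} for i in range(5)]


def fake_fetch_page(page_index: int, page_size: int) -> List[dict]:
    start = page_index * page_size
    return _FAKE_VOICES[start:start + page_size]


print([v["voice"] for v in iter_voices(fake_fetch_page, page_size=2)])
```

In real use, `fetch_page` would wrap the `requests.post` call shown in the sample code for this section.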

Response

Response example

{
  "output": {
    "voice_list": [
      {
        "voice": "yourVoice1",
        "gmt_create": "2025-08-11 17:59:32",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15"
      },
      {
        "voice": "yourVoice2",
        "gmt_create": "2025-08-11 17:38:10",
        "target_model": "qwen3-tts-vc-realtime-2026-01-15"
      }
    ]
  },
  "usage": {
    "count": 0
  },
  "request_id": "yourRequestId"
}

Parameter	Type	Description
voice	string	Voice name. Pass this as the `voice` parameter in synthesis calls.
gmt_create	string	Voice creation timestamp.
target_model	string	Speech synthesis model bound to this voice.
request_id	string	Unique request identifier.
count	integer	Always `0`. Listing voices is free.
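
Since a voice only works with the synthesis model it is bound to, a common use of the list response is to find the voices that are valid for a given model. The sketch below filters on `target_model` and sorts by `gmt_create`, newest first. The timestamp format is inferred from the response example, and the time zone is not stated in this reference, so the parsed values are left as naive datetimes. Both function names are hypothetical.

```python
from datetime import datetime


def parse_gmt_create(value: str) -> datetime:
    """Parse the 'YYYY-MM-DD HH:MM:SS' form seen in the response example."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


def voices_for_model(voice_list: list, synthesis_model: str) -> list:
    """Return the voices bound to synthesis_model, newest first."""
    matching = [v for v in voice_list if v["target_model"] == synthesis_model]
    return sorted(
        matching, key=lambda v: parse_gmt_create(v["gmt_create"]), reverse=True
    )


# Entries shaped like output.voice_list; the third one is made up to show filtering.
sample_voice_list = [
    {"voice": "yourVoice2", "gmt_create": "2025-08-11 17:38:10",
     "target_model": "qwen3-tts-vc-realtime-2026-01-15"},
    {"voice": "yourVoice1", "gmt_create": "2025-08-11 17:59:32",
     "target_model": "qwen3-tts-vc-realtime-2026-01-15"},
    {"voice": "otherVoice", "gmt_create": "2025-08-12 09:00:00",
     "target_model": "qwen3-tts-vc-2026-01-22"},
]

for item in voices_for_model(sample_voice_list, "qwen3-tts-vc-realtime-2026-01-15"):
    print(item["voice"], item["gmt_create"])
```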

Sample code

cURL
Python
Java

curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "list",
    "page_size": 10,
    "page_index": 0
  }
}'

import os
import requests

api_key = os.getenv("DASHSCOPE_API_KEY")
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

payload = {
  "model": "qwen-voice-enrollment", # Do not change this value
  "input": {
    "action": "list",
    "page_size": 10,
    "page_index": 0
  }
}

headers = {
  "Authorization": f"Bearer {api_key}",
  "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print("HTTP status code:", response.status_code)

if response.status_code == 200:
  data = response.json()
  voice_list = data["output"]["voice_list"]

  print("List of voices found:")
  for item in voice_list:
    print(f"- Voice: {item['voice']}  Creation time: {item['gmt_create']}  Model: {item['target_model']}")
else:
  print("Request failed:", response.text)

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
  public static void main(String[] args) {
    String apiKey = System.getenv("DASHSCOPE_API_KEY");
    String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";

    String jsonPayload =
        "{"
            + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
            + "\"input\": {"
            +     "\"action\": \"list\","
            +     "\"page_size\": 10,"
            +     "\"page_index\": 0"
            + "}"
            + "}";

    try {
      HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
      con.setRequestMethod("POST");
      con.setRequestProperty("Authorization", "Bearer " + apiKey);
      con.setRequestProperty("Content-Type", "application/json");
      con.setDoOutput(true);

      try (OutputStream os = con.getOutputStream()) {
        os.write(jsonPayload.getBytes("UTF-8"));
      }

      int status = con.getResponseCode();
      BufferedReader br = new BufferedReader(new InputStreamReader(
          status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));

      StringBuilder response = new StringBuilder();
      String line;
      while ((line = br.readLine()) != null) {
        response.append(line);
      }
      br.close();

      System.out.println("HTTP status code: " + status);
      System.out.println("Response JSON: " + response.toString());

      if (status == 200) {
        Gson gson = new Gson();
        JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
        JsonArray voiceList = jsonObj.getAsJsonObject("output").getAsJsonArray("voice_list");

        System.out.println("\n List of voices found:");
        for (int i = 0; i < voiceList.size(); i++) {
          JsonObject voiceItem = voiceList.get(i).getAsJsonObject();
          String voice = voiceItem.get("voice").getAsString();
          String gmtCreate = voiceItem.get("gmt_create").getAsString();
          String targetModel = voiceItem.get("target_model").getAsString();

          System.out.printf("- Voice: %s  Creation time: %s  Model: %s\n",
              voice, gmtCreate, targetModel);
        }
      }

    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Delete a voice

Delete a voice to free up quota.

Request body

The model parameter is always qwen-voice-enrollment. Do not modify this value.

{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "delete",
    "voice": "yourVoice"
  }
}

Request parameters

Parameter	Type	Default	Required	Description
model	string	-	Yes	Voice cloning model. Fixed as `qwen-voice-enrollment`.
action	string	-	Yes	Operation type. Fixed as `delete`.
voice	string	-	Yes	The voice to delete.

Response

Response example

{
  "usage": {
    "count": 0
  },
  "request_id": "yourRequestId"
}

Parameter	Type	Description
request_id	string	Unique request identifier.
count	integer	Always `0`. Deleting voices is free.

Sample code

cURL
Python
Java

curl --location --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
  "model": "qwen-voice-enrollment",
  "input": {
    "action": "delete",
    "voice": "yourVoice"
  }
}'

import os
import requests

api_key = os.getenv("DASHSCOPE_API_KEY")
url = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization"

voice_to_delete = "yourVoice"  # Voice to delete (replace with actual value)

payload = {
  "model": "qwen-voice-enrollment", # Do not change this value
  "input": {
    "action": "delete",
    "voice": voice_to_delete
  }
}

headers = {
  "Authorization": f"Bearer {api_key}",
  "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)

print("HTTP status code:", response.status_code)

if response.status_code == 200:
  data = response.json()
  request_id = data["request_id"]

  print("Deletion successful")
  print(f"Request ID: {request_id}")
else:
  print("Request failed:", response.text)

import com.google.gson.Gson;
import com.google.gson.JsonObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class Main {
  public static void main(String[] args) {
    String apiKey = System.getenv("DASHSCOPE_API_KEY");
    String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
    String voiceToDelete = "yourVoice"; // Voice to delete (replace with actual value)

    String jsonPayload =
        "{"
            + "\"model\": \"qwen-voice-enrollment\"," // Do not change this value
            + "\"input\": {"
            +     "\"action\": \"delete\","
            +     "\"voice\": \"" + voiceToDelete + "\""
            + "}"
            + "}";

    try {
      HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
      con.setRequestMethod("POST");
      con.setRequestProperty("Authorization", "Bearer " + apiKey);
      con.setRequestProperty("Content-Type", "application/json");
      con.setDoOutput(true);

      try (OutputStream os = con.getOutputStream()) {
        os.write(jsonPayload.getBytes("UTF-8"));
      }

      int status = con.getResponseCode();
      BufferedReader br = new BufferedReader(new InputStreamReader(
          status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));

      StringBuilder response = new StringBuilder();
      String line;
      while ((line = br.readLine()) != null) {
        response.append(line);
      }
      br.close();

      System.out.println("HTTP status code: " + status);
      System.out.println("Response JSON: " + response.toString());

      if (status == 200) {
        Gson gson = new Gson();
        JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
        String requestId = jsonObj.get("request_id").getAsString();

        System.out.println("Deletion successful");
        System.out.println("Request ID: " + requestId);
      }

    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Speech synthesis

To use cloned voices for synthesis, see the end-to-end examples or the full docs:

Bidirectional streaming: Realtime streaming TTS
Non-streaming / unidirectional streaming: Speech synthesis - Qwen

Voice quota and retention

Account limit: 1,000 voices per account. Call List voices to check your count.
Automatic cleanup: Voices unused for over one year are automatically deleted.
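
The account limit can be turned into a simple pre-flight check before creating a voice. This is a minimal sketch: `remaining_quota` is a hypothetical helper, and `voice_count` is assumed to be the total number of entries gathered by paging through List voices.

```python
ACCOUNT_VOICE_LIMIT = 1000  # from the account limit stated above


def remaining_quota(voice_count: int) -> int:
    """Return how many more voices can be created before hitting the limit."""
    if voice_count < 0:
        raise ValueError("voice_count cannot be negative")
    return max(ACCOUNT_VOICE_LIMIT - voice_count, 0)


print(remaining_quota(998))
```

When the result is 0, delete a voice that is no longer needed before sending another create request.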

Copyright and legality

You are responsible for the ownership and legal rights to any voice you provide. Read the Terms of Service before using this API.
