Use the returned voice name with Qwen TTS or Realtime streaming TTS.
For an overview of how voice design works, supported models and languages, and tips for writing effective voice descriptions, see Voice design guide.
The target_model in voice design must match the model in synthesis. Mismatched models cause failures.
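Because a voice is permanently bound to its target_model, it can help to keep the two together and fail fast on a mismatch before calling the synthesis API. A minimal sketch (the voice record below is illustrative, not a real API response):

```python
# Hypothetical sketch: store the target_model returned by voice design next to
# the voice name, and assert it matches the model passed at synthesis time.
voice_record = {
    "voice": "qwen-tts-vd-announcer-voice-20251201102800-a1b2",
    "target_model": "qwen3-tts-vd-realtime-2026-01-15",
}

def check_synthesis_model(record, synthesis_model):
    """Raise early instead of letting the synthesis call fail on a model mismatch."""
    if record["target_model"] != synthesis_model:
        raise ValueError(
            f"voice {record['voice']!r} is bound to {record['target_model']!r}, "
            f"not {synthesis_model!r}"
        )

check_synthesis_model(voice_record, "qwen3-tts-vd-realtime-2026-01-15")  # OK
```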
Prerequisites
- Get an API key and set it as an environment variable.
- Install the DashScope SDK (required only for the SDK examples).
API reference
All operations use the same endpoint and authentication. Set the action parameter to choose the operation.
Common request details
Endpoint
POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
Request headers
| Header | Type | Required | Description |
|---|---|---|---|
| Authorization | string | Yes | Bearer $DASHSCOPE_API_KEY |
| Content-Type | string | Yes | application/json |
Use the same account for voice design and synthesis.
Create a voice
Creates a custom voice from a text description and returns preview audio.
Request syntax
{
"model": "qwen-voice-design",
"input": {
"action": "create",
"target_model": "<target-synthesis-model>",
"voice_prompt": "<voice-description>",
"preview_text": "<text-for-preview-audio>",
"preferred_name": "<keyword-for-voice-name>",
"language": "<language-code>"
},
"parameters": {
"sample_rate": 24000,
"response_format": "wav"
}
}
model is the voice design model (always qwen-voice-design). target_model is the synthesis model that drives the created voice. Do not confuse them.
Request parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| model | string | -- | Yes | Voice design model. Fixed to qwen-voice-design. |
| action | string | -- | Yes | Operation type. Fixed to create. |
| target_model | string | -- | Yes | Synthesis model for the voice. Must match the model in subsequent synthesis calls. Values: qwen3-tts-vd-realtime-2026-01-15, qwen3-tts-vd-realtime-2025-12-16 (real-time), qwen3-tts-vd-2026-01-26 (non-real-time). |
| voice_prompt | string | -- | Yes | Voice description. Max 2,048 characters. Chinese and English only. See Write effective voice descriptions. |
| preview_text | string | -- | Yes | Text for the preview audio. Max 1,024 characters. Must be in a supported language. |
| preferred_name | string | -- | No | Keyword for the voice name (alphanumeric and underscores, max 16 characters). Appears in the generated voice name. Example: announcer produces qwen-tts-vd-announcer-voice-20251201102800-a1b2. |
| language | string | zh | No | Language code for the generated voice. Must match the preview_text language. Valid values: zh, en, de, it, pt, es, ja, ko, fr, ru. |
| sample_rate | int | 24000 | No | Sample rate in Hz for the preview audio. Valid values: 8000, 16000, 24000, 48000. |
| response_format | string | wav | No | Audio format for the preview. Valid values: pcm, wav, mp3, opus. |
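The limits in the table above can be checked client-side before sending a request. The sketch below mirrors the documented constraints for convenience only; the server remains the source of truth for validation:

```python
import re

# Documented value sets for the create operation.
SUPPORTED_LANGUAGES = {"zh", "en", "de", "it", "pt", "es", "ja", "ko", "fr", "ru"}
SAMPLE_RATES = {8000, 16000, 24000, 48000}
FORMATS = {"pcm", "wav", "mp3", "opus"}

def validate_create_request(voice_prompt, preview_text,
                            preferred_name=None, language="zh",
                            sample_rate=24000, response_format="wav"):
    """Return a list of validation errors (empty if the request looks valid)."""
    errors = []
    if not voice_prompt or len(voice_prompt) > 2048:
        errors.append("voice_prompt must be 1-2048 characters")
    if not preview_text or len(preview_text) > 1024:
        errors.append("preview_text must be 1-1024 characters")
    if preferred_name is not None and not re.fullmatch(r"[A-Za-z0-9_]{1,16}", preferred_name):
        errors.append("preferred_name: alphanumeric and underscores, max 16 characters")
    if language not in SUPPORTED_LANGUAGES:
        errors.append(f"unsupported language: {language}")
    if sample_rate not in SAMPLE_RATES:
        errors.append(f"unsupported sample_rate: {sample_rate}")
    if response_format not in FORMATS:
        errors.append(f"unsupported response_format: {response_format}")
    return errors
```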
Response example
{
"output": {
"preview_audio": {
"data": "{base64_encoded_audio}",
"sample_rate": 24000,
"response_format": "wav"
},
"target_model": "qwen3-tts-vd-realtime-2026-01-15",
"voice": "qwen-tts-vd-announcer-voice-20251201102800-a1b2"
},
"usage": {
"count": 1
},
"request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
Response parameters
| Parameter | Type | Description |
|---|---|---|
| voice | string | Generated voice name. Pass this as the voice parameter in the synthesis API. |
| preview_audio.data | string | Base64-encoded preview audio. |
| preview_audio.sample_rate | int | Sample rate of the preview audio (matches request or defaults to 24000). |
| preview_audio.response_format | string | Format of the preview audio (matches request or defaults to wav). |
| target_model | string | Synthesis model bound to this voice. |
| usage.count | int | Number of billed voice creations. Always 1 for a successful creation ($0.2 per count). |
| request_id | string | Request ID for troubleshooting. |
List voices
Returns a paginated list of voices under your account.
Request syntax
{
"model": "qwen-voice-design",
"input": {
"action": "list",
"page_size": 10,
"page_index": 0
}
}
Request parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| model | string | -- | Yes | Fixed to qwen-voice-design. |
| action | string | -- | Yes | Fixed to list. |
| page_index | integer | 0 | No | Page number (0-based). Range: 0 to 200. |
| page_size | integer | 10 | No | Results per page. Must be greater than 0. |
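Because results are paginated, collecting every voice means walking page_index until total_count is reached. A minimal sketch, where fetch_page is a stand-in for the HTTP call (it must return the parsed output object of a list response):

```python
def fetch_all_voices(fetch_page, page_size=10):
    """Collect every voice by iterating over pages until total_count is covered.

    fetch_page(page_index, page_size) is a placeholder for the list request
    shown in this document; it returns the parsed "output" object.
    """
    voices = []
    page_index = 0
    while True:
        output = fetch_page(page_index, page_size)
        voices.extend(output.get("voice_list", []))
        # Stop once the pages seen so far cover the reported total.
        if (page_index + 1) * page_size >= output["total_count"]:
            return voices
        page_index += 1
```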
Response example
{
"output": {
"page_index": 0,
"page_size": 2,
"total_count": 26,
"voice_list": [
{
"gmt_create": "2025-12-10 17:04:54",
"gmt_modified": "2025-12-10 17:04:54",
"language": "zh",
"preview_text": "Dear listeners, hello everyone. Welcome to today's program.",
"target_model": "qwen3-tts-vd-realtime-2026-01-15",
"voice": "qwen-tts-vd-announcer-voice-20251210170454-a1b2",
"voice_prompt": "A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, suitable for news broadcasting or documentary commentary."
}
]
},
"usage": {},
"request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
Response parameters
| Parameter | Type | Description |
|---|---|---|
| page_index | integer | Current page number. |
| page_size | integer | Entries per page. |
| total_count | integer | Total number of voices. |
| voice_list[].voice | string | Voice name. |
| voice_list[].target_model | string | Synthesis model bound to this voice. |
| voice_list[].language | string | Language code. |
| voice_list[].voice_prompt | string | Voice description. |
| voice_list[].preview_text | string | Preview text. |
| voice_list[].gmt_create | string | Creation timestamp. |
| voice_list[].gmt_modified | string | Last modified timestamp. |
| request_id | string | Request ID. |
Query a voice
Returns details about a specific voice.
Request syntax
{
"model": "qwen-voice-design",
"input": {
"action": "query",
"voice": "<voice-name>"
}
}
Request parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| model | string | -- | Yes | Fixed to qwen-voice-design. |
| action | string | -- | Yes | Fixed to query. |
| voice | string | -- | Yes | Voice name to query. |
Response example (voice found)
{
"output": {
"gmt_create": "2025-12-10 14:54:09",
"gmt_modified": "2025-12-10 17:47:48",
"language": "zh",
"preview_text": "Dear listeners, hello everyone.",
"target_model": "qwen3-tts-vd-realtime-2026-01-15",
"voice": "qwen-tts-vd-announcer-voice-20251210145409-a1b2",
"voice_prompt": "A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, suitable for news broadcasting or documentary commentary."
},
"usage": {},
"request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
Response example (voice not found)
If the voice does not exist, the API returns HTTP 400 with VoiceNotFound:
{
"request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"code": "VoiceNotFound",
"message": "Voice not found: qwen-tts-vd-announcer-voice-xxxx"
}
Response parameters
| Parameter | Type | Description |
|---|---|---|
| voice | string | Voice name. |
| target_model | string | Synthesis model bound to this voice. |
| language | string | Language code. |
| voice_prompt | string | Voice description. |
| preview_text | string | Preview text. |
| gmt_create | string | Creation time. |
| gmt_modified | string | Last modification time. |
| request_id | string | Request ID. |
Delete a voice
Deletes a voice and releases its quota.
Request syntax
{
"model": "qwen-voice-design",
"input": {
"action": "delete",
"voice": "<voice-name>"
}
}
Request parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| model | string | -- | Yes | Fixed to qwen-voice-design. |
| action | string | -- | Yes | Fixed to delete. |
| voice | string | -- | Yes | Voice name to delete. |
Response example
{
"output": {
"voice": "qwen-tts-vd-announcer-voice-20251210145409-a1b2"
},
"usage": {},
"request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
Response parameters
| Parameter | Type | Description |
|---|---|---|
| voice | string | Deleted voice name. |
| request_id | string | Request ID. |
Sample code
Create a voice and preview
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-voice-design",
"input": {
"action": "create",
"target_model": "qwen3-tts-vd-realtime-2026-01-15",
"voice_prompt": "A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, suitable for news broadcasting or documentary commentary.",
"preview_text": "Dear listeners, hello everyone. Welcome to the evening news.",
"preferred_name": "announcer",
"language": "en"
},
"parameters": {
"sample_rate": 24000,
"response_format": "wav"
}
}'
import requests
import base64
import os
def create_voice():
"""Create a custom voice and save the preview audio."""
# Load API key from environment variable
api_key = os.getenv("DASHSCOPE_API_KEY")
if not api_key:
print("Error: DASHSCOPE_API_KEY not set.")
return None, None
data = {
"model": "qwen-voice-design",
"input": {
"action": "create",
"target_model": "qwen3-tts-vd-realtime-2026-01-15",
"voice_prompt": "A composed middle-aged male announcer with a deep, rich "
"and magnetic voice, a steady speaking speed and clear "
"articulation, suitable for news broadcasting or "
"documentary commentary.",
"preview_text": "Dear listeners, hello everyone. Welcome to the evening news.",
"preferred_name": "announcer",
"language": "en"
},
"parameters": {
"sample_rate": 24000,
"response_format": "wav"
}
}
response = requests.post(
"https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
},
json=data,
timeout=60
)
if response.status_code == 200:
result = response.json()
voice_name = result["output"]["voice"]
audio_bytes = base64.b64decode(result["output"]["preview_audio"]["data"])
# Save preview audio
filename = f"{voice_name}_preview.wav"
with open(filename, "wb") as f:
f.write(audio_bytes)
print(f"Voice created: {voice_name}")
print(f"Preview saved to: {filename}")
return voice_name, filename
else:
print(f"Request failed ({response.status_code}): {response.text}")
return None, None
if __name__ == "__main__":
create_voice()
Add the Gson dependency to your project:
<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.13.1</version>
</dependency>
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;
public class Main {
public static void main(String[] args) {
new Main().createVoice();
}
public void createVoice() {
// Load API key from environment variable
String apiKey = System.getenv("DASHSCOPE_API_KEY");
String jsonBody = "{\n" +
" \"model\": \"qwen-voice-design\",\n" +
" \"input\": {\n" +
" \"action\": \"create\",\n" +
" \"target_model\": \"qwen3-tts-vd-realtime-2026-01-15\",\n" +
" \"voice_prompt\": \"A composed middle-aged male announcer with a deep, rich and magnetic voice, a steady speaking speed and clear articulation, suitable for news broadcasting or documentary commentary.\",\n" +
" \"preview_text\": \"Dear listeners, hello everyone. Welcome to the evening news.\",\n" +
" \"preferred_name\": \"announcer\",\n" +
" \"language\": \"en\"\n" +
" },\n" +
" \"parameters\": {\n" +
" \"sample_rate\": 24000,\n" +
" \"response_format\": \"wav\"\n" +
" }\n" +
"}";
HttpURLConnection connection = null;
try {
URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("POST");
connection.setRequestProperty("Authorization", "Bearer " + apiKey);
connection.setRequestProperty("Content-Type", "application/json");
connection.setDoOutput(true);
// Send request body
try (OutputStream os = connection.getOutputStream()) {
os.write(jsonBody.getBytes("UTF-8"));
os.flush();
}
int responseCode = connection.getResponseCode();
if (responseCode == HttpURLConnection.HTTP_OK) {
StringBuilder response = new StringBuilder();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
String line;
while ((line = br.readLine()) != null) {
response.append(line.trim());
}
}
// Parse response and save preview audio
JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();
JsonObject output = jsonResponse.getAsJsonObject("output");
String voiceName = output.get("voice").getAsString();
String base64Audio = output.getAsJsonObject("preview_audio").get("data").getAsString();
byte[] audioBytes = Base64.getDecoder().decode(base64Audio);
String filename = voiceName + "_preview.wav";
try (FileOutputStream fos = new FileOutputStream(filename)) {
fos.write(audioBytes);
}
System.out.println("Voice created: " + voiceName);
System.out.println("Preview saved to: " + filename);
} else {
StringBuilder error = new StringBuilder();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
String line;
while ((line = br.readLine()) != null) {
error.append(line.trim());
}
}
System.out.println("Request failed (" + responseCode + "): " + error);
}
} catch (Exception e) {
System.err.println("Error: " + e.getMessage());
e.printStackTrace();
} finally {
if (connection != null) connection.disconnect();
}
}
}
Use a custom voice for synthesis
After creating a voice, pass the returned voice name to the synthesis API. The model must match the target_model from voice design.
Bidirectional streaming (real-time)
Use with qwen3-tts-vd-realtime-2026-01-15. See Realtime streaming TTS for details.
# pyaudio installation:
# macOS: brew install portaudio && pip install pyaudio
# Ubuntu: sudo apt-get install python3-pyaudio (or pip install pyaudio)
# CentOS: sudo yum install -y portaudio portaudio-devel && pip install pyaudio
# Windows: python -m pip install pyaudio
import pyaudio
import os
import base64
import threading
import time
import dashscope
from dashscope.audio.qwen_tts_realtime import QwenTtsRealtime, QwenTtsRealtimeCallback, AudioFormat
TEXT_TO_SYNTHESIZE = [
"Right? I really like this kind of supermarket,",
"especially during the New Year.",
"Going to the supermarket",
"just makes me feel",
"super, super happy!",
"I want to buy so many things!"
]
def init_dashscope_api_key():
"""Load the API key from environment variable."""
dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")
class MyCallback(QwenTtsRealtimeCallback):
"""Callback for streaming TTS playback."""
def __init__(self):
self.complete_event = threading.Event()
self._player = pyaudio.PyAudio()
self._stream = self._player.open(
format=pyaudio.paInt16, channels=1, rate=24000, output=True
)
def on_open(self) -> None:
print("[TTS] Connection established")
def on_close(self, close_status_code, close_msg) -> None:
self._stream.stop_stream()
self._stream.close()
self._player.terminate()
print(f"[TTS] Connection closed, code={close_status_code}, msg={close_msg}")
def on_event(self, response: dict) -> None:
event_type = response.get("type", "")
if event_type == "session.created":
print(f'[TTS] Session started: {response["session"]["id"]}')
elif event_type == "response.audio.delta":
audio_data = base64.b64decode(response["delta"])
self._stream.write(audio_data)
elif event_type == "response.done":
print(f"[TTS] Response complete, ID: {qwen_tts_realtime.get_last_response_id()}")
elif event_type == "session.finished":
print("[TTS] Session finished")
self.complete_event.set()
def wait_for_finished(self):
self.complete_event.wait()
if __name__ == "__main__":
init_dashscope_api_key()
callback = MyCallback()
qwen_tts_realtime = QwenTtsRealtime(
model="qwen3-tts-vd-realtime-2026-01-15",
callback=callback,
url="wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime"
)
qwen_tts_realtime.connect()
qwen_tts_realtime.update_session(
voice="<your-voice-name>", # Replace with your voice design voice name
response_format=AudioFormat.PCM_24000HZ_MONO_16BIT,
mode="server_commit"
)
for text_chunk in TEXT_TO_SYNTHESIZE:
print(f"[Sending text]: {text_chunk}")
qwen_tts_realtime.append_text(text_chunk)
time.sleep(0.1)
qwen_tts_realtime.finish()
callback.wait_for_finished()
print(f"[Metric] session_id={qwen_tts_realtime.get_session_id()}, "
f"first_audio_delay={qwen_tts_realtime.get_first_audio_delay()}s")
import com.alibaba.dashscope.audio.qwen_tts_realtime.*;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.JsonObject;
import javax.sound.sampled.*;
import java.util.Base64;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
public class Main {
private static String[] textToSynthesize = {
"Right? I really like this kind of supermarket,",
"especially during the New Year.",
"Going to the supermarket",
"just makes me feel",
"super, super happy!",
"I want to buy so many things!"
};
// Real-time PCM audio player
public static class RealtimePcmPlayer {
private int sampleRate;
private SourceDataLine line;
private Thread decoderThread;
private Thread playerThread;
private AtomicBoolean stopped = new AtomicBoolean(false);
private Queue<String> b64AudioBuffer = new ConcurrentLinkedQueue<>();
private Queue<byte[]> rawAudioBuffer = new ConcurrentLinkedQueue<>();
public RealtimePcmPlayer(int sampleRate) throws LineUnavailableException {
this.sampleRate = sampleRate;
AudioFormat audioFormat = new AudioFormat(this.sampleRate, 16, 1, true, false);
DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
line = (SourceDataLine) AudioSystem.getLine(info);
line.open(audioFormat);
line.start();
decoderThread = new Thread(() -> {
while (!stopped.get()) {
String b64Audio = b64AudioBuffer.poll();
if (b64Audio != null) {
rawAudioBuffer.add(Base64.getDecoder().decode(b64Audio));
} else {
try { Thread.sleep(100); } catch (InterruptedException e) { throw new RuntimeException(e); }
}
}
});
playerThread = new Thread(() -> {
while (!stopped.get()) {
byte[] rawAudio = rawAudioBuffer.poll();
if (rawAudio != null) {
int bytesWritten = 0;
while (bytesWritten < rawAudio.length) {
bytesWritten += line.write(rawAudio, bytesWritten, rawAudio.length - bytesWritten);
}
int audioLength = rawAudio.length / (this.sampleRate * 2 / 1000);
// Clamp to zero so short chunks (under ~10 ms) do not produce a negative sleep.
try { Thread.sleep(Math.max(0, audioLength - 10)); } catch (InterruptedException e) { throw new RuntimeException(e); }
} else {
try { Thread.sleep(100); } catch (InterruptedException e) { throw new RuntimeException(e); }
}
}
});
decoderThread.start();
playerThread.start();
}
public void write(String b64Audio) { b64AudioBuffer.add(b64Audio); }
public void waitForComplete() throws InterruptedException {
while (!b64AudioBuffer.isEmpty() || !rawAudioBuffer.isEmpty()) { Thread.sleep(100); }
line.drain();
}
public void shutdown() throws InterruptedException {
stopped.set(true);
decoderThread.join();
playerThread.join();
if (line != null && line.isRunning()) { line.drain(); line.close(); }
}
}
public static void main(String[] args) throws Exception {
QwenTtsRealtimeParam param = QwenTtsRealtimeParam.builder()
.model("qwen3-tts-vd-realtime-2026-01-15")
.url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
.apikey(System.getenv("DASHSCOPE_API_KEY"))
.build();
AtomicReference<CountDownLatch> completeLatch = new AtomicReference<>(new CountDownLatch(1));
RealtimePcmPlayer audioPlayer = new RealtimePcmPlayer(24000);
QwenTtsRealtime qwenTtsRealtime = new QwenTtsRealtime(param, new QwenTtsRealtimeCallback() {
@Override
public void onOpen() { }
@Override
public void onEvent(JsonObject message) {
String type = message.get("type").getAsString();
switch (type) {
case "response.audio.delta":
audioPlayer.write(message.get("delta").getAsString());
break;
case "session.finished":
completeLatch.get().countDown();
break;
}
}
@Override
public void onClose(int code, String reason) { }
});
try {
qwenTtsRealtime.connect();
} catch (NoApiKeyException e) {
throw new RuntimeException(e);
}
QwenTtsRealtimeConfig config = QwenTtsRealtimeConfig.builder()
.voice("<your-voice-name>") // Replace with your voice design voice name
.responseFormat(QwenTtsRealtimeAudioFormat.PCM_24000HZ_MONO_16BIT)
.mode("server_commit")
.build();
qwenTtsRealtime.updateSession(config);
for (String text : textToSynthesize) {
qwenTtsRealtime.appendText(text);
Thread.sleep(100);
}
qwenTtsRealtime.finish();
completeLatch.get().await();
audioPlayer.waitForComplete();
audioPlayer.shutdown();
System.exit(0);
}
}
Non-streaming and unidirectional streaming
Use with qwen3-tts-vd-2026-01-26. Pass the returned voice name to the synthesis API with the matching model. See Qwen TTS for details and code examples.
Query voices
# Query a specific voice
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-voice-design",
"input": {
"action": "query",
"voice": "<your-voice-name>"
}
}'
# List all voices (paginated)
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-voice-design",
"input": {
"action": "list",
"page_size": 10,
"page_index": 0
}
}'
import requests
import os
def query_voice(voice_name):
"""Get details for a specific voice."""
api_key = os.getenv("DASHSCOPE_API_KEY")
response = requests.post(
"https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
},
json={
"model": "qwen-voice-design",
"input": {
"action": "query",
"voice": voice_name
}
}
)
if response.status_code == 200:
result = response.json()
print(f"Voice: {result['output']['voice']}")
print(f"Model: {result['output']['target_model']}")
print(f"Created: {result['output']['gmt_create']}")
return result
else:
error = response.json()
if error.get("code") == "VoiceNotFound":
print(f"Voice not found: {voice_name}")
else:
print(f"Request failed ({response.status_code}): {response.text}")
return None
def list_voices(page_index=0, page_size=10):
"""List all voices with pagination."""
api_key = os.getenv("DASHSCOPE_API_KEY")
response = requests.post(
"https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
},
json={
"model": "qwen-voice-design",
"input": {
"action": "list",
"page_size": page_size,
"page_index": page_index
}
}
)
if response.status_code == 200:
result = response.json()
total = result["output"]["total_count"]
voices = result["output"]["voice_list"]
print(f"Total voices: {total}")
for v in voices:
print(f" - {v['voice']} ({v['language']}, {v['target_model']})")
return result
else:
print(f"Request failed ({response.status_code}): {response.text}")
return None
if __name__ == "__main__":
list_voices()
Query a specific voice:
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
public class Main {
public static void main(String[] args) {
Main example = new Main();
String voiceName = "<your-voice-name>"; // Replace with the actual voice name
System.out.println("Querying voice: " + voiceName);
example.queryVoice(voiceName);
}
public void queryVoice(String voiceName) {
String apiKey = System.getenv("DASHSCOPE_API_KEY");
String jsonBody = "{\n" +
" \"model\": \"qwen-voice-design\",\n" +
" \"input\": {\n" +
" \"action\": \"query\",\n" +
" \"voice\": \"" + voiceName + "\"\n" +
" }\n" +
"}";
HttpURLConnection connection = null;
try {
URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("POST");
connection.setRequestProperty("Authorization", "Bearer " + apiKey);
connection.setRequestProperty("Content-Type", "application/json");
connection.setDoOutput(true);
connection.setDoInput(true);
try (OutputStream os = connection.getOutputStream()) {
byte[] input = jsonBody.getBytes("UTF-8");
os.write(input, 0, input.length);
os.flush();
}
int responseCode = connection.getResponseCode();
if (responseCode == HttpURLConnection.HTTP_OK) {
StringBuilder response = new StringBuilder();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
String responseLine;
while ((responseLine = br.readLine()) != null) {
response.append(responseLine.trim());
}
}
JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();
if (jsonResponse.has("code") && "VoiceNotFound".equals(jsonResponse.get("code").getAsString())) {
String errorMessage = jsonResponse.has("message") ?
jsonResponse.get("message").getAsString() : "Voice not found";
System.out.println("Voice not found: " + voiceName);
System.out.println("Error message: " + errorMessage);
return;
}
JsonObject outputObj = jsonResponse.getAsJsonObject("output");
System.out.println("Successfully queried voice information:");
System.out.println(" Voice Name: " + outputObj.get("voice").getAsString());
System.out.println(" Creation Time: " + outputObj.get("gmt_create").getAsString());
System.out.println(" Modification Time: " + outputObj.get("gmt_modified").getAsString());
System.out.println(" Language: " + outputObj.get("language").getAsString());
System.out.println(" Preview Text: " + outputObj.get("preview_text").getAsString());
System.out.println(" Model: " + outputObj.get("target_model").getAsString());
System.out.println(" Voice Description: " + outputObj.get("voice_prompt").getAsString());
} else {
StringBuilder errorResponse = new StringBuilder();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
String responseLine;
while ((responseLine = br.readLine()) != null) {
errorResponse.append(responseLine.trim());
}
}
System.out.println("Request failed with status code: " + responseCode);
System.out.println("Error response: " + errorResponse.toString());
}
} catch (Exception e) {
System.err.println("An error occurred during the request: " + e.getMessage());
e.printStackTrace();
} finally {
if (connection != null) {
connection.disconnect();
}
}
}
}
List all voices (paginated):
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
public class Main {
public static void main(String[] args) {
String apiKey = System.getenv("DASHSCOPE_API_KEY");
String apiUrl = "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization";
String jsonPayload =
"{"
+ "\"model\": \"qwen-voice-design\","
+ "\"input\": {"
+ "\"action\": \"list\","
+ "\"page_size\": 10,"
+ "\"page_index\": 0"
+ "}"
+ "}";
try {
HttpURLConnection con = (HttpURLConnection) new URL(apiUrl).openConnection();
con.setRequestMethod("POST");
con.setRequestProperty("Authorization", "Bearer " + apiKey);
con.setRequestProperty("Content-Type", "application/json");
con.setDoOutput(true);
try (OutputStream os = con.getOutputStream()) {
os.write(jsonPayload.getBytes("UTF-8"));
}
int status = con.getResponseCode();
BufferedReader br = new BufferedReader(new InputStreamReader(
status >= 200 && status < 300 ? con.getInputStream() : con.getErrorStream(), "UTF-8"));
StringBuilder response = new StringBuilder();
String line;
while ((line = br.readLine()) != null) {
response.append(line);
}
br.close();
System.out.println("HTTP Status Code: " + status);
if (status == 200) {
Gson gson = new Gson();
JsonObject jsonObj = gson.fromJson(response.toString(), JsonObject.class);
JsonArray voiceList = jsonObj.getAsJsonObject("output").getAsJsonArray("voice_list");
System.out.println("\nQueried voice list:");
for (int i = 0; i < voiceList.size(); i++) {
JsonObject voiceItem = voiceList.get(i).getAsJsonObject();
String voice = voiceItem.get("voice").getAsString();
String gmtCreate = voiceItem.get("gmt_create").getAsString();
String targetModel = voiceItem.get("target_model").getAsString();
System.out.printf("- Voice: %s Creation Time: %s Model: %s\n",
voice, gmtCreate, targetModel);
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
Delete a voice
curl -X POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-voice-design",
"input": {
"action": "delete",
"voice": "<your-voice-name>"
}
}'
import requests
import os
def delete_voice(voice_name):
"""Delete a voice and release the quota."""
api_key = os.getenv("DASHSCOPE_API_KEY")
response = requests.post(
"https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
},
json={
"model": "qwen-voice-design",
"input": {
"action": "delete",
"voice": voice_name
}
}
)
if response.status_code == 200:
print(f"Deleted: {voice_name}")
return True
else:
print(f"Request failed ({response.status_code}): {response.text}")
return False
if __name__ == "__main__":
delete_voice("<your-voice-name>")
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
public class Main {
public static void main(String[] args) {
Main example = new Main();
String voiceName = "<your-voice-name>"; // Replace with the actual voice name
System.out.println("Deleting voice: " + voiceName);
example.deleteVoice(voiceName);
}
public void deleteVoice(String voiceName) {
String apiKey = System.getenv("DASHSCOPE_API_KEY");
String jsonBody = "{\n" +
" \"model\": \"qwen-voice-design\",\n" +
" \"input\": {\n" +
" \"action\": \"delete\",\n" +
" \"voice\": \"" + voiceName + "\"\n" +
" }\n" +
"}";
HttpURLConnection connection = null;
try {
URL url = new URL("https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization");
connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("POST");
connection.setRequestProperty("Authorization", "Bearer " + apiKey);
connection.setRequestProperty("Content-Type", "application/json");
connection.setDoOutput(true);
connection.setDoInput(true);
try (OutputStream os = connection.getOutputStream()) {
byte[] input = jsonBody.getBytes("UTF-8");
os.write(input, 0, input.length);
os.flush();
}
int responseCode = connection.getResponseCode();
if (responseCode == HttpURLConnection.HTTP_OK) {
StringBuilder response = new StringBuilder();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
String responseLine;
while ((responseLine = br.readLine()) != null) {
response.append(responseLine.trim());
}
}
JsonObject jsonResponse = JsonParser.parseString(response.toString()).getAsJsonObject();
if (jsonResponse.has("code") && jsonResponse.get("code").getAsString().contains("VoiceNotFound")) {
String errorMessage = jsonResponse.has("message") ?
jsonResponse.get("message").getAsString() : "Voice not found";
System.out.println("Voice does not exist: " + voiceName);
System.out.println("Error message: " + errorMessage);
} else if (jsonResponse.has("usage")) {
System.out.println("Voice deleted successfully: " + voiceName);
String requestId = jsonResponse.has("request_id") ?
jsonResponse.get("request_id").getAsString() : "N/A";
System.out.println("Request ID: " + requestId);
} else {
System.out.println("Unexpected response format: " + response.toString());
}
} else {
StringBuilder errorResponse = new StringBuilder();
try (BufferedReader br = new BufferedReader(
new InputStreamReader(connection.getErrorStream(), "UTF-8"))) {
String responseLine;
while ((responseLine = br.readLine()) != null) {
errorResponse.append(responseLine.trim());
}
}
System.out.println("Request failed with status code: " + responseCode);
System.out.println("Error response: " + errorResponse.toString());
}
} catch (Exception e) {
System.err.println("An error occurred during the request: " + e.getMessage());
e.printStackTrace();
} finally {
if (connection != null) {
connection.disconnect();
}
}
}
}
Voice quota and cleanup
- Account limit: 1,000 voices per account. Check the total_count field in the List voices response.
- Automatic cleanup: Voices unused for synthesis in the past year are deleted automatically.
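To avoid hitting the account limit unexpectedly, the total_count from a List voices call can be compared against the 1,000-voice cap before creating new voices. A small sketch (the limit constant mirrors the documented cap):

```python
ACCOUNT_VOICE_LIMIT = 1000  # documented per-account limit

def remaining_quota(total_count, limit=ACCOUNT_VOICE_LIMIT):
    """Return how many more voices can be created under the account limit.

    total_count is the value reported by the List voices response.
    """
    return max(0, limit - total_count)
```

If remaining_quota returns 0, delete unused voices to release quota before creating new ones.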