Python SDK

CosyVoice voice cloning Python SDK reference (VoiceEnrollmentService).

CosyVoice voice cloning is available through the DashScope Python SDK via the VoiceEnrollmentService class. This SDK covers voice cloning only; CosyVoice voice design and all Qwen voice cloning and voice design must use the HTTP API. User guide: Voice cloning.

Prerequisites

Service URL

Set the base URL before creating the service:
import dashscope

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

VoiceEnrollmentService class

Package: dashscope.audio.tts_v2.VoiceEnrollmentService

Manages the lifecycle of CosyVoice cloned voices (create, list, query, update, delete).

Constructor

VoiceEnrollmentService()

create_voice()

Create a cloned voice from audio.
def create_voice(self, target_model: str, prefix: str, url: str,
                 language_hints: List[str] = None,
                 max_prompt_audio_length: float = None,
                 enable_preprocess: bool = None) -> str
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| target_model | str | Yes | Speech synthesis model for the cloned voice. Must match the model in your synthesis calls. |
| prefix | str | Yes | Voice name prefix. Alphanumeric only, max 10 characters. Generated name format: {target_model}-{prefix}-{unique_id}. |
| url | str | Yes | Audio file URL for cloning. Must be publicly accessible. |
| language_hints | List[str] | No | Language hint for the audio. Only the first element is used. Default: ["zh"]. |
| max_prompt_audio_length | float | No | Max audio duration (seconds) after preprocessing. Range: [3.0, 30.0]. Default: 10.0. |
| enable_preprocess | bool | No | Enable audio preprocessing (noise reduction, enhancement). Default: False. |
Returns: str — the generated voice ID (voice_id).
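
The limits in the parameter table above (prefix format, audio length range) can be checked locally before a request is sent. The following is a minimal sketch, not part of the SDK: `validate_create_args` and `create_voice_checked` are hypothetical helper names, `service` is assumed to be a `VoiceEnrollmentService` instance, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re
from typing import Any, List, Optional

# Assumption: "alphanumeric" means ASCII letters and digits only.
_PREFIX_RE = re.compile(r"[A-Za-z0-9]{1,10}")


def validate_create_args(prefix: str,
                         max_prompt_audio_length: Optional[float] = None) -> None:
    """Raise ValueError if an argument breaks the documented limits."""
    if not _PREFIX_RE.fullmatch(prefix):
        raise ValueError(
            f"prefix must be 1-10 alphanumeric characters, got {prefix!r}")
    if (max_prompt_audio_length is not None
            and not 3.0 <= max_prompt_audio_length <= 30.0):
        raise ValueError(
            "max_prompt_audio_length must be within [3.0, 30.0] seconds")


def create_voice_checked(service: Any, target_model: str, prefix: str, url: str,
                         language_hints: Optional[List[str]] = None,
                         max_prompt_audio_length: Optional[float] = None,
                         enable_preprocess: Optional[bool] = None) -> str:
    """Validate locally, then call service.create_voice() (hypothetical wrapper)."""
    validate_create_args(prefix, max_prompt_audio_length)
    optional = {
        "language_hints": language_hints,
        "max_prompt_audio_length": max_prompt_audio_length,
        "enable_preprocess": enable_preprocess,
    }
    # Forward only the optional arguments the caller actually set,
    # so the service applies its own defaults for the rest.
    kwargs = {name: value for name, value in optional.items() if value is not None}
    return service.create_voice(target_model=target_model, prefix=prefix,
                                url=url, **kwargs)
```

In use, `service` would be `VoiceEnrollmentService()` imported from `dashscope.audio.tts_v2`. The server remains the authority on what it accepts; this check only catches obvious mistakes early.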

list_voice()

List cloned voices with optional filtering and pagination.
def list_voice(self, prefix: str = None, page_index: int = 0, page_size: int = 10) -> list
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prefix | str | No | Filter voices by name prefix. |
| page_index | int | No | Page number, starting from 0. Default: 0. |
| page_size | int | No | Results per page. Default: 10. |
Returns: list — list of voice objects.
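
Because list_voice() returns one page at a time, collecting every voice means walking the pages. The sketch below shows one way to do that; `iter_voices` is a hypothetical helper, `service` is assumed to be a `VoiceEnrollmentService` instance, and the stop condition (a page shorter than `page_size` is the last page) is an assumption, since the reference does not say how the end of the list is signaled.

```python
from typing import Any, Iterator, Optional


def iter_voices(service: Any, prefix: Optional[str] = None,
                page_size: int = 10) -> Iterator[Any]:
    """Yield every voice by requesting list_voice() pages from page_index 0."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    page_index = 0
    while True:
        page = service.list_voice(prefix=prefix, page_index=page_index,
                                  page_size=page_size)
        yield from page
        # Assumption: a page shorter than page_size (including an empty
        # page) means there is nothing left to fetch.
        if len(page) < page_size:
            break
        page_index += 1
```

Calling `list(iter_voices(service, prefix="demo"))` would gather all matching voices into one list. When the total is an exact multiple of `page_size`, the loop makes one extra request that returns an empty page.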

query_voice()

Query details of a specific cloned voice.
def query_voice(self, voice_id: str) -> dict
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | str | Yes | The voice ID to query. |
Returns: dict — voice details including status, resource_link, target_model, etc.
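
Since query_voice() reports a status field, a caller can poll it until a newly created voice becomes usable. The following is a sketch under stated assumptions: `wait_for_voice` is a hypothetical helper, `service` is assumed to be a `VoiceEnrollmentService` instance, and the status strings "OK" and "UNDEPLOYED" are assumptions that do not appear in this reference, so they are exposed as parameters to be adjusted to the values the service actually returns.

```python
import time
from typing import Any, Callable, Dict, Sequence


def wait_for_voice(service: Any, voice_id: str,
                   ready_statuses: Sequence[str] = ("OK",),           # assumed value
                   failed_statuses: Sequence[str] = ("UNDEPLOYED",),  # assumed value
                   timeout_s: float = 300.0,
                   interval_s: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]:
    """Poll query_voice() until the voice is ready, has failed, or time runs out."""
    deadline = clock() + timeout_s
    while True:
        info = service.query_voice(voice_id)
        status = info.get("status")
        if status in ready_statuses:
            return info
        if status in failed_statuses:
            raise RuntimeError(f"voice {voice_id} ended in status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"voice {voice_id} still in status {status!r}")
        # Any other status is treated as "still in progress".
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; the defaults use the standard library.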

update_voice()

Update a cloned voice with new audio.
def update_voice(self, voice_id: str, url: str,
                 language_hints: List[str] = None,
                 max_prompt_audio_length: float = None,
                 enable_preprocess: bool = None) -> None
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | str | Yes | The voice ID to update. |
| url | str | Yes | New audio file URL. Must be publicly accessible. |
| language_hints | List[str] | No | Language hint for the new audio. |
| max_prompt_audio_length | float | No | Max audio duration (seconds) after preprocessing. |
| enable_preprocess | bool | No | Enable audio preprocessing. |
Returns: None

delete_voice()

Delete a cloned voice.
def delete_voice(self, voice_id: str) -> None
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | str | Yes | The voice ID to delete. |
Returns: None
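
For short-lived voices, such as those created in tests, create_voice() and delete_voice() can be paired so that cleanup always runs. This is a minimal sketch, not an SDK feature: `temporary_voice` is a hypothetical helper and `service` is assumed to be a `VoiceEnrollmentService` instance.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def temporary_voice(service: Any, target_model: str, prefix: str,
                    url: str) -> Iterator[str]:
    """Create a cloned voice, yield its voice_id, and delete it on exit."""
    voice_id = service.create_voice(target_model=target_model,
                                    prefix=prefix, url=url)
    try:
        yield voice_id
    finally:
        # Runs even if the body raises, so the voice is not left behind.
        service.delete_voice(voice_id)
```

Used as `with temporary_voice(service, model, "tmp", audio_url) as voice_id:`, the voice exists only for the duration of the block.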