Use the Python SDK to create, list, query, update, and delete custom vocabularies for speech recognition.
User guide: Custom hotwords.
Prerequisites
- An API key configured as the DASHSCOPE_API_KEY environment variable
- The latest DashScope SDK
Service URL
Set the base URL before creating the service:
VocabularyService class
Package: dashscope.audio.asr.VocabularyService
Manages the lifecycle of custom vocabularies (create, list, query, update, delete).
Constructor
If api_key is not passed, the SDK uses the global dashscope.api_key.
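The setup described so far can be sketched as a small helper, assuming the DashScope SDK is installed. The name make_vocabulary_service and its base_url parameter are illustrative and not part of the SDK. Pass the base URL that the service documentation gives for your region.

```python
def make_vocabulary_service(base_url, api_key=None):
    """Set the DashScope base URL, then build a VocabularyService."""
    # Imported inside the function so this module still loads on a
    # machine where the SDK is not installed.
    import dashscope
    from dashscope.audio.asr import VocabularyService

    # The base URL has to be set before the service is created.
    dashscope.base_http_api_url = base_url

    if api_key is None:
        # No key passed: the SDK uses the global dashscope.api_key.
        return VocabularyService()
    return VocabularyService(api_key=api_key)
```

The later sketches in this reference take a service object as their first argument, so the value returned here can be passed to them directly.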
create_vocabulary()
Create a custom vocabulary.
| Parameter | Type | Required | Description |
|---|---|---|---|
| target_model | str | Yes | The speech recognition model that uses this vocabulary. Must match the model you specify when calling the speech recognition API. |
| prefix | str | Yes | A custom prefix for the vocabulary. Only lowercase letters and digits are allowed, max 10 characters. |
| vocabulary | List[dict] | Yes | A list of hotwords. See Hotword entry structure. |
Returns: str, the ID of the created vocabulary.
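Because the prefix rule is easy to violate, it may help to check it locally before calling the service. The following is a minimal sketch. The helpers is_valid_prefix and create_checked are hypothetical, and a minimum length of one character is an assumption, since the table only states the maximum.

```python
import re

# Lowercase ASCII letters and digits only, 1 to 10 characters.
# The lower bound of 1 is an assumption; the documented rule gives only the maximum.
_PREFIX_RE = re.compile(r"[a-z0-9]{1,10}")


def is_valid_prefix(prefix):
    """Return True if prefix satisfies the documented prefix rule."""
    return isinstance(prefix, str) and _PREFIX_RE.fullmatch(prefix) is not None


def create_checked(service, target_model, prefix, vocabulary):
    """Validate the prefix locally, then create the vocabulary.

    service is expected to be a VocabularyService instance.
    Returns the new vocabulary ID as reported by create_vocabulary().
    """
    if not is_valid_prefix(prefix):
        raise ValueError(
            "prefix must contain only lowercase letters and digits, "
            "with at most 10 characters"
        )
    return service.create_vocabulary(
        target_model=target_model,
        prefix=prefix,
        vocabulary=vocabulary,
    )
```

Keep the returned ID: the query, update, and delete methods all identify the vocabulary by it.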
list_vocabularies()
List custom vocabularies with optional filtering and pagination.
The HTTP API uses the singular form list_vocabulary, while the Python method name uses the plural list_vocabularies.
| Parameter | Type | Required | Description |
|---|---|---|---|
| prefix | str | No | Filter by vocabulary prefix. |
| page_index | int | No | Page number, starting from 0. Default: 0. |
| page_size | int | No | Number of entries per page. Default: 10. |
Returns: List[dict], a list of vocabulary objects, each containing:
| Field | Type | Description |
|---|---|---|
| vocabulary_id | str | The vocabulary ID. |
| gmt_create | str | The creation time. |
| gmt_modified | str | The last modification time. |
| status | str | The vocabulary status. OK: ready for use. UNDEPLOYED: not available. |
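Since results are paginated, retrieving every vocabulary takes a loop over page_index. The sketch below shows one way to do it. The generator iter_vocabularies is a hypothetical helper, and it assumes that a page shorter than page_size (including an empty page) marks the end of the results, which this reference does not state explicitly.

```python
def iter_vocabularies(service, prefix=None, page_size=10):
    """Yield every vocabulary object, fetching one page at a time.

    service is expected to be a VocabularyService instance.
    """
    page_index = 0  # page numbers start at 0
    while True:
        page = service.list_vocabularies(
            prefix=prefix,
            page_index=page_index,
            page_size=page_size,
        )
        for item in page:
            yield item
        # Assumption: a short or empty page means there is nothing left.
        if len(page) < page_size:
            break
        page_index += 1
```

For example, `[v["vocabulary_id"] for v in iter_vocabularies(service, prefix="demo01")]` would collect the IDs of all vocabularies created with that prefix.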
query_vocabulary()
Query details of a specific custom vocabulary.
| Parameter | Type | Required | Description |
|---|---|---|---|
| vocabulary_id | str | Yes | The ID of the custom vocabulary to query. |
Returns: dict, a vocabulary object containing:
| Field | Type | Description |
|---|---|---|
| vocabulary | List[dict] | The hotword list content. |
| target_model | str | The speech recognition model that uses this vocabulary. |
| gmt_create | str | The creation time. |
| gmt_modified | str | The last modification time. |
| status | str | The vocabulary status. OK: ready for use. UNDEPLOYED: not available. |
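The status field can be used to confirm that a vocabulary is usable before referencing it in a recognition request. The following is a sketch only. The helper wait_until_ready is hypothetical, and it assumes that a vocabulary reported as UNDEPLOYED may later become OK, which this reference does not confirm. The attempt count and interval are arbitrary defaults.

```python
import time


def wait_until_ready(service, vocabulary_id, attempts=10, interval_s=1.0,
                     sleep=time.sleep):
    """Poll query_vocabulary() until the status is OK.

    Returns the vocabulary details once the status is OK and raises
    TimeoutError if it never is within the given number of attempts.
    The sleep argument exists so callers can substitute their own
    waiting strategy.
    """
    for attempt in range(attempts):
        details = service.query_vocabulary(vocabulary_id)
        if details.get("status") == "OK":
            return details
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(interval_s)
    raise TimeoutError(
        "vocabulary %s was not ready after %d attempts" % (vocabulary_id, attempts)
    )
```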
update_vocabulary()
Update a custom vocabulary. This completely replaces the existing entries.
| Parameter | Type | Required | Description |
|---|---|---|---|
| vocabulary_id | str | Yes | The ID of the vocabulary to update. |
| vocabulary | List[dict] | Yes | The new vocabulary entries. See Hotword entry structure. |
Returns: None
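Because an update replaces the whole list, adding hotwords to an existing vocabulary means sending the old entries together with the new ones. One possible approach is to query, merge, and then update, as sketched below. The helpers merge_entries and append_hotwords are hypothetical, and treating entries with the same text as duplicates is an assumption made for this example.

```python
def merge_entries(existing, additions):
    """Combine two hotword lists, keyed by the text field.

    An addition with the same text as an existing entry replaces it
    in place; genuinely new entries are appended at the end.
    """
    merged = {entry["text"]: entry for entry in existing}
    for entry in additions:
        merged[entry["text"]] = entry
    return list(merged.values())


def append_hotwords(service, vocabulary_id, additions):
    """Add hotwords without losing the entries already stored.

    service is expected to be a VocabularyService instance.
    Returns the full list that was sent to update_vocabulary().
    """
    current = service.query_vocabulary(vocabulary_id)["vocabulary"]
    merged = merge_entries(current, additions)
    # update_vocabulary() replaces everything, so send the merged list.
    service.update_vocabulary(vocabulary_id, merged)
    return merged
```

Calling update_vocabulary() with only the new entries would silently drop the existing ones, which is the mistake this pattern is meant to avoid.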
delete_vocabulary()
Delete a custom vocabulary.
| Parameter | Type | Required | Description |
|---|---|---|---|
| vocabulary_id | str | Yes | The ID of the vocabulary to delete. |
Returns: None
Hotword entry structure
Each entry in the vocabulary list has the following fields:
| Field | Type | Required | Description |
|---|---|---|---|
| text | str | Yes | The vocabulary entry text. The text language must be supported by the selected model. Use actual words rather than arbitrary character combinations. Maximum length: 15 characters for text that includes non-ASCII characters, or 7 space-separated words for ASCII-only text. |
| weight | int | Yes | The vocabulary entry weight. Recommended value: 4. Valid values: 1 to 5. If recognition accuracy doesn't improve, increase the weight. An excessively high weight may reduce the recognition accuracy of other words. |
| lang | str | No | The language code of the audio to be recognized. When set, the system improves recognition of vocabulary entries in the specified language. If you can't determine the language in advance, leave this parameter unset. Valid values vary by model. Fun-ASR: zh (Chinese), en (English), ja (Japanese). |
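The field rules above can be checked locally before an entry is sent to the service. The sketch below is one way to do so. The function entry_problems is a hypothetical helper, and counting "characters" as Python string length (Unicode code points) is an assumption. The lang value is only type-checked, because the valid codes vary by model.

```python
def entry_problems(entry):
    """Return a list of rule violations for one hotword entry.

    An empty list means the entry passes these local checks.
    """
    problems = []

    text = entry.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("text is required")
    elif text.isascii():
        # ASCII-only text is limited by word count.
        if len(text.split()) > 7:
            problems.append("ASCII-only text exceeds 7 space-separated words")
    elif len(text) > 15:
        # Text containing non-ASCII characters is limited by length.
        problems.append("text with non-ASCII characters exceeds 15 characters")

    weight = entry.get("weight")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(weight, bool) or not isinstance(weight, int) or not 1 <= weight <= 5:
        problems.append("weight must be an integer from 1 to 5")

    lang = entry.get("lang")
    if lang is not None and not isinstance(lang, str):
        problems.append("lang must be a string when set")

    return problems
```

An entry such as `{"text": "DashScope", "weight": 4, "lang": "en"}` passes all three checks, while an entry with a weight of 6 would be reported.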