Custom hotwords

Boost term recognition

Hotwords help the model recognize terms it might otherwise miss, such as business terms, product names, and proper nouns.

Hotwords overview

Submit hotwords as a JSON array of hotword objects.
Example: improve recognition of movie titles (Fun-ASR and Paraformer series models):
[
  {"text": "赛德克巴莱", "weight": 4, "lang": "zh"},
  {"text": "Seediq Bale", "weight": 4, "lang": "en"},
  {"text": "夏洛特烦恼", "weight": 4, "lang": "zh"},
  {"text": "Goodbye Mr. Loser", "weight": 4, "lang": "en"},
  {"text": "阙里人家", "weight": 4, "lang": "zh"},
  {"text": "Confucius' Family", "weight": 4, "lang": "en"}
]
Field descriptions:
Field | Type | Required | Description
----- | ---- | -------- | -----------
text | string | Yes | The hotword text. Must be supported by the selected model. Use actual words, not random characters. See length rules below.
weight | int | Yes | Priority weight, an integer from 1 to 5. Start with 4. Increase it if results are weak, but note that too high a weight can hurt recognition of other words.
lang | string | No | Language code. Boosts the hotword for a specific language. Leave it empty for auto-detection. See the model's API reference for supported codes. If you set language_hints, only hotwords with a matching lang take effect.
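The field rules in the table can be checked locally before a list is submitted. The following is a minimal sketch: `validate_hotword` is a hypothetical helper, not part of the DashScope SDK, and it covers only the constraints stated above (required fields, types, and the 1 to 5 weight range). It does not check whether a lang code is supported by the model.

```python
from typing import List


def validate_hotword(entry: dict) -> List[str]:
    """Return the problems found in one hotword object (empty list if none).

    Hypothetical helper, not part of the DashScope SDK. It checks only the
    field rules described in the table above.
    """
    problems = []

    text = entry.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("text is required and must be a non-empty string")

    weight = entry.get("weight")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(weight, int) or isinstance(weight, bool):
        problems.append("weight is required and must be an integer")
    elif not 1 <= weight <= 5:
        problems.append("weight must be between 1 and 5")

    lang = entry.get("lang")
    if lang is not None and not isinstance(lang, str):
        problems.append("lang, when present, must be a string")

    return problems
```

Running it over every entry of a list before calling the Create API surfaces malformed objects without a round trip to the service.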
Hotword text length rules:
  • Contains non-ASCII characters: Maximum 15 characters total, including non-ASCII characters (Chinese, Japanese kana, Korean Hangul, Russian Cyrillic) and ASCII characters. Examples:
    • "厄洛替尼盐酸盐" (7 Chinese characters)
    • "EGFR抑制剂" (3 Chinese characters and 4 ASCII characters, for a total of 7 characters)
    • "こんにちは" (5 characters)
    • "Фенибут Белфарм" (15 characters, including the space)
    • "Клофелин Белмедпрепараты" (24 characters; exceeds the limit)
  • Contains only ASCII characters: Maximum 7 segments. A segment is a sequence of characters separated by spaces. Examples:
    • "Exothermic reaction" (2 segments)
    • "Human immunodeficiency virus type 1" (5 segments)
    • "The effect of temperature variations on enzyme activity in biochemical reactions" (11 segments; exceeds the limit)
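
The two length rules can be expressed as a small check. This is a sketch under stated assumptions: `check_hotword_length` is a hypothetical helper, not an SDK function, and it assumes the service counts characters the way Python's `len()` does (one per Unicode code point).

```python
def check_hotword_length(text: str) -> bool:
    """Return True if text satisfies the hotword length rules above.

    Hypothetical helper, not part of the DashScope SDK. Assumes that one
    "character" corresponds to one Unicode code point.
    """
    if text.isascii():
        # ASCII-only text: at most 7 space-separated segments.
        return len(text.split()) <= 7
    # Text containing any non-ASCII character: at most 15 characters in
    # total, counting ASCII characters and spaces as well.
    return len(text) <= 15


if __name__ == "__main__":
    for sample in ["EGFR抑制剂", "Клофелин Белмедпрепараты",
                   "Human immunodeficiency virus type 1"]:
        print(sample, check_hotword_length(sample))
```

The helper does not reject empty text; combine it with a required-field check if that matters.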

Supported models

Fun-ASR:
  • Real-time speech recognition: fun-asr-realtime, fun-asr-realtime-2025-11-07
  • Audio file recognition: fun-asr, fun-asr-2025-11-07, fun-asr-2025-08-25, fun-asr-mtl, fun-asr-mtl-2025-08-25

Billing

Hotwords are free.

Hotword quantity limits

  1. Each account can create up to 10 hotword lists, shared across all models. To increase this limit, submit a request.
  2. Each hotword list can have up to 500 words.

Getting started

Workflow

  1. Create a hotword list by calling the Create API. Set target_model (or targetModel in Java) to the speech recognition model you plan to use. If you already have a list, skip this step and call Query all to view it.
  2. Pass the hotword list ID to the speech recognition API. The model must match the target_model (or targetModel in Java) from step 1.
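
The "create, or reuse an existing list" decision in step 1 could be sketched as follows. `get_or_create_vocabulary` is a hypothetical helper; `service` is assumed to be a `VocabularyService` instance (or any object exposing the same `list_vocabularies`, `query_vocabulary`, and `create_vocabulary` methods documented in the API reference on this page). The sketch reads only the first page of results, which covers every list under the default limit of 10 lists per account.

```python
from typing import List


def get_or_create_vocabulary(service, prefix: str, target_model: str,
                             vocabulary: List[dict]) -> str:
    """Return the ID of a usable hotword list, creating one if none matches.

    Hypothetical helper. A list is considered usable when its status is
    "OK" and its target_model equals the requested model.
    """
    for item in service.list_vocabularies(prefix=prefix):
        vocabulary_id = item["vocabulary_id"]
        # The list response carries no target_model, so query each list.
        details = service.query_vocabulary(vocabulary_id)
        if (details.get("status") == "OK"
                and details.get("target_model") == target_model):
            return vocabulary_id
    # No matching list exists yet: create one for this model.
    return service.create_vocabulary(prefix=prefix,
                                     target_model=target_model,
                                     vocabulary=vocabulary)
```

Reusing a list this way avoids consuming the per-account quota each time the application starts. Note that a reused list keeps its stored hotwords; the `vocabulary` argument is used only when a new list is created.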

Prerequisites

  1. Get an API key: Get your API key and export it as the DASHSCOPE_API_KEY environment variable, which the code examples read.
  2. Install the SDK: Install the DashScope SDK.

Code examples

Audio file used in the examples: asr_example.wav.
  • Python
import os

import dashscope
from dashscope.audio.asr import Recognition, VocabularyService

# If you have not configured an environment variable, replace the following line with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.environ.get('DASHSCOPE_API_KEY')

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
dashscope.base_websocket_api_url = 'wss://dashscope-intl.aliyuncs.com/api-ws/v1/inference'

prefix = 'testpfx'
target_model = "fun-asr-realtime"

my_vocabulary = [
    {"text": "Speech Lab", "weight": 4}
]

# Step 1: create the hotword list for the model used in recognition.
service = VocabularyService()
vocabulary_id = service.create_vocabulary(
    prefix=prefix,
    target_model=target_model,
    vocabulary=my_vocabulary)

try:
    # Step 2: if the list is ready, pass its ID to the recognition call.
    if service.query_vocabulary(vocabulary_id)['status'] == 'OK':
        recognition = Recognition(model=target_model,
                                  format='wav',
                                  sample_rate=16000,
                                  callback=None,
                                  vocabulary_id=vocabulary_id)
        result = recognition.call('asr_example.wav')
        print(result.output)
finally:
    # Delete the list so the example does not use up the per-account quota.
    service.delete_vocabulary(vocabulary_id)

API reference

Use the same account for every operation on a hotword list (create, query, update, and delete).

Create a hotword list

For the hotword list JSON format, see Hotwords overview.
  • Python SDK
API description
target_model must match the model used in your speech recognition calls.
def create_vocabulary(self, target_model: str, prefix: str, vocabulary: List[dict]) -> str:
  '''
  Create a hotword list.
  param: target_model The speech recognition model (must match your recognition calls).
  param: prefix Custom prefix (fewer than 10 characters; lowercase letters and digits only).
  param: vocabulary The hotword list.
  return: The hotword list ID.
  '''
Code example
import os

import dashscope
from dashscope.audio.asr import VocabularyService

# If you have not configured an environment variable, replace the following line with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.environ.get('DASHSCOPE_API_KEY')

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

prefix = 'testpfx'
target_model = "fun-asr"

my_vocabulary = [
  {"text": "Seediq Bale", "weight": 4}
]

# Create a hotword list
service = VocabularyService()
vocabulary_id = service.create_vocabulary(
  prefix=prefix,
  target_model=target_model,
  vocabulary=my_vocabulary)

print(f"The hotword list ID is: {vocabulary_id}")
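
The prefix constraint in the API description can be checked before calling `create_vocabulary`. The sketch below assumes the rule means 1 to 9 characters, each a lowercase ASCII letter or a digit; `is_valid_prefix` is a hypothetical helper, not an SDK function.

```python
import re

# Assumed reading of the documented rule: 1 to 9 characters, each a
# lowercase ASCII letter or a digit.
_PREFIX_PATTERN = re.compile(r"[a-z0-9]{1,9}")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if prefix matches the assumed create_vocabulary rule."""
    return _PREFIX_PATTERN.fullmatch(prefix) is not None


if __name__ == "__main__":
    for candidate in ["testpfx", "TestPfx", "abcdefghij"]:
        print(candidate, is_valid_prefix(candidate))
```

`fullmatch` is used rather than `match` so that a valid start followed by extra characters is still rejected.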

Query all hotword lists

  • Python SDK
API description
def list_vocabularies(self, prefix=None, page_index: int = 0, page_size: int = 10) -> List[dict]:
  '''
  List all hotword lists.
  param: prefix Filter by prefix. Returns only matching lists.
  param: page_index Page index.
  param: page_size Page size.
  return: A list of hotword list identifiers.
  '''
Code example
import json
import os

import dashscope
from dashscope.audio.asr import VocabularyService

# If you have not configured an environment variable, replace the following line with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.environ.get('DASHSCOPE_API_KEY')

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

service = VocabularyService()
vocabularies = service.list_vocabularies()
print(f"Hotword list: {json.dumps(vocabularies)}")
Response example
[
  {
  "gmt_create": "2025-04-22 14:23:35",
  "vocabulary_id": "vocab-testpfx-5112c3de3705486baxxxxxxx",
  "gmt_modified": "2025-04-22 14:23:35",
  "status": "OK"
  }
]
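
When an account holds more lists than fit on one page, the `page_index` and `page_size` parameters can be used to walk through all of them. The following is a sketch, assuming that `page_index` is zero-based (its default is 0) and that a page shorter than `page_size` marks the end of the results; `list_all_vocabularies` is a hypothetical helper, and `service` is assumed to be a `VocabularyService` instance.

```python
from typing import List, Optional


def list_all_vocabularies(service, prefix: Optional[str] = None,
                          page_size: int = 10) -> List[dict]:
    """Collect the entries from every page of list_vocabularies.

    Hypothetical helper. Assumes zero-based page_index and that a short
    (or empty) page means there are no further results.
    """
    results: List[dict] = []
    page_index = 0
    while True:
        page = service.list_vocabularies(prefix=prefix,
                                         page_index=page_index,
                                         page_size=page_size)
        results.extend(page)
        if len(page) < page_size:
            return results
        page_index += 1
```

With the default limit of 10 lists per account and the default page size of 10, a single call to `list_vocabularies` is normally enough; the loop matters only after the limit has been raised.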

Query a specific hotword list

  • Python SDK
API description
def query_vocabulary(self, vocabulary_id: str) -> List[dict]:
  '''
  Get a hotword list by ID.
  param: vocabulary_id The hotword list ID.
  return: The hotword list.
  '''
Code example
import json
import os

import dashscope
from dashscope.audio.asr import VocabularyService

# If you have not configured an environment variable, replace the following line with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.environ.get('DASHSCOPE_API_KEY')

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

service = VocabularyService()
# Replace with your actual hotword list ID when querying.
vocabulary = service.query_vocabulary("vocab-testpfx-xxx")
print(f"Hotword list: {json.dumps(vocabulary, ensure_ascii=False)}")
Response example
{
  "gmt_create": "2025-12-19 11:47:11",
  "gmt_modified": "2025-12-19 11:47:11",
  "status": "OK",
  "target_model": "fun-asr",
  "vocabulary": [
  {
      "lang": "en",
      "text": "Seediq Bale",
      "weight": 4
  }
  ]
}

Update a hotword list

  • Python SDK
API description
def update_vocabulary(self, vocabulary_id: str, vocabulary: List[dict]) -> None:
  '''
  Replace a hotword list.
  param: vocabulary_id The hotword list ID to replace.
  param: vocabulary The new hotword list.
  '''
Code example
import os

import dashscope
from dashscope.audio.asr import VocabularyService

# If you have not configured an environment variable, replace the following line with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.environ.get('DASHSCOPE_API_KEY')

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

service = VocabularyService()
my_vocabulary = [
  {"text": "Seediq Bale", "weight": 4, "lang": "en"}
]
# Replace with your actual hotword list ID.
service.update_vocabulary("vocab-testpfx-xxx", my_vocabulary)
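
Because `update_vocabulary` replaces the whole list, adding a single hotword means reading the current entries, merging, and writing the full result back. The sketch below shows one way to do that. `add_hotword` is a hypothetical helper; `service` is assumed to be a `VocabularyService` instance, and the query response is assumed to carry the entries under the "vocabulary" key, as in the response example for querying a specific list.

```python
from typing import List

# Per-list limit stated under "Hotword quantity limits".
MAX_WORDS_PER_LIST = 500


def add_hotword(service, vocabulary_id: str, new_entry: dict) -> List[dict]:
    """Add or replace one hotword in an existing list; return the new list.

    Hypothetical helper, not part of the DashScope SDK.
    """
    current = service.query_vocabulary(vocabulary_id)["vocabulary"]
    # Keep every existing entry except one with the same text, then append
    # the new entry so it takes the place of any older version.
    merged = [e for e in current if e.get("text") != new_entry["text"]]
    merged.append(new_entry)
    if len(merged) > MAX_WORDS_PER_LIST:
        raise ValueError(
            f"a hotword list can hold at most {MAX_WORDS_PER_LIST} words")
    # update_vocabulary stores `merged` in place of the previous list.
    service.update_vocabulary(vocabulary_id, merged)
    return merged
```

The read-merge-write sequence is not atomic, so two processes updating the same list at the same time could overwrite each other's changes.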

Delete a hotword list

  • Python SDK
API description
def delete_vocabulary(self, vocabulary_id: str) -> None:
  '''
  Delete a hotword list.
  param: vocabulary_id The hotword list ID to delete.
  '''
Code example
import os

import dashscope
from dashscope.audio.asr import VocabularyService

# If you have not configured an environment variable, replace the following line with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.environ.get('DASHSCOPE_API_KEY')

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

service = VocabularyService()
# Replace with your actual hotword list ID.
service.delete_vocabulary("vocab-testpfx-xxx")