Fun-ASR recording Python SDK

User guide: For model details and recommendations, see Audio file recognition - Fun-ASR/Paraformer.

Prerequisites

To grant third-party apps temporary access, use a temporary token. Tokens expire after 60 seconds, which limits the impact of a leak.

Install the latest DashScope SDK.

Model availability

Model	Version	Unit price	Free quota
fun-asr (currently fun-asr-2025-11-07)	Stable	$0.000035/second	36,000 seconds (10 hours), valid for 90 days
fun-asr-2025-11-07 (improved far-field VAD over fun-asr-2025-08-25 for higher accuracy)	Snapshot	$0.000035/second	36,000 seconds (10 hours), valid for 90 days
fun-asr-2025-08-25	Snapshot	$0.000035/second	36,000 seconds (10 hours), valid for 90 days
fun-asr-mtl (currently fun-asr-mtl-2025-08-25)	Stable	$0.000035/second	36,000 seconds (10 hours), valid for 90 days
fun-asr-mtl-2025-08-25	Snapshot	$0.000035/second	36,000 seconds (10 hours), valid for 90 days
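
As a rough illustration of the pricing above, the following sketch estimates the charge for a given number of billable seconds. The function name `estimate_cost` is hypothetical, and the assumption that the free quota is consumed before any charge applies is ours, not something the table states.

```python
from decimal import Decimal

# Values copied from the table above.
UNIT_PRICE_PER_SECOND = Decimal("0.000035")  # USD per second
FREE_QUOTA_SECONDS = 36_000                  # 10 hours


def estimate_cost(billable_seconds: int,
                  free_seconds_left: int = FREE_QUOTA_SECONDS) -> Decimal:
    """Estimate the charge in USD, assuming the free quota is used up first."""
    if billable_seconds < 0 or free_seconds_left < 0:
        raise ValueError("durations must be non-negative")
    # Only the seconds beyond the remaining free quota are charged.
    charged_seconds = max(0, billable_seconds - free_seconds_left)
    return UNIT_PRICE_PER_SECOND * charged_seconds
```

`Decimal` is used instead of `float` so that small per-second prices do not pick up binary rounding noise.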

Supported languages:
- fun-asr, fun-asr-2025-11-07, fun-asr-mtl, and fun-asr-mtl-2025-08-25: Chinese (Mandarin, Cantonese, Wu, Minnan, Hakka, Gan, Xiang, and Jin; also supports Mandarin accents from Zhongyuan, Southwest, Jilu, Jianghuai, Lanyin, Jiaoliao, Northeast, Beijing, and Hong Kong-Taiwan regions -- including Henan, Shaanxi, Hubei, Sichuan, Chongqing, Yunnan, Guizhou, Guangdong, Guangxi, Hebei, Tianjin, Shandong, Anhui, Nanjing, Jiangsu, Hangzhou, Gansu, and Ningxia), English, Japanese, Korean, Vietnamese, Indonesian, Thai, Malay, Filipino, Arabic, Bulgarian, Croatian, Czech, Danish, Dutch, Estonian, Finnish, Greek, Hindi, Hungarian, Irish, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, and Swedish.
- fun-asr-2025-08-25: Mandarin and English.
Sample rates supported: Any
Audio formats supported: aac, amr, avi, flac, flv, m4a, mkv, mov, mp3, mp4, mpeg, ogg, opus, wav, webm, wma, wmv

Limitations

Files must be at public URLs (HTTP/HTTPS, such as https://your-domain.com/file.mp3). Local files and Base64 encoding are not supported. Pass URLs with the file_urls parameter. Up to 100 URLs per request.

Audio formats: aac, amr, avi, flac, flv, m4a, mkv, mov, mp3, mp4, mpeg, ogg, opus, wav, webm, wma, wmv

Not all format variants are tested. Test your files to verify results.

Audio sample rate: Any
File size and duration: Max 2 GB and 12 hours. For larger files, see Audio trimming.
Batch processing: Up to 100 URLs per request.
Languages: fun-asr, fun-asr-2025-11-07, fun-asr-mtl, and fun-asr-mtl-2025-08-25 support Chinese and 29 other languages. fun-asr-2025-08-25 supports Mandarin and English only. See Supported languages.
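
The URL and batch limits above can be checked on the client before a task is submitted. The sketch below is one possible pre-flight check; the helper names `is_acceptable_url` and `batch_urls` are hypothetical, and the extension test is only a heuristic, since a valid URL (for example a signed URL) may not end in a file extension.

```python
from typing import Iterator, List
from urllib.parse import urlparse

MAX_URLS_PER_REQUEST = 100
SUPPORTED_EXTENSIONS = {
    "aac", "amr", "avi", "flac", "flv", "m4a", "mkv", "mov", "mp3",
    "mp4", "mpeg", "ogg", "opus", "wav", "webm", "wma", "wmv",
}


def is_acceptable_url(url: str) -> bool:
    """Return True for an HTTP/HTTPS URL whose path ends in a listed extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Heuristic: take whatever follows the last dot in the path.
    extension = parsed.path.rsplit(".", 1)[-1].lower() if "." in parsed.path else ""
    return extension in SUPPORTED_EXTENSIONS


def batch_urls(urls: List[str],
               size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs, one slice per request."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```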

Request parameters

Pass these parameters to async_call on the Transcription class.

Parameter	Type	Default	Required	Description
model	str	-	Yes	Model ID. See Model availability.
file_urls	list[str]	-	Yes	Audio/video file URLs (HTTP/HTTPS). Up to 100 per request.
vocabulary_id	str	-	No	Hotword vocabulary ID for this task. Disabled by default. See Customize hotwords.
channel_id	list[int]	[0]	No	Audio track indexes to recognize (0-based). `[0]` = first track, `[0, 1]` = first and second. Each track is billed separately.
special_word_filter	str	-	No	Sensitive word filter config. See Sensitive word filter.
diarization_enabled	bool	False	No	Enable speaker diarization (single-channel only). Results include `speaker_id`. See Recognition result.
speaker_count	int	-	No	Expected speaker count (2-100). Only applies when `diarization_enabled` is true. Auto-detected by default. Guides the algorithm but does not guarantee exact count.
language_hints	list[str]	["zh", "en"]	No	Language codes. Leave unset for auto-detection. See Supported languages.
speech_noise_threshold	float	-	No	Speech noise threshold.
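
A minimal sketch of assembling these parameters follows. `build_request` and `submit` are hypothetical helpers; the client-side checks mirror the table, and rejecting `speaker_count` when diarization is off is a stricter choice made here for illustration, not documented service behavior. `submit` assumes the DashScope SDK is installed and an API key is configured.

```python
from typing import Any, Dict, List, Optional


def build_request(
    model: str,
    file_urls: List[str],
    diarization_enabled: bool = False,
    speaker_count: Optional[int] = None,
    language_hints: Optional[List[str]] = None,
    channel_id: Optional[List[int]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for async_call, omitting unset optional ones."""
    if not 1 <= len(file_urls) <= 100:
        raise ValueError("file_urls must contain 1 to 100 URLs")
    if speaker_count is not None:
        if not diarization_enabled:
            raise ValueError("speaker_count requires diarization_enabled=True")
        if not 2 <= speaker_count <= 100:
            raise ValueError("speaker_count must be between 2 and 100")
    params: Dict[str, Any] = {"model": model, "file_urls": file_urls}
    if diarization_enabled:
        params["diarization_enabled"] = True
    if speaker_count is not None:
        params["speaker_count"] = speaker_count
    if language_hints is not None:
        params["language_hints"] = language_hints
    if channel_id is not None:
        params["channel_id"] = channel_id
    return params


def submit(params: Dict[str, Any]):
    """Submit the task; needs the DashScope SDK and a configured API key."""
    from dashscope.audio.asr import Transcription  # imported lazily

    return Transcription.async_call(**params)
```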

Sensitive word filter

By default, words on the Qwen Cloud sensitive word list are replaced with asterisks (*). With special_word_filter, you can:

- Replace with *: Matched words become asterisks.
- Filter out: Matched words are removed.

Value must be a JSON string:

{
  "filter_with_signed": {
    "word_list": ["test"]
  },
  "filter_with_empty": {
    "word_list": ["start", "happen"]
  },
  "system_reserved_filter": true
}

Fields:

filter_with_signed (object, optional): Words to replace with *.
- Example: "Help me test this code" becomes "Help me **** this code"
- word_list: Words to replace.
filter_with_empty (object, optional): Words to remove.
- Example: "Is the game about to start?" becomes "Is the game about to?"
- word_list: Words to remove.
system_reserved_filter (boolean, optional, default: true): Enable system filtering. When true, words on the Qwen Cloud sensitive word list are replaced with *.
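
Because the value must be a JSON string rather than a dict, it is convenient to build it with `json.dumps`. The helper below is a sketch; the name `build_special_word_filter` and its argument names are hypothetical, while the JSON field names come from the description above.

```python
import json
from typing import Any, Dict, List, Optional


def build_special_word_filter(
    replace_words: Optional[List[str]] = None,
    remove_words: Optional[List[str]] = None,
    system_reserved_filter: bool = True,
) -> str:
    """Serialize a sensitive word filter config to the JSON string form."""
    config: Dict[str, Any] = {"system_reserved_filter": system_reserved_filter}
    if replace_words:
        # Words in this list are replaced with asterisks.
        config["filter_with_signed"] = {"word_list": list(replace_words)}
    if remove_words:
        # Words in this list are removed from the text.
        config["filter_with_empty"] = {"word_list": list(remove_words)}
    return json.dumps(config, ensure_ascii=False)
```

The returned string would then be passed as `special_word_filter` when submitting the task.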

Supported languages

Language codes by model:

fun-asr, fun-asr-2025-11-07, fun-asr-mtl, fun-asr-mtl-2025-08-25:
- zh: Chinese
- en: English
- ja: Japanese
- ko: Korean
- vi: Vietnamese
- id: Indonesian
- th: Thai
- ms: Malay
- tl: Filipino
- ar: Arabic
- bg: Bulgarian
- hr: Croatian
- cs: Czech
- da: Danish
- nl: Dutch
- et: Estonian
- fi: Finnish
- el: Greek
- hi: Hindi
- hu: Hungarian
- ga: Irish
- lv: Latvian
- lt: Lithuanian
- mt: Maltese
- pl: Polish
- pt: Portuguese
- ro: Romanian
- sk: Slovak
- sl: Slovenian
- sv: Swedish
fun-asr-2025-08-25:
- zh: Chinese
- en: English
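
The per-model code lists above lend themselves to a simple lookup that flags unsupported `language_hints` before a request is sent. This is an illustrative sketch; the names `LANGUAGE_CODES_BY_MODEL` and `unsupported_hints` are hypothetical, and how the service itself treats an unsupported hint is not described here.

```python
from typing import List

ALL_LANGUAGE_CODES = frozenset(
    "zh en ja ko vi id th ms tl ar bg hr cs da nl et fi el hi hu "
    "ga lv lt mt pl pt ro sk sl sv".split()
)

LANGUAGE_CODES_BY_MODEL = {
    "fun-asr": ALL_LANGUAGE_CODES,
    "fun-asr-2025-11-07": ALL_LANGUAGE_CODES,
    "fun-asr-mtl": ALL_LANGUAGE_CODES,
    "fun-asr-mtl-2025-08-25": ALL_LANGUAGE_CODES,
    "fun-asr-2025-08-25": frozenset({"zh", "en"}),
}


def unsupported_hints(model: str, language_hints: List[str]) -> List[str]:
    """Return the hints that the given model does not list as supported."""
    supported = LANGUAGE_CODES_BY_MODEL[model]
    return [code for code in language_hints if code not in supported]
```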

Response results

TranscriptionResponse

TranscriptionResponse contains task info (task_id, task_status) and results in output. See TranscriptionOutput.

Sample TranscriptionResponse structures for the PENDING, RUNNING, SUCCEEDED, and FAILED cases, in that order:

{
  "status_code": 200,
  "request_id": "251aceab-a6aa-9fc4-b7f7-0cc6d3e2a9f3",
  "code": null,
  "message": "",
  "output": {
    "task_id": "7d0a58a3-1dbe-4de9-8cff-5f48213128b0",
    "task_status": "PENDING",
    "submit_time": "2025-02-13 16:55:08.573",
    "scheduled_time": "2025-02-13 16:55:08.592",
    "task_metrics": {
      "TOTAL": 2,
      "SUCCEEDED": 0,
      "FAILED": 0
    }
  },
  "usage": null
}

{
  "status_code": 200,
  "request_id": "d9d530f1-853c-9848-a5f1-f5de59086ff7",
  "code": null,
  "message": "",
  "output": {
    "task_id": "6351feef-9694-45d2-9d32-63454f2ffb8d",
    "task_status": "RUNNING",
    "submit_time": "2025-02-13 17:31:20.681",
    "scheduled_time": "2025-02-13 17:31:20.703",
    "task_metrics": {
      "TOTAL": 2,
      "SUCCEEDED": 1,
      "FAILED": 0
    }
  },
  "usage": null
}

{
  "status_code": 200,
  "request_id": "16668704-6702-9e03-8ab7-a32a5d7bb095",
  "code": null,
  "message": "",
  "output": {
    "task_id": "6351feef-9694-45d2-9d32-63454f2ffb8d",
    "task_status": "SUCCEEDED",
    "submit_time": "2025-02-13 17:31:20.681",
    "scheduled_time": "2025-02-13 17:31:20.703",
    "end_time": "2025-02-13 17:31:21.867",
    "results": [
      {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav",
        "transcription_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/prod/paraformer-v2/...",
        "subtask_status": "SUCCEEDED"
      },
      {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_male2.wav",
        "transcription_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/prod/paraformer-v2/...",
        "subtask_status": "SUCCEEDED"
      }
    ],
    "task_metrics": {
      "TOTAL": 2,
      "SUCCEEDED": 2,
      "FAILED": 0
    }
  },
  "usage": {
    "duration": 9
  }
}

{
  "status_code": 200,
  "request_id": "16668704-6702-9e03-8ab7-a32a5d7bb095",
  "code": null,
  "message": "",
  "output": {
    "task_id": "7bac899c-06ec-4a79-8875-xxxxxxxxxxxx",
    "task_status": "SUCCEEDED",
    "submit_time": "2024-12-16 16:30:59.170",
    "scheduled_time": "2024-12-16 16:30:59.204",
    "end_time": "2024-12-16 16:31:02.375",
    "results": [
      {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/sensevoice/long_audio_demo_cn.mp3",
        "transcription_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/prod/paraformer-v2/20241216/xxxx",
        "subtask_status": "SUCCEEDED"
      },
      {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/sensevoice/rich_text_exaple_1.wav",
        "code": "InvalidFile.DownloadFailed",
        "message": "The audio file cannot be downloaded.",
        "subtask_status": "FAILED"
      }
    ],
    "task_metrics": {
      "TOTAL": 2,
      "SUCCEEDED": 1,
      "FAILED": 1
    }
  },
  "usage": {
    "duration": 9
  }
}

Key parameters:

Parameter	Description
status_code	HTTP status code.
code	Ignore top-level `code`. Check `output.results[].code` for errors.
message	Ignore top-level `message`. Check `output.results[].message` for errors.
task_id	Task ID.
task_status	Task status: `PENDING`, `RUNNING`, `SUCCEEDED`, `FAILED`. If any subtask succeeds, the task is `SUCCEEDED`. Check `subtask_status` for individual results.
results	Subtask results.
subtask_status	Subtask status: `PENDING`, `RUNNING`, `SUCCEEDED`, `FAILED`.
file_url	Audio file URL.
transcription_url	Result URL (JSON file). Download or read via HTTP. See Recognition result.
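
Since a task is reported as `SUCCEEDED` when any subtask succeeds, each entry in `results` has to be inspected individually. The sketch below works on the plain dict form of `output` shown in the samples; `split_results` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def split_results(output: Dict[str, Any]) -> Tuple[List[str], List[Dict[str, str]]]:
    """Return (transcription URLs of succeeded subtasks, details of failed ones)."""
    succeeded: List[str] = []
    failed: List[Dict[str, str]] = []
    for item in output.get("results", []):
        status = item.get("subtask_status")
        if status == "SUCCEEDED":
            succeeded.append(item["transcription_url"])
        elif status == "FAILED":
            # code and message are present only on failed subtasks.
            failed.append({
                "file_url": item.get("file_url", ""),
                "code": item.get("code", ""),
                "message": item.get("message", ""),
            })
    return succeeded, failed
```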

TranscriptionOutput

TranscriptionOutput is the output property of TranscriptionResponse.

Sample TranscriptionOutput structures for the PENDING, RUNNING, SUCCEEDED, and FAILED cases, in that order:

{
  "task_id": "f2f7c2fa-0cd9-4bb2-a283-27b26ee4bb67",
  "task_status": "PENDING",
  "submit_time": "2025-02-13 17:59:27.754",
  "scheduled_time": "2025-02-13 17:59:27.789",
  "task_metrics": {
    "TOTAL": 2,
    "SUCCEEDED": 0,
    "FAILED": 0
  }
}

{
  "task_id": "f2f7c2fa-0cd9-4bb2-a283-27b26ee4bb67",
  "task_status": "RUNNING",
  "submit_time": "2025-02-13 17:59:27.754",
  "scheduled_time": "2025-02-13 17:59:27.789",
  "task_metrics": {
    "TOTAL": 2,
    "SUCCEEDED": 0,
    "FAILED": 0
  }
}

{
  "task_id": "f2f7c2fa-0cd9-4bb2-a283-27b26ee4bb67",
  "task_status": "SUCCEEDED",
  "submit_time": "2025-02-13 17:59:27.754",
  "scheduled_time": "2025-02-13 17:59:27.789",
  "end_time": "2025-02-13 17:59:28.828",
  "results": [
    {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav",
      "transcription_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/prod/paraformer-v2/...",
      "subtask_status": "SUCCEEDED"
    },
    {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_male2.wav",
      "transcription_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/prod/paraformer-v2/...",
      "subtask_status": "SUCCEEDED"
    }
  ],
  "task_metrics": {
    "TOTAL": 2,
    "SUCCEEDED": 2,
    "FAILED": 0
  }
}

code and message appear only on errors.

{
  "task_id": "7bac899c-06ec-4a79-8875-xxxxxxxxxxxx",
  "task_status": "SUCCEEDED",
  "submit_time": "2024-12-16 16:30:59.170",
  "scheduled_time": "2024-12-16 16:30:59.204",
  "end_time": "2024-12-16 16:31:02.375",
  "results": [
    {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/sensevoice/long_audio_demo_cn.mp3",
      "transcription_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/prod/paraformer-v2/20241216/xxxx",
      "subtask_status": "SUCCEEDED"
    },
    {
      "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/sensevoice/rich_text_exaple_1.wav",
      "code": "InvalidFile.DownloadFailed",
      "message": "The audio file cannot be downloaded.",
      "subtask_status": "FAILED"
    }
  ],
  "task_metrics": {
    "TOTAL": 2,
    "SUCCEEDED": 1,
    "FAILED": 1
  }
}

Key parameters:

Parameter	Description
code	Error code.
message	Error message.
task_id	Task ID.
task_status	Task status: `PENDING`, `RUNNING`, `SUCCEEDED`, `FAILED`. If any subtask succeeds, the task is `SUCCEEDED`. Check `subtask_status` for individual results.
results	Subtask results.
subtask_status	Subtask status: `PENDING`, `RUNNING`, `SUCCEEDED`, `FAILED`.
file_url	Audio file URL.
transcription_url	Result URL (JSON file). Download or read via HTTP. See Recognition result.
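
The task states above suggest a simple polling loop: query until `task_status` is `SUCCEEDED` or `FAILED`. The sketch below takes the query as a callable so it can be exercised without the SDK; `poll_until_done`, its default interval, and its attempt limit are illustrative choices, not SDK behavior.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def poll_until_done(
    fetch_output: Callable[[], Dict[str, Any]],
    interval_seconds: float = 2.0,
    max_attempts: int = 150,
) -> Dict[str, Any]:
    """Call fetch_output until the task reaches a terminal status."""
    for attempt in range(max_attempts):
        output = fetch_output()
        if output.get("task_status") in TERMINAL_STATUSES:
            return output
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

With the SDK, `fetch_output` could be something like `lambda: Transcription.fetch(task=task_id).output`, assuming the output object supports dict-style access. The `wait` method already blocks until completion, so a custom loop is mainly useful when you need your own interval or timeout.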

Recognition result

Results are JSON files.

Recognition result example:

{
  "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav",
  "properties": {
    "audio_format": "pcm_s16le",
    "channels": [
      0
    ],
    "original_sampling_rate": 16000,
    "original_duration_in_milliseconds": 3834
  },
  "transcripts": [
    {
      "channel_id": 0,
      "content_duration_in_milliseconds": 3720,
      "text": "Hello world, this is Alibaba Speech Lab.",
      "sentences": [
        {
          "begin_time": 100,
          "end_time": 3820,
          "text": "Hello world, this is Alibaba Speech Lab.",
          "sentence_id": 1,
          "speaker_id": 0,
          "words": [
            {
              "begin_time": 100,
              "end_time": 596,
              "text": "Hello ",
              "punctuation": ""
            },
            {
              "begin_time": 596,
              "end_time": 844,
              "text": "world",
              "punctuation": ", "
            }
          ]
        }
      ]
    }
  ]
}

speaker_id appears only when speaker diarization is enabled.

Key parameters:

Parameter	Type	Description
audio_format	string	Audio format.
channels	array[integer]	Track indexes. `[0]` = single-track, `[0, 1]` = dual-track.
original_sampling_rate	integer	Sample rate (Hz).
original_duration_in_milliseconds	integer	Audio duration (ms).
channel_id	integer	Track index (0-based).
content_duration_in_milliseconds	integer	Speech duration (ms). Only speech is transcribed and billed. Non-speech is excluded. Speech duration is usually shorter than audio duration.
transcripts	array	Per-track results. The `text` field of each entry holds the paragraph-level text.
sentences	array	Sentence-level results.
words	array	Word-level results.
begin_time	integer	Start time (ms).
end_time	integer	End time (ms).
text	string	Transcription text.
speaker_id	integer	Speaker index (0-based). Only present when diarization is enabled.
punctuation	string	Predicted punctuation after the word.
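
To show how these fields fit together, the sketch below turns a downloaded result (already parsed into a dict) into one readable line per sentence. The helper names `format_timestamp` and `sentence_lines` and the output layout are hypothetical.

```python
from typing import Any, Dict, List


def format_timestamp(milliseconds: int) -> str:
    """Render a millisecond offset as MM:SS.mmm."""
    minutes, remainder = divmod(milliseconds, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def sentence_lines(result: Dict[str, Any]) -> List[str]:
    """One line per sentence: track, time range, optional speaker, text."""
    lines: List[str] = []
    for transcript in result.get("transcripts", []):
        for sentence in transcript.get("sentences", []):
            # speaker_id is absent unless speaker diarization is enabled.
            speaker = sentence.get("speaker_id")
            speaker_label = f" speaker {speaker}" if speaker is not None else ""
            lines.append(
                f"[track {transcript['channel_id']} "
                f"{format_timestamp(sentence['begin_time'])}-"
                f"{format_timestamp(sentence['end_time'])}{speaker_label}] "
                f"{sentence['text']}"
            )
    return lines
```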

Transcription class

Import with `from dashscope.audio.asr import Transcription`.

Method	Signature	Description
async_call	`@classmethod def async_call(cls, model: str, file_urls: List[str], phrase_id: str = None, api_key: str = None, workspace: str = None, **kwargs) -> TranscriptionResponse`	Submit a recognition task.
wait	`@classmethod def wait(cls, task: Union[str, TranscriptionResponse], api_key: str = None, workspace: str = None, **kwargs) -> TranscriptionResponse`	Block until done (`SUCCEEDED` or `FAILED`). Returns a TranscriptionResponse.
fetch	`@classmethod def fetch(cls, task: Union[str, TranscriptionResponse], api_key: str = None, workspace: str = None, **kwargs) -> TranscriptionResponse`	Query task status. Returns a TranscriptionResponse.
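
Putting the three methods in context, a typical flow is to submit with `async_call`, block with `wait`, and then download each `transcription_url`. The sketch below assumes the DashScope SDK is installed and an API key is configured (for example through the `DASHSCOPE_API_KEY` environment variable), and that `output` supports dict-style access. `download_json` and `transcribe` are hypothetical helper names.

```python
import json
from typing import Any, Dict, List
from urllib import request


def download_json(url: str, timeout: float = 30.0) -> Dict[str, Any]:
    """Download and decode one recognition result file."""
    with request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def transcribe(model: str, file_urls: List[str]) -> List[str]:
    """Submit a task, wait for it, and return the text of each succeeded file."""
    # Imported lazily so download_json stays usable without the SDK installed.
    from dashscope.audio.asr import Transcription

    task = Transcription.async_call(model=model, file_urls=file_urls)
    done = Transcription.wait(task=task.output.task_id)
    texts: List[str] = []
    for item in done.output.get("results") or []:
        if item["subtask_status"] != "SUCCEEDED":
            # Failed subtasks carry code and message instead of a result URL.
            print(f"skipped {item['file_url']}: {item.get('code')} {item.get('message')}")
            continue
        result = download_json(item["transcription_url"])
        texts.extend(t["text"] for t in result["transcripts"])
    return texts
```

A call such as `transcribe("fun-asr", ["https://your-domain.com/file.mp3"])` would then return one string per recognized track.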
