OpenAI compatible ASR

Audio via Chat API

POST /compatible-mode/v1/chat/completions

from openai import OpenAI
import os

try:
  client = OpenAI(
    # If you have not configured environment variables, replace the following line with your API key: api_key = "sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
  )
  

  stream_enabled = False  # Whether to enable streaming output
  completion = client.chat.completions.create(
    model="qwen3-asr-flash",
    messages=[
      {
        "content": [
          {
            "type": "input_audio",
            "input_audio": {
              "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
            }
          }
        ],
        "role": "user"
      }
    ],
    stream=stream_enabled,
    # When stream is set to False, the stream_options parameter cannot be set
    # stream_options={"include_usage": True},
    extra_body={
      "asr_options": {
        # "language": "zh",
        "enable_itn": False
      }
    }
  )
  if stream_enabled:
    full_content = ""
    print("Streaming output content is:")
    for chunk in completion:
      # If stream_options.include_usage is True, the choices field of the last chunk is an empty list and should be skipped (you can get token usage via chunk.usage)
      print(chunk)
      if chunk.choices and chunk.choices[0].delta.content:
        full_content += chunk.choices[0].delta.content
    print(f"Full content is: {full_content}")
  else:
    print(f"Non-streaming output content is: {completion.choices[0].message.content}")
except Exception as e:
  print(f"Error message: {e}")
{
  "id": "chatcmpl-487abe5f-d4f2-9363-a877-xxxxxxx",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "annotations": [
          {
            "emotion": "neutral",
            "language": "zh",
            "type": "audio_info"
          }
        ],
        "content": "Welcome to Qwen Cloud.",
        "role": "assistant"
      }
    }
  ],
  "created": 1767683986,
  "model": "qwen3-asr-flash",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 12,
    "completion_tokens_details": {
      "text_tokens": 12
    },
    "prompt_tokens": 42,
    "prompt_tokens_details": {
      "audio_tokens": 42,
      "text_tokens": 0
    },
    "seconds": 1,
    "total_tokens": 54
  }
}
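
The sample response above can be read like any other chat completion. The following is a minimal sketch, assuming the response has already been converted to a plain dict (for example with the SDK object's model_dump() method); the helper name summarize_transcription is hypothetical and not part of any SDK.

```python
# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "annotations": [
                    {"emotion": "neutral", "language": "zh", "type": "audio_info"}
                ],
                "content": "Welcome to Qwen Cloud.",
                "role": "assistant",
            },
        }
    ],
    "model": "qwen3-asr-flash",
    "object": "chat.completion",
}


def summarize_transcription(response: dict) -> dict:
    """Return the transcript plus the first audio_info annotation, if present.

    Hypothetical helper: it only relies on the fields visible in the sample.
    """
    message = response["choices"][0]["message"]
    annotations = message.get("annotations") or []
    # Pick the first annotation tagged as audio_info; fall back to an empty dict.
    audio_info = next(
        (a for a in annotations if a.get("type") == "audio_info"), {}
    )
    return {
        "text": message["content"],
        "language": audio_info.get("language"),
        "emotion": audio_info.get("emotion"),
    }


print(summarize_transcription(SAMPLE_RESPONSE))
```

Missing annotations yield None for language and emotion rather than an exception, which keeps the helper usable if a response omits them.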

Connection methods

Choose the method that matches your model.
Model                        Connection method
Qwen3-ASR-Flash-Filetrans    DashScope asynchronous only
Qwen3-ASR-Flash              OpenAI compatible and DashScope synchronous

Supported audio formats

Qwen3-ASR-Flash accepts Base64-encoded audio or publicly accessible URLs.

Base64-encoded audio input

Use the Data URL format: data:<mediatype>;base64,<data>.
  • <mediatype>: The MIME type, for example audio/wav for WAV or audio/mpeg for MP3.
  • <data>: The Base64-encoded string. Base64 encoding increases the size by about one third, so make sure the encoded audio stays within the 10 MB limit.
Example: data:audio/mpeg;base64,SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjU4LjI5LjEwMAAAAAAAAAAAAAAA//PAxABQ/BXRbMPe4IQAhl9
Python:
import base64, pathlib

# Replace "input.mp3" with your audio file path
file_path = pathlib.Path("input.mp3")
base64_str = base64.b64encode(file_path.read_bytes()).decode()
data_uri = f"data:audio/mpeg;base64,{base64_str}"
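
The snippet above hard-codes the MIME type and does not check the size limit. A slightly more defensive sketch follows. It assumes that "10 MB" means 10 * 1024 * 1024 bytes and that the limit applies to the Base64 text; neither detail is confirmed by this page. The names MIME_BY_SUFFIX, encoded_size, and to_data_uri are hypothetical.

```python
import base64
import pathlib

# MIME types named in the text above; other formats are out of scope here.
MIME_BY_SUFFIX = {".wav": "audio/wav", ".mp3": "audio/mpeg"}

# Assumption: the 10 MB limit is 10 * 1024 * 1024 bytes of Base64 text.
MAX_ENCODED_BYTES = 10 * 1024 * 1024


def encoded_size(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def to_data_uri(path) -> str:
    """Build a data:<mediatype>;base64,<data> string for a local audio file."""
    file_path = pathlib.Path(path)
    suffix = file_path.suffix.lower()
    if suffix not in MIME_BY_SUFFIX:
        raise ValueError(f"unsupported extension: {suffix!r}")
    # Check the encoded size before reading the whole file into memory.
    size = encoded_size(file_path.stat().st_size)
    if size > MAX_ENCODED_BYTES:
        raise ValueError(f"encoded audio would be {size} bytes, over the limit")
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return f"data:{MIME_BY_SUFFIX[suffix]};base64,{encoded}"
```

Under these assumptions, the largest raw file that still fits is roughly 7.5 MB, because every 3 input bytes become 4 output characters.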

Note: asr_options is not a standard OpenAI parameter. With the OpenAI SDK, pass it via extra_body.

Authorizations

Authorization (string, header, required)

DashScope API key. Get your API key from the Qwen Cloud console.

Body

application/json
model (string, required)

The model name. Only Qwen3-ASR-Flash (qwen3-asr-flash) works with this endpoint.

messages (object[], required)

The list of messages.

asr_options (object)

Recognition options, such as language and enable_itn. This is not a standard OpenAI parameter; pass it through extra_body when using an OpenAI SDK.
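
To illustrate how asr_options travels through extra_body, here is a minimal sketch that only assembles the keyword arguments and sends nothing. The helper name build_asr_request is hypothetical; the two options it sets, language and enable_itn, are the ones that appear in the request example on this page.

```python
from typing import Optional


def build_asr_request(
    audio: str,
    language: Optional[str] = None,
    enable_itn: bool = False,
    model: str = "qwen3-asr-flash",
) -> dict:
    """Assemble keyword arguments for client.chat.completions.create.

    audio is either a public URL or a data:<mediatype>;base64,<data> string.
    """
    asr_options = {"enable_itn": enable_itn}
    # Only include language when the caller specified one.
    if language is not None:
        asr_options["language"] = language
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "input_audio", "input_audio": {"data": audio}}
                ],
            }
        ],
        # asr_options is not a standard parameter, so it rides in extra_body.
        "extra_body": {"asr_options": asr_options},
    }


kwargs = build_asr_request(
    "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3",
    language="zh",
)
print(kwargs["extra_body"])
```

With a configured client, the call would then be client.chat.completions.create(**kwargs).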

stream (boolean, default: false)

Specifies whether to use streaming output. We recommend setting this to true to improve responsiveness and reduce the risk of timeouts.

stream_options (object)

Configuration for streaming output. Takes effect only when stream is true.

Response

200 - application/json

id (string)

The unique identifier for this call.

Example: chatcmpl-487abe5f-d4f2-9363-a877-xxxxxxx

choices (object[])

The output information of the model.

[
  {
    "finish_reason": "stop",
    "index": 0,
    "message": {
      "annotations": [
        {
          "emotion": "neutral",
          "language": "zh",
          "type": "audio_info"
        }
      ],
      "content": "Welcome to Qwen Cloud.",
      "role": "assistant"
    }
  }
]

created (integer)

The UNIX timestamp (in seconds) when the request was created.

Example: 1767683986

model (string)

The model used for this request.

Example: qwen3-asr-flash

object (string)

Always chat.completion.

Example: chat.completion

usage (object)

Token consumption information.
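
As a worked example, the usage object from the sample response can be checked for internal consistency. In that sample, prompt_tokens plus completion_tokens equals total_tokens, and the prompt detail fields sum to prompt_tokens. These relationships are read off the sample only and are not a documented guarantee. The helper name check_usage is hypothetical.

```python
# Usage object copied from the sample response on this page.
SAMPLE_USAGE = {
    "completion_tokens": 12,
    "completion_tokens_details": {"text_tokens": 12},
    "prompt_tokens": 42,
    "prompt_tokens_details": {"audio_tokens": 42, "text_tokens": 0},
    "seconds": 1,
    "total_tokens": 54,
}


def check_usage(usage: dict) -> dict:
    """Report whether the token counts in a usage object add up."""
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    details = usage.get("prompt_tokens_details", {})
    detail_sum = details.get("audio_tokens", 0) + details.get("text_tokens", 0)
    return {
        "total_matches": prompt + completion == usage["total_tokens"],
        "prompt_matches": detail_sum == prompt,
        # Passed through unchanged; this page does not define the field.
        "seconds": usage.get("seconds"),
    }


print(check_usage(SAMPLE_USAGE))
```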