Fun-ASR WebSocket API

WebSocket connection, headers, and interaction flow for Fun-ASR real-time speech recognition

Connect to the Fun-ASR real-time speech recognition service over WebSocket. This topic covers the service endpoint, request headers, and interaction flow.

User guide: For model details and selection guidance, see Speech-to-text models. For sample code, see Realtime speech recognition.

The DashScope SDK supports Java and Python only. For other languages, connect to the service directly over WebSocket.

Getting started

Sample code

  • Node.js
Install dependencies:
npm install ws
npm install uuid
Sample code:
const fs = require('fs');
const WebSocket = require('ws');
const { v4: uuidv4 } = require('uuid'); // Used to generate a UUID

// If you have not configured environment variables, replace the following line with your API key: const apiKey = "sk-xxx"
const apiKey = process.env.DASHSCOPE_API_KEY;
const url = 'wss://dashscope-intl.aliyuncs.com/api-ws/v1/inference/'; // WebSocket server address
const audioFile = 'asr_example.wav'; // Replace with the path to your audio file

// Generate a 32-character task ID (a UUID with the hyphens removed)
const TASK_ID = uuidv4().replace(/-/g, '');

// Create a WebSocket client
const ws = new WebSocket(url, {
  headers: {
    Authorization: `Bearer ${apiKey}`
  }
});

let taskStarted = false; // A flag that indicates whether the task has started

// Send the run-task instruction when the connection is opened
ws.on('open', () => {
  console.log('Connected to the server');
  sendRunTask();
});

// Process received messages
ws.on('message', (data) => {
  const message = JSON.parse(data);
  switch (message.header.event) {
    case 'task-started':
      console.log('The task has started');
      taskStarted = true;
      sendAudioStream();
      break;
    case 'result-generated':
      console.log('Recognition result:', message.payload.output.sentence.text);
      if (message.payload.usage) {
        console.log('Billable duration of the task (in seconds):', message.payload.usage.duration);
      }
      break;
    case 'task-finished':
      console.log('The task is complete');
      ws.close();
      break;
    case 'task-failed':
      console.error('The task failed:', message.header.error_message);
      ws.close();
      break;
    default:
      console.log('Unknown event:', message.header.event);
  }
});

// Report the case where the connection closes before the task-started event arrives
ws.on('close', () => {
  if (!taskStarted) {
    console.error('The connection closed before the task started.');
  }
});

// Send the run-task instruction
function sendRunTask() {
  const runTaskMessage = {
    header: {
      action: 'run-task',
      task_id: TASK_ID,
      streaming: 'duplex'
    },
    payload: {
      task_group: 'audio',
      task: 'asr',
      function: 'recognition',
      model: 'fun-asr-realtime',
      parameters: {
        sample_rate: 16000,
        format: 'wav'
      },
      input: {}
    }
  };
  ws.send(JSON.stringify(runTaskMessage));
}

// Send the audio stream
function sendAudioStream() {
  const audioStream = fs.createReadStream(audioFile);

  // Read whatever is buffered, send it as one binary frame, then try again after 100 ms
  function sendNextChunk() {
    const chunk = audioStream.read();
    if (chunk) {
      ws.send(chunk);
      setTimeout(sendNextChunk, 100); // Send a chunk every 100 ms
    }
  }

  audioStream.on('readable', () => {
    sendNextChunk();
  });

  audioStream.on('end', () => {
    console.log('The audio stream has ended');
    sendFinishTask();
  });

  audioStream.on('error', (err) => {
    console.error('Error reading the audio file:', err);
    ws.close();
  });
}

// Send the finish-task instruction
function sendFinishTask() {
  const finishTaskMessage = {
    header: {
      action: 'finish-task',
      task_id: TASK_ID,
      streaming: 'duplex'
    },
    payload: {
      input: {}
    }
  };
  ws.send(JSON.stringify(finishTaskMessage));
}

// Handle errors
ws.on('error', (error) => {
  console.error('WebSocket error:', error);
});
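
The message handler above reads fields directly from each server event, so a malformed or unexpected message would throw. The following is a minimal sketch of a more defensive parser. It assumes only the fields that the sample already uses (header.event, header.error_message, payload.output.sentence.text, and payload.usage.duration); the function name summarizeEvent and the shape of its return value are hypothetical and not part of the service API.

```javascript
// Hypothetical helper: reduce a raw server message to a small summary object.
// Only fields that appear in the sample code above are read.
function summarizeEvent(raw) {
  let message;
  try {
    // ws delivers text frames as a Buffer or string; toString() handles both
    message = JSON.parse(raw.toString());
  } catch (err) {
    return { event: 'invalid-json', detail: err.message };
  }

  const event = message?.header?.event ?? 'unknown';
  switch (event) {
    case 'result-generated':
      return {
        event,
        text: message.payload?.output?.sentence?.text ?? '',
        // The sample guards on payload.usage, so treat it as optional here too
        duration: message.payload?.usage?.duration ?? null,
      };
    case 'task-failed':
      return { event, detail: message.header.error_message ?? '' };
    default:
      return { event };
  }
}
```

Inside ws.on('message', ...), the switch statement could then branch on summarizeEvent(data).event instead of indexing into the parsed message directly.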

Service endpoint

Use the following WebSocket URL:
wss://dashscope-intl.aliyuncs.com/api-ws/v1/inference
The URL must use the wss:// protocol. Provide your API key in the Authorization request header (see Request headers).

Request headers

Include the following headers in your request:
Parameter | Type | Required | Description
Authorization | string | Yes | Authentication token in the format Bearer $DASHSCOPE_API_KEY. Replace $DASHSCOPE_API_KEY with your API key.
user-agent | string | No | Client identifier that the server uses to trace the request origin.
X-DashScope-WorkSpace | string | No | Qwen Cloud workspace ID.
X-DashScope-DataInspection | string | No | Whether to enable data inspection. Omit this header by default; set it to enable only when required.
The Authorization header is validated during the WebSocket handshake. If the API key is invalid or missing, the handshake fails with HTTP 401 or 403.
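
As an illustration of the headers described in this section, the following sketch builds the handshake headers in one place and fails early when the API key is missing, rather than waiting for the server to reject the handshake. The function name buildHeaders and its option names (apiKey, workspaceId, userAgent) are hypothetical; only the header names come from this topic.

```javascript
// Hypothetical helper: assemble the handshake headers listed in this section.
function buildHeaders({ apiKey, workspaceId, userAgent } = {}) {
  if (!apiKey) {
    // Without a key the handshake would be rejected, so fail before connecting
    throw new Error('An API key is required for the Authorization header');
  }
  const headers = { Authorization: `Bearer ${apiKey}` };
  // Optional headers are added only when a value is supplied
  if (userAgent) headers['user-agent'] = userAgent;
  if (workspaceId) headers['X-DashScope-WorkSpace'] = workspaceId;
  return headers;
}

// Usage with the ws package from the sample code (not executed here):
//   const ws = new WebSocket(url, {
//     headers: buildHeaders({ apiKey: process.env.DASHSCOPE_API_KEY })
//   });
```

With the ws package, a rejected handshake surfaces through the 'error' event, or through the 'unexpected-response' event if a listener is registered for it, which is where the HTTP status code can be inspected.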

Interaction flow

For detailed descriptions of client events and server events, see Client events and Server events. The client and server exchange messages in the following sequence:
  1. Establish a connection: The client opens a WebSocket connection to the server.
  2. Start the task: The client sends a run-task directive and waits for the server's task-started event, which indicates that the task is ready and the client can proceed.
  3. Send the audio stream: The client streams binary audio (mono only) while continuously receiving result-generated events from the server. Each event contains speech recognition results.
  4. Signal task completion: The client sends a finish-task directive to instruct the server to end the task, and continues to receive result-generated events.
  5. End the task: The client receives a task-finished event from the server, which signals that the task has ended.
  6. Close the connection: The client closes the WebSocket connection.
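
The sequence above can be sketched as a small state table that a client might use to check that messages are sent and received in the expected order. The state names (connected, starting, streaming, finishing, finished, failed) and the send:/recv: step prefixes are assumptions made for this illustration; only the directive and event names come from this topic.

```javascript
// Hypothetical state table for one task on an open connection.
// Keys are client states; each maps an allowed step to the next state.
const TRANSITIONS = {
  connected: { 'send:run-task': 'starting' },
  starting: { 'recv:task-started': 'streaming', 'recv:task-failed': 'failed' },
  streaming: {
    'recv:result-generated': 'streaming',
    'send:finish-task': 'finishing',
    'recv:task-failed': 'failed',
  },
  finishing: {
    'recv:result-generated': 'finishing', // results may still arrive after finish-task
    'recv:task-finished': 'finished',
    'recv:task-failed': 'failed',
  },
};

// Return the next state, or throw if the step is not allowed in the current state.
function advance(state, step) {
  const next = TRANSITIONS[state]?.[step];
  if (!next) {
    throw new Error(`Unexpected step "${step}" in state "${state}"`);
  }
  return next;
}

// Walk through a complete, successful task.
let state = 'connected';
for (const step of [
  'send:run-task',
  'recv:task-started',
  'recv:result-generated',
  'send:finish-task',
  'recv:result-generated',
  'recv:task-finished',
]) {
  state = advance(state, step);
}
console.log(state); // prints "finished"
```

Audio frames are not modeled as steps here; in this sketch they are simply allowed whenever the state is streaming.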

Connection reuse

You can reuse a WebSocket connection across tasks. After the server returns a task-finished event, send another run-task directive on the same connection.
  1. Each task on a reused connection must have a unique task_id.
  2. A failed task triggers a task-failed event and closes the connection, so the connection cannot be reused.
  3. Connections time out after 60 seconds of inactivity.
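
The reuse rules above can be sketched as follows: generate a fresh task_id for every run-task directive, and start a new task on the same connection only if the previous one ended with task-finished and the connection has not been idle for too long. The helper names (newTaskId, buildRunTask, canReuse) are hypothetical, and how the server measures inactivity is an assumption here; the sketch simply compares against the time of the last message sent or received.

```javascript
const { randomUUID } = require('crypto');

// Idle limit stated in this section
const IDLE_TIMEOUT_MS = 60 * 1000;

// Hypothetical helper: a new 32-character ID for each task
function newTaskId() {
  return randomUUID().replace(/-/g, '');
}

// Hypothetical helper: the same run-task message as the sample code,
// but parameterized by task ID so each task gets its own
function buildRunTask(taskId) {
  return {
    header: { action: 'run-task', task_id: taskId, streaming: 'duplex' },
    payload: {
      task_group: 'audio',
      task: 'asr',
      function: 'recognition',
      model: 'fun-asr-realtime',
      parameters: { sample_rate: 16000, format: 'wav' },
      input: {},
    },
  };
}

// Hypothetical check: reuse only after task-finished and within the idle limit.
// lastActivityMs is the client's own timestamp of the last message (an assumption).
function canReuse(lastServerEvent, lastActivityMs, nowMs = Date.now()) {
  const idleMs = nowMs - lastActivityMs;
  return lastServerEvent === 'task-finished' && idleMs < IDLE_TIMEOUT_MS;
}
```

When canReuse returns false, the simplest approach is to open a new connection and send the next run-task directive there.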