Qwen-Omni-Realtime

Qwen-Omni client events

WebSocket client reference

Events sent from the client to the server over WebSocket.
For non-realtime usage, Qwen-Omni is available through the Chat API.

session.update

Send this event after connecting to update the session configuration. The service validates your parameters and returns the full configuration or an error.
Example
{
  "event_id": "event_ToPZqeobitzUJnt3QqtWg",
  "type": "session.update",
  "session": {
    "modalities": ["text", "audio"],
    "voice": "Chelsie",
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm24",
    "instructions": "You are an AI customer service agent for a five-star hotel. Please answer customer inquiries about room types, facilities, prices, and reservation policies accurately and in a friendly manner. Always respond with a professional and helpful attitude. Do not provide unconfirmed information or information beyond the scope of the hotel's services.",
    "turn_detection": {
      "type": "server_vad",
      "threshold": 0.5,
      "silence_duration_ms": 800
    },
    "seed": 1314,
    "max_tokens": 16384,
    "repetition_penalty": 1.05,
    "presence_penalty": 0.0,
    "top_k": 50,
    "top_p": 1.0,
    "temperature": 0.9
  }
}
  • type (string, body, required): Event type. Always session.update.
  • session (object, body): Session configuration.
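
As an illustration of how a client might assemble this event, here is a minimal Python sketch. The helper names make_event and session_update are hypothetical, and the event_id scheme (a random hex suffix) is an assumption; the field names and values are copied from the example above. The function only builds the JSON text, so sending it over the WebSocket is left to whichever client library you use.

```python
import json
import uuid


def make_event(event_type: str, **fields) -> str:
    """Serialize a client event to JSON text for a WebSocket send."""
    # Assumption: event_id can be any unique string chosen by the client.
    event = {"event_id": f"event_{uuid.uuid4().hex}", "type": event_type}
    event.update(fields)
    return json.dumps(event)


def session_update(instructions: str, voice: str = "Chelsie") -> str:
    """Build a session.update event using values from the example above."""
    session = {
        "modalities": ["text", "audio"],
        "voice": voice,
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm24",
        "instructions": instructions,
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "silence_duration_ms": 800,
        },
    }
    return make_event("session.update", session=session)


if __name__ == "__main__":
    print(session_update("Answer hotel guests politely and briefly."))
```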

response.create

Tells the service to generate a model response. In VAD mode, responses are automatic and you do not need this event. The service responds with response.created, then item and content events (conversation.item.created, response.content_part.added), and finally response.done.
Example
{
  "type": "response.create",
  "event_id": "event_1718624400000"
}
  • type (string, body, required): Event type. Always response.create.
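
The event sequence described above can be followed on the client with a small loop. The sketch below is a hedged Python example: collect_response is a hypothetical helper, and it assumes each server message is JSON text carrying a "type" field, as the client events do. It gathers everything from response.created through response.done and ignores messages that arrive before the response starts.

```python
import json
from typing import Iterable, List


def collect_response(messages: Iterable[str]) -> List[dict]:
    """Collect the server events of one response, stopping at response.done."""
    events: List[dict] = []
    started = False
    for raw in messages:
        event = json.loads(raw)
        event_type = event.get("type")
        if event_type == "response.created":
            started = True
        if started:
            events.append(event)
            if event_type == "response.done":
                break
    return events


if __name__ == "__main__":
    # Simulated server messages; real payloads carry more fields than "type".
    simulated = [
        json.dumps({"type": t})
        for t in (
            "response.created",
            "conversation.item.created",
            "response.content_part.added",
            "response.done",
        )
    ]
    for event in collect_response(simulated):
        print(event["type"])
```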

response.cancel

Cancels an ongoing response. Returns an error if no response is in progress.
Example
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "response.cancel"
}
  • type (string, body, required): Event type. Always response.cancel.
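
Because cancelling with no response in progress returns an error, a client may want to track response state locally before sending this event. The following Python sketch shows one possible approach. ResponseTracker is a hypothetical class, and it assumes that response.created and response.done server messages reliably mark the start and end of a response.

```python
import json
from typing import Optional


class ResponseTracker:
    """Tracks whether a response is in progress, based on server events."""

    def __init__(self) -> None:
        self.in_progress = False

    def on_server_event(self, raw: str) -> None:
        """Update state from one server message (JSON text)."""
        event_type = json.loads(raw).get("type")
        if event_type == "response.created":
            self.in_progress = True
        elif event_type == "response.done":
            self.in_progress = False

    def cancel_event(self, event_id: str) -> Optional[str]:
        """Return a response.cancel event, or None if nothing is running."""
        if not self.in_progress:
            return None
        return json.dumps({"event_id": event_id, "type": "response.cancel"})


if __name__ == "__main__":
    tracker = ResponseTracker()
    tracker.on_server_event(json.dumps({"type": "response.created"}))
    print(tracker.cancel_event("event_cancel_1"))
```

A local check like this can still race with the server, for example if response.done is already in flight, so the client should also be prepared to handle the error.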

input_audio_buffer.append

Appends audio bytes to the input buffer.
Example
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "input_audio_buffer.append",
  "audio": "UklGR..."
}
  • type (string, body, required): Event type. Always input_audio_buffer.append.
  • audio (string, body, required): The Base64-encoded audio data.
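
A streaming client typically splits captured audio into chunks and sends one append event per chunk. The Python sketch below illustrates the encoding step only. audio_append_events is a hypothetical helper, the 3200-byte chunk size is an arbitrary choice rather than a documented limit, and the event_id scheme is an assumption.

```python
import base64
import json
from typing import Iterator


def audio_append_events(pcm: bytes, chunk_bytes: int = 3200) -> Iterator[str]:
    """Yield one input_audio_buffer.append event per chunk of raw audio."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    for offset in range(0, len(pcm), chunk_bytes):
        chunk = pcm[offset:offset + chunk_bytes]
        yield json.dumps({
            # Assumption: any unique string is acceptable as event_id.
            "event_id": f"event_audio_{offset // chunk_bytes}",
            "type": "input_audio_buffer.append",
            # The audio field carries Base64 text, not raw bytes.
            "audio": base64.b64encode(chunk).decode("ascii"),
        })


if __name__ == "__main__":
    silence = bytes(8000)  # placeholder audio: 8000 zero bytes
    for message in audio_append_events(silence):
        print(len(message))
```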

input_audio_buffer.commit

Submits the input audio buffer as a user message. Returns an error if the buffer is empty.
  • VAD mode: Automatic. You do not need this event.
  • Manual mode: Required to create a user message.
Submitting the buffer does not trigger a model response. The service responds with input_audio_buffer.committed.
If you have sent an input_image_buffer.append event, input_audio_buffer.commit submits the image buffer along with the audio buffer.
Example
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "input_audio_buffer.commit"
}
  • type (string, body, required): Event type. Always input_audio_buffer.commit.
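
Since committing the buffer does not itself trigger a response, a manual-mode turn consists of three steps: append audio, commit, then request a response with response.create. The Python sketch below builds that sequence. manual_turn is a hypothetical helper, the event_id values are placeholders, and a real client would usually send the audio in several append events rather than one.

```python
import base64
import json
from typing import List


def manual_turn(pcm: bytes) -> List[str]:
    """Build the client events for one manual-mode turn, in send order."""
    if not pcm:
        # Committing an empty buffer returns an error, so fail early.
        raise ValueError("audio is empty; nothing to commit")
    events = [
        {
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(pcm).decode("ascii"),
        },
        {"type": "input_audio_buffer.commit"},  # creates the user message
        {"type": "response.create"},            # asks the model to respond
    ]
    return [
        json.dumps({"event_id": f"event_turn_{i}", **event})
        for i, event in enumerate(events)
    ]


if __name__ == "__main__":
    for message in manual_turn(bytes(3200)):
        print(json.loads(message)["type"])
```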

input_audio_buffer.clear

Clears the audio buffer. The service responds with input_audio_buffer.cleared.
Example
{
  "event_id": "event_xxx",
  "type": "input_audio_buffer.clear"
}
  • type (string, body, required): Event type. Always input_audio_buffer.clear.

input_image_buffer.append

Adds image data to the image buffer. Images can come from local files or be captured from a video stream. The following limits apply:
  • Format: JPG or JPEG. Recommended: 480p or 720p. Maximum: 1080p.
  • Size: ≤500 KB before Base64 encoding.
  • Encoding: Base64.
  • Frequency: 1 image per second.
  • Prerequisite: Send at least one input_audio_buffer.append event first.
The image buffer is submitted with the audio buffer through the input_audio_buffer.commit event.
Example
{
  "event_id": "event_xxx",
  "type": "input_image_buffer.append",
  "image": "xxx"
}
  • type (string, body, required): Event type. Always input_image_buffer.append.
  • image (string, body, required): The Base64-encoded image data.
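
Some of the limits above can be checked on the client before sending. The Python sketch below is one possible pre-flight check. image_append_event is a hypothetical helper, and treating 500 KB as 500 × 1024 bytes is an assumption. It verifies the JPEG start-of-image marker and the pre-encoding size, then builds the event. It does not enforce the resolution limit, the one-image-per-second rate, or the requirement that audio be appended first.

```python
import base64
import json

# Assumption: "500 KB" is interpreted as 500 * 1024 bytes.
MAX_IMAGE_BYTES = 500 * 1024


def image_append_event(jpeg: bytes, event_id: str) -> str:
    """Validate a JPEG frame and build an input_image_buffer.append event."""
    # Every JPEG file starts with the SOI marker FF D8.
    if not jpeg.startswith(b"\xff\xd8"):
        raise ValueError("image does not look like a JPEG")
    # The size limit applies to the raw bytes, before Base64 encoding.
    if len(jpeg) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds 500 KB before Base64 encoding")
    return json.dumps({
        "event_id": event_id,
        "type": "input_image_buffer.append",
        "image": base64.b64encode(jpeg).decode("ascii"),
    })


if __name__ == "__main__":
    fake_frame = b"\xff\xd8" + bytes(64)  # placeholder bytes, not a real image
    print(image_append_event(fake_frame, "event_image_0"))
```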