WebSocket client reference
Events sent from the client to the server over WebSocket.
For non-realtime usage, Qwen-Omni is available through the Chat API.
Reference: Real-time multimodal.
session.update
Send this event after connecting to update the session configuration. The service validates your parameters and returns the full configuration or an error.
Example
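As a rough illustration, the event could be serialized as shown below. This is a minimal sketch under stated assumptions: `build_session_update` is a hypothetical helper, the `session` field name is inferred from the parameter description, and the `modalities` key is an illustrative placeholder rather than an option confirmed on this page.

```python
import json


def build_session_update(session: dict) -> str:
    """Serialize a session.update client event as a JSON text frame."""
    # "session" as the name of the configuration field is an assumption.
    return json.dumps({"type": "session.update", "session": session})


# "modalities" is an illustrative placeholder, not a documented option here.
message = build_session_update({"modalities": ["text", "audio"]})
print(message)
```

The resulting string would be sent as a text frame on the already open WebSocket connection; the connection code is omitted because it depends on the client library in use.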
string, body, required
Event type. Always session.update.
object, body
Session configuration.
response.create
Tells the service to generate a model response. In VAD mode, responses are automatic and you do not need this event.
The service responds with response.created, then item and content events (conversation.item.created, response.content_part.added), and finally response.done.
Example
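The event sequence described above can be sketched as follows. This is only an illustration: `collect_until_done` is a hypothetical helper, and the server messages are simulated locally so the block runs without a connection. Real server events carry more fields than `type`.

```python
import json
from typing import Iterable, List

# The client event itself needs only the type field.
RESPONSE_CREATE = json.dumps({"type": "response.create"})


def collect_until_done(raw_messages: Iterable[str]) -> List[str]:
    """Return the server event types seen up to and including response.done."""
    seen: List[str] = []
    for raw in raw_messages:
        event_type = json.loads(raw).get("type", "")
        seen.append(event_type)
        if event_type == "response.done":
            break  # the response is complete; stop reading for this turn
    return seen


# Simulated server messages, in the order described above.
simulated = [
    json.dumps({"type": t})
    for t in (
        "response.created",
        "conversation.item.created",
        "response.content_part.added",
        "response.done",
        "some.later.event",  # hypothetical event belonging to a later turn
    )
]
order = collect_until_done(simulated)
print(order)
```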
string, body, required
Event type. Always response.create.
response.cancel
Cancels an ongoing response. Returns an error if no response is in progress.
Example
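Because the service returns an error when nothing is in progress, a client may want to track response state locally before sending this event. The sketch below is one possible approach, not part of the service API: `ResponseTracker` is a hypothetical class, and it assumes that `response.created` and `response.done` mark the start and end of a response.

```python
import json
from typing import Optional


class ResponseTracker:
    """Tracks whether a response is in progress, based on server events."""

    def __init__(self) -> None:
        self.in_progress = False

    def on_server_event(self, raw: str) -> None:
        event_type = json.loads(raw).get("type")
        if event_type == "response.created":
            self.in_progress = True
        elif event_type == "response.done":
            self.in_progress = False

    def cancel_message(self) -> Optional[str]:
        """Return a response.cancel event, or None if nothing is running."""
        if not self.in_progress:
            return None  # sending now would only produce an error
        return json.dumps({"type": "response.cancel"})


tracker = ResponseTracker()
before = tracker.cancel_message()
tracker.on_server_event(json.dumps({"type": "response.created"}))
during = tracker.cancel_message()
print(before, during)
```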
string, body, required
Event type. Always response.cancel.
input_audio_buffer.append
Appends audio bytes to the input buffer.
Example
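A minimal sketch of splitting raw audio into chunks and wrapping each chunk in this event. The `audio` field name is an assumption inferred from the parameter description, `audio_append_events` is a hypothetical helper, and the chunk size of 3200 bytes is an arbitrary illustrative value. The required audio format is not stated on this page, so the input here is just placeholder bytes.

```python
import base64
import json
from typing import Iterator


def audio_append_events(audio_bytes: bytes, chunk_size: int = 3200) -> Iterator[str]:
    """Yield one input_audio_buffer.append event per chunk of raw audio."""
    for start in range(0, len(audio_bytes), chunk_size):
        chunk = audio_bytes[start:start + chunk_size]
        yield json.dumps({
            "type": "input_audio_buffer.append",
            # "audio" as the field name is an assumption.
            "audio": base64.b64encode(chunk).decode("ascii"),
        })


# 7000 placeholder bytes split into chunks of 3200, 3200, and 600 bytes.
events = list(audio_append_events(b"\x00" * 7000))
print(len(events))
```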
string, body, required
Event type. Always input_audio_buffer.append.
string, body, required
The Base64-encoded audio data.
input_audio_buffer.commit
Submits the input audio buffer as a user message. Returns an error if the buffer is empty.
- VAD mode: Automatic. You do not need this event.
- Manual mode: Required to create a user message.
The service responds with input_audio_buffer.committed.
If you have sent an input_image_buffer.append event, input_audio_buffer.commit submits the image buffer along with the audio buffer.
Example
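In manual mode, a client can avoid the empty-buffer error by remembering whether it has appended audio since the last commit. The sketch below is a hypothetical client-side guard (`ManualTurn` is not part of the service API), and it assumes that a successful commit leaves the buffer empty, which this page does not state explicitly.

```python
import json


class ManualTurn:
    """Client-side guard for manual mode: commit only after audio was appended."""

    def __init__(self) -> None:
        self.has_audio = False

    def append_sent(self) -> None:
        # Call after each input_audio_buffer.append event is sent.
        self.has_audio = True

    def commit_message(self) -> str:
        if not self.has_audio:
            raise RuntimeError("input audio buffer is empty; nothing to commit")
        # Assumption: the buffer is empty again after a commit.
        self.has_audio = False
        return json.dumps({"type": "input_audio_buffer.commit"})


turn = ManualTurn()
turn.append_sent()
commit = turn.commit_message()
print(commit)
```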
string, body, required
Event type. Always input_audio_buffer.commit.
input_audio_buffer.clear
Clears the audio buffer. The service responds with input_audio_buffer.cleared.
Example
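A short sketch of the clear event and a check for its acknowledgment. `is_clear_ack` is a hypothetical helper, and the server message is simulated locally.

```python
import json

# The client event needs only the type field.
CLEAR_EVENT = json.dumps({"type": "input_audio_buffer.clear"})


def is_clear_ack(raw: str) -> bool:
    """Return True if a server message acknowledges the clear request."""
    return json.loads(raw).get("type") == "input_audio_buffer.cleared"


acknowledged = is_clear_ack(json.dumps({"type": "input_audio_buffer.cleared"}))
print(acknowledged)
```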
string, body, required
Event type. Always input_audio_buffer.clear.
input_image_buffer.append
Adds image data to the image buffer from local files or video streams.
Limits:
- Format: JPG or JPEG. Recommended resolution: 480p or 720p. Maximum resolution: 1080p.
- Size: ≤500 KB before Base64 encoding.
- Encoding: Base64.
- Frequency: 1 image per second.
- Prerequisite: Send at least one input_audio_buffer.append event first.
The image buffer is submitted with the audio buffer through the input_audio_buffer.commit event.
Example
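Some of the limits listed above can be checked on the client before sending. The following is a minimal sketch with several assumptions: `build_image_append` is a hypothetical helper, the `image` field name is inferred from the parameter description, 500 KB is taken to mean 500 × 1024 bytes, and the JPEG check only inspects the two-byte start marker. Resolution and the one-image-per-second rate are not checked here.

```python
import base64
import json

MAX_IMAGE_BYTES = 500 * 1024  # assumes 1 KB = 1024 bytes


def build_image_append(jpeg_bytes: bytes, audio_sent: bool) -> str:
    """Validate an image against the stated limits and build the event."""
    if not audio_sent:
        raise ValueError("send at least one input_audio_buffer.append event first")
    if not jpeg_bytes.startswith(b"\xff\xd8"):
        # JPEG files begin with the start-of-image marker FF D8.
        raise ValueError("image does not look like a JPEG")
    if len(jpeg_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds 500 KB before Base64 encoding")
    return json.dumps({
        "type": "input_image_buffer.append",
        # "image" as the field name is an assumption.
        "image": base64.b64encode(jpeg_bytes).decode("ascii"),
    })


# Placeholder bytes with a JPEG start marker; not a real image.
fake_jpeg = b"\xff\xd8" + b"\x00" * 10
event = build_image_append(fake_jpeg, audio_sent=True)
print(event)
```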
string, body, required
Event type. Always input_image_buffer.append.
string, body, required
The Base64-encoded image data.