
Qwen-TTS client events

WebSocket client reference

Client events are JSON messages you send over a WebSocket to configure voice settings, stream text, and signal input completion.
For the full API overview, see Realtime streaming TTS.

Endpoint

Connect to the WebSocket endpoint:
wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime?model={model}
Replace {model} with your model ID, such as qwen3-tts-instruct-flash-realtime.
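As a minimal sketch, the endpoint URL can be assembled from a model ID as follows. The helper name build_realtime_url is hypothetical, and the sketch only builds the string; opening the connection is left to whichever WebSocket client you use.

```python
from urllib.parse import quote

# Base endpoint as given in this reference.
BASE_URL = "wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime"


def build_realtime_url(model: str) -> str:
    """Return the WebSocket URL with the model ID filled in (hypothetical helper)."""
    if not model:
        raise ValueError("model must be a non-empty string")
    # Percent-encode the model ID so unexpected characters stay URL-safe.
    return f"{BASE_URL}?model={quote(model, safe='')}"


url = build_realtime_url("qwen3-tts-instruct-flash-realtime")
```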

Event summary

| Client event | Server response | Description |
| --- | --- | --- |
| session.update | session.updated | Set voice, audio format, mode, and other session parameters |
| input_text_buffer.append | response.created | Append text to the synthesis buffer |
| input_text_buffer.commit | input_text_buffer.committed | Commit buffered text to start synthesis |
| input_text_buffer.clear | input_text_buffer.cleared | Discard all buffered text |
| session.finish | session.finished | End the session; the server flushes remaining audio and closes the connection |

session.update

Send this event as the first message after the WebSocket opens, or omit it to use the default settings. The server responds with session.updated.
Example
{
  "event_id": "event_123",
  "type": "session.update",
  "session": {
    "voice": "Cherry",
    "mode": "server_commit",
    "language_type": "Chinese",
    "response_format": "pcm",
    "sample_rate": 24000,
    "instructions": "",
    "optimize_instructions": false
  }
}
Parameters
event_id (string, required): Unique identifier for this event. Use a UUID. Must be unique within the session.
type (string, required): Set to session.update.
session (object): Session configuration.
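A session.update message could be built along these lines. This is a sketch, not an official client: the function name build_session_update is hypothetical, and its default arguments simply mirror the values in the example above, which are not necessarily the server's own defaults.

```python
import json
import uuid


def build_session_update(voice="Cherry", mode="server_commit",
                         language_type="Chinese", response_format="pcm",
                         sample_rate=24000, instructions="",
                         optimize_instructions=False) -> dict:
    """Build a session.update event (hypothetical helper; defaults copy the example)."""
    return {
        # A fresh UUID keeps event_id unique within the session.
        "event_id": str(uuid.uuid4()),
        "type": "session.update",
        "session": {
            "voice": voice,
            "mode": mode,
            "language_type": language_type,
            "response_format": response_format,
            "sample_rate": sample_rate,
            "instructions": instructions,
            "optimize_instructions": optimize_instructions,
        },
    }


# Serialize to the JSON text frame that would be sent over the WebSocket.
message = json.dumps(build_session_update(voice="Cherry"))
```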

input_text_buffer.append

Appends text to the synthesis buffer. In server_commit mode, the buffer is server-side; in commit mode, it is client-side. The server responds with response.created when a new response begins.
Example
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "input_text_buffer.append",
  "text": "Hello, I am Qwen."
}
Parameters
event_id (string, required): Unique identifier for this event. Use a UUID. Must be unique within the session.
type (string, required): Set to input_text_buffer.append.
text (string, required): Text to synthesize.
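When text arrives in pieces (for example, from a streaming source), each piece can be wrapped in its own append event. The sketch below assumes a hypothetical append_events generator; skipping empty chunks is a choice made here, not a documented requirement.

```python
import json
import uuid
from typing import Iterable, Iterator


def append_events(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one input_text_buffer.append event per non-empty text chunk."""
    for chunk in chunks:
        if not chunk:
            continue  # assumption: an empty chunk has nothing to synthesize
        yield {
            "event_id": str(uuid.uuid4()),  # unique per event
            "type": "input_text_buffer.append",
            "text": chunk,
        }


# ensure_ascii=False keeps non-ASCII text (such as Chinese) readable in the frame.
messages = [json.dumps(e, ensure_ascii=False)
            for e in append_events(["Hello, ", "", "I am Qwen."])]
```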

input_text_buffer.commit

Commits buffered text and creates a user message item. The server responds with input_text_buffer.committed. Sending this on an empty buffer returns an error. Behavior by mode:
  • server_commit: All buffered text is synthesized immediately. The server stops caching and processes everything.
  • commit: Creates a user message item from the buffered text.
Committing triggers speech synthesis only, not model response generation.
Example
{
  "event_id": "event_B4o9RHSTWobB5OQdEHLTo",
  "type": "input_text_buffer.commit"
}
Parameters
event_id (string, required): Unique identifier for this event. Use a UUID. Must be unique within the session.
type (string, required): Set to input_text_buffer.commit.

input_text_buffer.clear

Clears all text in the buffer. The server responds with input_text_buffer.cleared.
Example
{
  "event_id": "event_2728",
  "type": "input_text_buffer.clear"
}
Parameters
event_id (string, required): Unique identifier for this event. Use a UUID. Must be unique within the session.
type (string, required): Set to input_text_buffer.clear.
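Because committing an empty buffer returns an error, a client may want to track locally whether any text is pending before it sends a commit. The following is a sketch of one way to do that; TextBufferTracker is a hypothetical helper, and its local mirror is only approximate in server_commit mode, where the server decides when buffered text is processed.

```python
import uuid


class TextBufferTracker:
    """Client-side mirror of the text pending in the buffer (hypothetical helper)."""

    def __init__(self) -> None:
        self.pending = ""

    def _event(self, event_type: str, **fields) -> dict:
        return {"event_id": str(uuid.uuid4()), "type": event_type, **fields}

    def append(self, text: str) -> dict:
        self.pending += text
        return self._event("input_text_buffer.append", text=text)

    def commit(self) -> dict:
        # Refuse locally instead of letting the server reject an empty commit.
        if not self.pending:
            raise ValueError("nothing buffered; an empty commit returns an error")
        self.pending = ""
        return self._event("input_text_buffer.commit")

    def clear(self) -> dict:
        # Clearing discards all buffered text, so the mirror is reset too.
        self.pending = ""
        return self._event("input_text_buffer.clear")
```

Each method returns the event as a dict; serialize it with json.dumps before sending it over the WebSocket.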

session.finish

Signals that you have no more text to send. The server flushes remaining audio, returns session.finished, and closes the connection.
Example
{
  "event_id": "event_2239",
  "type": "session.finish"
}
Parameters
event_id (string, required): Unique identifier for this event. Use a UUID. Must be unique within the session.
type (string, required): Set to session.finish.
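Putting the client events together, a full session in commit mode might send messages in the order sketched below. The run_session function and its send callback are hypothetical: send stands in for whatever method your WebSocket client uses to transmit a text frame, and handling of server responses and audio is omitted.

```python
import json
import uuid
from typing import Callable, Iterable


def run_session(send: Callable[[str], None], chunks: Iterable[str],
                voice: str = "Cherry") -> None:
    """Send session.update, append each chunk, commit, then session.finish."""

    def emit(event_type: str, **fields) -> None:
        send(json.dumps({"event_id": str(uuid.uuid4()),
                         "type": event_type, **fields}))

    # Configure the session first; "commit" is the client-driven mode.
    emit("session.update", session={"voice": voice, "mode": "commit"})

    appended = False
    for chunk in chunks:
        if chunk:
            emit("input_text_buffer.append", text=chunk)
            appended = True

    # Committing an empty buffer returns an error, so commit only if text was sent.
    if appended:
        emit("input_text_buffer.commit")

    # Signal that no more text will follow.
    emit("session.finish")


# Usage: collect the frames in a list instead of sending them over a socket.
sent = []
run_session(sent.append, ["Hello, ", "I am Qwen."])
```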