CosyVoice server-side events

User guide: For an introduction to the models and guidance on choosing one, see Speech synthesis.

task-started

After the client sends the run-task command, the server returns a task-started event to signal that the task has started. The client can send subsequent commands only after receiving this event.

Example

{
  "header": {
    "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
    "event": "task-started",
    "attributes": {}
  },
  "payload": {}
}

Field	Type	Description
header.task_id	string	The task ID generated by the client.
header.event	string	Event type. Fixed value: `task-started`.
payload	object	Empty object.
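
As a minimal sketch of the gating rule above, a client might inspect each text message for the task-started event before sending further commands. The helper name `is_task_started` is hypothetical and not part of any SDK; the code assumes text frames arrive as JSON strings shaped like the example in this section.

```python
import json


def is_task_started(message: str, expected_task_id: str) -> bool:
    """Return True if `message` is the task-started event for our task."""
    header = json.loads(message).get("header", {})
    return (
        header.get("event") == "task-started"
        and header.get("task_id") == expected_task_id
    )


# The example event from this section, serialized as it would arrive.
raw = json.dumps({
    "header": {
        "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
        "event": "task-started",
        "attributes": {},
    },
    "payload": {},
})

# Only once this is True should the client send subsequent commands.
ready = is_task_started(raw, "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx")
print(ready)
```

Comparing `task_id` as well as the event name guards against acting on an event that belongs to a different task on a reused connection.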

result-generated

After the client sends text, the server continuously returns result-generated events. Each event carries sentence-level metadata.

Examples (in order: sentence-begin, sentence-synthesis, sentence-end)

{
  "header": {
    "task_id": "3f2d5c86-0550-45c0-801f-xxxxxxxxxx",
    "event": "result-generated",
    "attributes": {}
  },
  "payload": {
    "output": {
      "sentence": {
        "index": 0,
        "words": []
      },
      "type": "sentence-begin",
      "original_text": "Before my bed, moonlight shines bright,"
    }
  }
}

{
  "header": {
    "task_id": "3f2d5c86-0550-45c0-801f-xxxxxxxxxx",
    "event": "result-generated",
    "attributes": {}
  },
  "payload": {
    "output": {
      "sentence": {
        "index": 0,
        "words": []
      },
      "type": "sentence-synthesis"
    }
  }
}

{
  "header": {
    "task_id": "3f2d5c86-0550-45c0-801f-xxxxxxxxxx",
    "event": "result-generated",
    "attributes": {}
  },
  "payload": {
    "output": {
      "sentence": {
        "index": 0,
        "words": [
          {
            "text": "Before",
            "begin_index": 0,
            "end_index": 1,
            "begin_time": 0,
            "end_time": 263
          }
        ]
      },
      "type": "sentence-end",
      "original_text": "Before my bed, moonlight shines bright,"
    },
    "usage": {
      "characters": 6
    }
  }
}

Field	Type	Description
header.task_id	string	The task ID generated by the client.
header.event	string	Event type. Fixed value: `result-generated`.
payload.output.type	string	Sub-event type. Valid values: `sentence-begin` (start of a sentence; returns the text to be synthesized), `sentence-synthesis` (marks an audio frame; one audio frame is sent over the WebSocket binary channel immediately after each such event), `sentence-end` (end of a sentence; returns the text content and the cumulative character count).
payload.output.sentence.index	integer	Sentence index, starting from 0.
payload.output.sentence.words	array	Word-level timestamp array.
payload.output.sentence.words[].text	string	Text content of the word.
payload.output.sentence.words[].begin_index	integer	Start character index of the word within the sentence. Starts at 0.
payload.output.sentence.words[].end_index	integer	End character index of the word within the sentence. Starts at 1.
payload.output.sentence.words[].begin_time	integer	Start time of the word's corresponding audio, in milliseconds.
payload.output.sentence.words[].end_time	integer	End time of the word's corresponding audio, in milliseconds.
payload.output.original_text	string	Text of the sentence as segmented for synthesis.
payload.usage.characters	integer	Cumulative number of billed characters (returned in the sentence-end event).
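
The following sketch shows one way a client could process the interleaved stream of result-generated events and binary audio frames. `SynthesisState` and `handle_frame` are hypothetical names. It assumes the WebSocket library delivers text frames as `str` and binary frames as `bytes`, and that word timestamps are read from the sentence-end event, as in the examples above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SynthesisState:
    """Hypothetical container for what one task has produced so far."""
    audio: bytearray = field(default_factory=bytearray)
    words: list = field(default_factory=list)
    billed_characters: int = 0


def handle_frame(state: SynthesisState, frame) -> None:
    """Process one frame: bytes are audio data, str is a JSON event."""
    if isinstance(frame, (bytes, bytearray)):
        state.audio.extend(frame)
        return

    message = json.loads(frame)
    if message["header"]["event"] != "result-generated":
        return  # other event types are handled elsewhere

    payload = message.get("payload", {})
    output = payload.get("output", {})
    if output.get("type") == "sentence-end":
        # Assumption: word timestamps are taken from sentence-end only.
        state.words.extend(output.get("sentence", {}).get("words", []))
        usage = payload.get("usage", {})
        state.billed_characters = usage.get("characters", state.billed_characters)


def make_event(output: dict, usage=None) -> str:
    """Build a result-generated event like the examples in this section."""
    payload = {"output": output}
    if usage is not None:
        payload["usage"] = usage
    header = {"task_id": "3f2d5c86-0550-45c0-801f-xxxxxxxxxx",
              "event": "result-generated", "attributes": {}}
    return json.dumps({"header": header, "payload": payload})


state = SynthesisState()
# A sentence-synthesis event, then the audio frame that follows it.
handle_frame(state, make_event({"sentence": {"index": 0, "words": []},
                                "type": "sentence-synthesis"}))
handle_frame(state, b"\x00\x01")  # placeholder bytes, not real audio
# The sentence-end event carrying word timestamps and usage.
word = {"text": "Before", "begin_index": 0, "end_index": 1,
        "begin_time": 0, "end_time": 263}
handle_frame(state, make_event({"sentence": {"index": 0, "words": [word]},
                                "type": "sentence-end"},
                               usage={"characters": 6}))
```

Because `usage.characters` is cumulative, the sketch overwrites the stored value rather than adding to it.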

task-finished

The server returns a task-finished event when the task completes. The client can then close the WebSocket connection or reuse it to start a new task.

Example

{
  "header": {
    "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
    "event": "task-finished",
    "attributes": {
      "request_uuid": "0a9dba9e-d3a6-45a4-be6d-xxxxxxxxxxxx"
    }
  },
  "payload": {
    "usage": {
      "characters": 13
    }
  }
}

Field	Type	Description
header.task_id	string	The task ID generated by the client.
header.event	string	Event type. Fixed value: `task-finished`.
payload.usage.characters	integer	Cumulative number of billed characters.

task-failed

The server returns a task-failed event when the task fails. On receiving this event, the client must close the WebSocket connection and handle the error.

Example

{
  "header": {
    "task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
    "event": "task-failed",
    "error_code": "InvalidParameter",
    "error_message": "[tts:]Engine return error code: 418",
    "attributes": {}
  },
  "payload": {}
}

Field	Type	Description
header.task_id	string	The task ID generated by the client.
header.event	string	Event type. Fixed value: `task-failed`.
header.error_code	string	Error code.
header.error_message	string	Detailed error message.
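
The two terminal events, task-finished and task-failed, can be handled together. The sketch below is one possible approach, not a prescribed one: `TaskFailedError` and `check_terminal_event` are hypothetical names, and closing the WebSocket connection is left to the caller because it depends on the client library in use.

```python
import json
from typing import Optional


class TaskFailedError(Exception):
    """Hypothetical exception carrying the fields of a task-failed event."""

    def __init__(self, task_id: str, error_code: str, error_message: str):
        super().__init__(f"{error_code}: {error_message}")
        self.task_id = task_id
        self.error_code = error_code
        self.error_message = error_message


def check_terminal_event(message: str) -> Optional[int]:
    """Return billed characters on task-finished and None on any
    non-terminal event; raise TaskFailedError on task-failed."""
    parsed = json.loads(message)
    header = parsed.get("header", {})
    event = header.get("event")
    if event == "task-finished":
        return parsed.get("payload", {}).get("usage", {}).get("characters", 0)
    if event == "task-failed":
        raise TaskFailedError(
            header.get("task_id", ""),
            header.get("error_code", ""),
            header.get("error_message", ""),
        )
    return None


# The example events from the task-finished and task-failed sections.
finished = json.dumps({
    "header": {"task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
               "event": "task-finished",
               "attributes": {"request_uuid": "0a9dba9e-d3a6-45a4-be6d-xxxxxxxxxxxx"}},
    "payload": {"usage": {"characters": 13}},
})
failed = json.dumps({
    "header": {"task_id": "2bf83b9a-baeb-4fda-8d9a-xxxxxxxxxxxx",
               "event": "task-failed",
               "error_code": "InvalidParameter",
               "error_message": "[tts:]Engine return error code: 418",
               "attributes": {}},
    "payload": {},
})

billed = check_terminal_event(finished)
try:
    check_terminal_event(failed)
except TaskFailedError as exc:
    failure = exc  # the caller would close the WebSocket connection here
```

Raising an exception on task-failed makes it hard for calling code to keep using the connection by accident, which matches the requirement to close it.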
