
LiveTranslate Java SDK

Prerequisites

1. Install the SDK

Install the DashScope SDK, version 2.22.5 or later.

2. Get an API key

3. Set the API key

Linux / macOS:

export DASHSCOPE_API_KEY=YOUR_API_KEY
source ~/.bashrc

Windows (CMD):

setx DASHSCOPE_API_KEY "YOUR_API_KEY"

4. Review the model overview

Getting started

Connect, stream audio, and receive translations:
import com.alibaba.dashscope.audio.omni.*;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.JsonObject;
import java.util.Arrays;

// 1. Build connection parameters
OmniRealtimeParam param = OmniRealtimeParam.builder()
    .model("qwen3-livetranslate-flash-realtime")
    .url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
    .apikey(System.getenv("DASHSCOPE_API_KEY"))
    .build();

// 2. Define a callback to handle server events
OmniRealtimeCallback callback = new OmniRealtimeCallback() {
  @Override public void onOpen() { System.out.println("Connected"); }

  @Override
  public void onEvent(JsonObject message) {
    String type = message.get("type").getAsString();
    if ("response.audio_transcript.done".equals(type)) {
      System.out.println("Translation: " + message.get("transcript").getAsString());
    }
  }

  @Override
  public void onClose(int code, String reason) {
    System.out.println("Closed: " + code + " " + reason);
  }
};

// 3. Create the conversation and open the connection
OmniRealtimeConversation conversation = new OmniRealtimeConversation(param, callback);
conversation.connect();

// 4. Configure translation target language
OmniRealtimeConfig config = OmniRealtimeConfig.builder()
    .modalities(Arrays.asList(OmniRealtimeModality.AUDIO, OmniRealtimeModality.TEXT))
    .translationConfig(OmniRealtimeTranslationParam.builder()
        .language("en")
        .build())
    .build();
conversation.updateSession(config);

// 5. Send Base64-encoded audio chunks (PCM 16 kHz, 16-bit, mono)
conversation.appendAudio(audioBase64);

// 6. End the session when done
conversation.endSession();

Configuration overview

Three builder objects control a translation session:
OmniRealtimeParam          --> Connection: model, endpoint, API key
  +-- OmniRealtimeConfig   --> Session: audio formats, voice, modalities
       +-- OmniRealtimeTranslationParam  --> Translation: target language, custom terminology
Pass OmniRealtimeParam to the constructor. After connecting, call updateSession() with OmniRealtimeConfig to set audio and translation options. Defaults apply if you skip updateSession().
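If the defaults suit you, the lifecycle reduces to construct, connect, stream, and end. A minimal sketch of that reduced flow (chunk stands in for your Base64-encoded PCM data; it is a placeholder, not an SDK symbol):
// Minimal lifecycle relying on session defaults: target language "en",
// PCM 16 kHz mono input, PCM 24 kHz mono output.
OmniRealtimeConversation conversation = new OmniRealtimeConversation(param, callback);
conversation.connect();            // fires session.created and session.updated
conversation.appendAudio(chunk);   // chunk: your Base64-encoded PCM audio (placeholder)
conversation.endSession();         // server completes in-progress translations first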

Request parameters

OmniRealtimeParam

Build connection parameters with OmniRealtimeParam.builder().
OmniRealtimeParam param = OmniRealtimeParam.builder()
  .model("qwen3-livetranslate-flash-realtime")
  .url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
  // If the environment variable is not set, pass your key directly: .apikey("YOUR_API_KEY")
  .apikey(System.getenv("DASHSCOPE_API_KEY"))
  .build();
Parameter | Type | Required | Description
--------- | ---- | -------- | -----------
model | String | Yes | Model name. Use qwen3-livetranslate-flash-realtime.
url | String | Yes | WebSocket endpoint. Use wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime.
apikey | String | No | API key. Defaults to the DASHSCOPE_API_KEY environment variable.

OmniRealtimeConfig

Build session parameters with OmniRealtimeConfig.builder(), then call conversation.updateSession(config).
// Set custom translation phrases
Map<String, Object> phrases = new HashMap<>();
phrases.put("Inteligencia Artificial", "Artificial Intelligence");
phrases.put("Aprendizaje Automático", "Machine Learning");

OmniRealtimeConfig config = OmniRealtimeConfig.builder()
  .modalities(Arrays.asList(OmniRealtimeModality.AUDIO, OmniRealtimeModality.TEXT))
  .voice("Cherry")
  .inputAudioFormat(OmniRealtimeAudioFormat.PCM_16000HZ_MONO_16BIT)
  .outputAudioFormat(OmniRealtimeAudioFormat.PCM_24000HZ_MONO_16BIT)
  .InputAudioTranscription("qwen3-asr-flash-realtime")
  .translationConfig(OmniRealtimeTranslationParam.builder()
    .language("en")
    .corpus(OmniRealtimeTranslationParam.Corpus.builder()
      .phrases(phrases)
      .build())
    .build())
  .build();

conversation.updateSession(config);
Parameter | Type | Required | Description
--------- | ---- | -------- | -----------
modalities | List<OmniRealtimeModality> | No | Output modalities. Default: [AUDIO, TEXT]. Set [TEXT] for text only.
voice | String | No | Voice for synthesized speech. Default: Cherry. See supported voices.
inputAudioFormat | OmniRealtimeAudioFormat | No | Input audio format. Default: PCM_16000HZ_MONO_16BIT.
outputAudioFormat | OmniRealtimeAudioFormat | No | Output audio format. Default: PCM_24000HZ_MONO_16BIT.
InputAudioTranscription | String | No | ASR model for transcribing input speech. Set to qwen3-asr-flash-realtime to receive source-language transcription with translation.
translationConfig | OmniRealtimeTranslationParam | No | Translation settings. See OmniRealtimeTranslationParam below.
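For example, a text-only session is a matter of restricting modalities to TEXT so no speech is synthesized. A minimal sketch, reusing the conversation object from above:
// Sketch: translated text only; voice and output audio format are not used.
OmniRealtimeConfig textOnly = OmniRealtimeConfig.builder()
    .modalities(Arrays.asList(OmniRealtimeModality.TEXT))
    .translationConfig(OmniRealtimeTranslationParam.builder()
        .language("en")
        .build())
    .build();
conversation.updateSession(textOnly);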

OmniRealtimeTranslationParam

Build translation parameters with OmniRealtimeTranslationParam.builder().
// Set translation phrases
Map<String, Object> phrases = new HashMap<>();
phrases.put("Inteligencia Artificial", "Artificial Intelligence");  // Source language word: Target language translation
phrases.put("Aprendizaje Automático", "Machine Learning");

OmniRealtimeTranslationParam translationParam = OmniRealtimeTranslationParam.builder()
  .language("en")  // Target language code
  .corpus(OmniRealtimeTranslationParam.Corpus.builder()
    .phrases(phrases)
    .build())
  .build();
Parameter | Type | Required | Description
--------- | ---- | -------- | -----------
language | String | No | Target language code. Default: en. See supported languages.
corpus | Corpus | No | Custom terminology for domain-specific terms.
corpus.phrases | Map<String, Object> | No | Term mappings. Keys: source terms; values: target translations. Example: {"Inteligencia Artificial": "Artificial Intelligence"}

Key interfaces

OmniRealtimeConversation

Manages the WebSocket connection and audio streaming. Import: com.alibaba.dashscope.audio.omni.OmniRealtimeConversation
Method | Description
------ | -----------
OmniRealtimeConversation(OmniRealtimeParam param, OmniRealtimeCallback callback) | Creates a conversation with connection parameters and an event callback.
void connect() | Opens the WebSocket connection. Triggers session.created and session.updated. Throws NoApiKeyException, InterruptedException.
void updateSession(OmniRealtimeConfig config) | Updates session configuration. Triggers session.updated. Omitted parameters use defaults.
void appendAudio(String audioBase64) | Sends a Base64-encoded audio chunk. The server detects speech boundaries and triggers translation automatically.
void endSession() | Ends the session. The server finishes in-progress translations before sending session.finished. Throws InterruptedException.
void close(int code, String reason) | Stops the task and closes the WebSocket connection.
String getSessionId() | Returns the session ID.
String getResponseId() | Returns the response ID of the latest server response.
long getFirstTextDelay() | Returns the first text delay of the latest response in milliseconds.
long getFirstAudioDelay() | Returns the first audio delay of the latest response in milliseconds.
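Pre-recorded audio works the same way as live capture: send it through appendAudio() in paced chunks. A sketch, assuming input.pcm (a hypothetical file name) contains raw 16 kHz, 16-bit, mono PCM:
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Base64;

// Stream a raw PCM file in 100 ms chunks (3200 bytes at 16 kHz, 16-bit, mono).
byte[] pcm = Files.readAllBytes(Paths.get("input.pcm"));  // hypothetical file
for (int off = 0; off < pcm.length; off += 3200) {
    int end = Math.min(off + 3200, pcm.length);
    conversation.appendAudio(Base64.getEncoder()
        .encodeToString(Arrays.copyOfRange(pcm, off, end)));
    Thread.sleep(100);  // pace roughly in real time so speech boundaries are detected naturally
}
conversation.endSession();

// Once responses have arrived, the delay getters report first-token latency.
System.out.println("First text delay:  " + conversation.getFirstTextDelay() + " ms");
System.out.println("First audio delay: " + conversation.getFirstAudioDelay() + " ms");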

OmniRealtimeCallback

Handles server events over WebSocket. Extend this class and implement each method. Import: com.alibaba.dashscope.audio.omni.OmniRealtimeCallback
Method | Parameters | Description
------ | ---------- | -----------
void onOpen() | None | Called when the WebSocket connection opens.
abstract void onEvent(JsonObject message) | message: a JSON object containing a server event. | Called for each server event. Parse the type field to identify the event.
abstract void onClose(int code, String reason) | code: WebSocket status code. reason: closure description. | Called when the WebSocket closes.
Common event types in onEvent:
Event type | Description
---------- | -----------
input_audio_buffer.speech_started | Speech detected in the audio stream.
input_audio_buffer.speech_stopped | End of a speech segment detected.
conversation.item.input_audio_transcription.completed | Source-language transcription ready. Read message.get("transcript"). Requires InputAudioTranscription.
response.audio_transcript.done | Translated text ready. Read message.get("transcript").
response.audio.delta | Translated audio chunk available. Read message.get("delta") for Base64-encoded audio.
error | An error occurred. Read message.get("error").getAsJsonObject().get("message") for details.
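To save the translated speech instead of playing it live, one approach (a sketch; the buffering strategy is an assumption, not part of the SDK) is to accumulate the response.audio.delta payloads as raw PCM:
import java.io.ByteArrayOutputStream;
import java.util.Base64;

// Collects decoded audio deltas; the result is raw PCM
// (24 kHz, 16-bit, mono with the default output format).
ByteArrayOutputStream pcmOut = new ByteArrayOutputStream();

// Inside onEvent:
if ("response.audio.delta".equals(type)) {
    byte[] bytes = Base64.getDecoder().decode(message.get("delta").getAsString());
    pcmOut.write(bytes, 0, bytes.length);  // this overload throws no checked exception
}
// After the session ends, write pcmOut.toByteArray() to disk,
// for example after prepending a WAV header.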

Complete example

This example captures microphone audio, translates it in real time, and plays the translated speech through the speaker. What it does:
  1. Connects to Qwen-LiveTranslate over WebSocket.
  2. Sets up Spanish-to-English translation with custom terminology.
  3. Streams microphone audio in 100 ms chunks.
  4. Prints the original transcription and translation.
  5. Plays translated audio.
import com.alibaba.dashscope.audio.omni.*;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.google.gson.JsonObject;

import javax.sound.sampled.*;
import java.util.*;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Example of using a microphone with the real-time audio and video translation model.
 */
public class Main {
    private static final int INPUT_CHUNK_SIZE = 3200;   // 100 ms of 16 kHz, 16-bit, mono audio
    private static final int OUTPUT_CHUNK_SIZE = 4800;  // 100 ms of 24 kHz, 16-bit, mono audio
    private static final AtomicBoolean running = new AtomicBoolean(true);
    private static SourceDataLine speaker;  // Speaker

    public static void main(String[] args) throws InterruptedException {
        String apiKey = System.getenv("DASHSCOPE_API_KEY");
        if (apiKey == null || apiKey.isEmpty()) {
            System.err.println("Set the DASHSCOPE_API_KEY environment variable.");
            System.exit(1);
        }

        // Create connection parameters.
        OmniRealtimeParam param = OmniRealtimeParam.builder()
                .model("qwen3-livetranslate-flash-realtime")
                .url("wss://dashscope-intl.aliyuncs.com/api-ws/v1/realtime")
                .apikey(apiKey)
                .build();

        // Create a callback handler.
        OmniRealtimeCallback callback = new OmniRealtimeCallback() {
            @Override
            public void onOpen() {
                System.out.println("[Connection established]");
            }

            @Override
            public void onEvent(JsonObject message) {
                String type = message.get("type").getAsString();
                switch (type) {
                    case "input_audio_buffer.speech_started":
                        System.out.println("====== Speech input detected ======");
                        break;
                    case "input_audio_buffer.speech_stopped":
                        System.out.println("====== Speech input ended ======");
                        break;
                    case "conversation.item.input_audio_transcription.completed":
                        String originalText = message.get("transcript").getAsString();
                        System.out.println("[Original text] " + originalText);
                        break;
                    case "response.audio_transcript.done":
                        String translatedText = message.get("transcript").getAsString();
                        System.out.println("[Translation result] " + translatedText);
                        break;
                    case "response.audio.delta":
                        // Decode and play the translated audio.
                        String audioB64 = message.get("delta").getAsString();
                        byte[] audioBytes = Base64.getDecoder().decode(audioB64);
                        if (speaker != null) {
                            speaker.write(audioBytes, 0, audioBytes.length);
                        }
                        break;
                    case "error":
                        JsonObject error = message.get("error").getAsJsonObject();
                        System.err.println("[Error] " + error.get("message").getAsString());
                        break;
                }
            }

            @Override
            public void onClose(int code, String reason) {
                System.out.println("[Connection closed] code: " + code + ", reason: " + reason);
            }
        };

        // Create a session.
        OmniRealtimeConversation conversation = new OmniRealtimeConversation(param, callback);

        try {
            // Initialize the speaker (for playing the translated speech).
            AudioFormat speakerFormat = new AudioFormat(24000, 16, 1, true, false);
            DataLine.Info speakerInfo = new DataLine.Info(SourceDataLine.class, speakerFormat);
            speaker = (SourceDataLine) AudioSystem.getLine(speakerInfo);
            speaker.open(speakerFormat, OUTPUT_CHUNK_SIZE * 4);
            speaker.start();

            // Initialize the microphone (for capturing speech input).
            AudioFormat micFormat = new AudioFormat(16000, 16, 1, true, false);
            DataLine.Info micInfo = new DataLine.Info(TargetDataLine.class, micFormat);
            if (!AudioSystem.isLineSupported(micInfo)) {
                System.err.println("Microphone is not available.");
                System.exit(1);
            }
            TargetDataLine microphone = (TargetDataLine) AudioSystem.getLine(micInfo);
            microphone.open(micFormat);
            microphone.start();

            // Connect to the server.
            conversation.connect();

            // Configure translation parameters.
            Map<String, Object> phrases = new HashMap<>();
            phrases.put("Inteligencia Artificial", "Artificial Intelligence");
            phrases.put("Aprendizaje Automático", "Machine Learning");

            OmniRealtimeConfig config = OmniRealtimeConfig.builder()
                    .modalities(Arrays.asList(OmniRealtimeModality.AUDIO, OmniRealtimeModality.TEXT))
                    .voice("Cherry")
                    .inputAudioFormat(OmniRealtimeAudioFormat.PCM_16000HZ_MONO_16BIT)
                    .outputAudioFormat(OmniRealtimeAudioFormat.PCM_24000HZ_MONO_16BIT)
                    .InputAudioTranscription("qwen3-asr-flash-realtime")
                    .translationConfig(OmniRealtimeTranslationParam.builder()
                            .language("en")
                            .corpus(OmniRealtimeTranslationParam.Corpus.builder()
                                    .phrases(phrases)
                                    .build())
                            .build())
                    .build();

            conversation.updateSession(config);

            // Register a shutdown hook.
            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                System.out.println("\n[Exiting...]");
                running.set(false);
                microphone.stop();
                microphone.close();
                speaker.stop();
                speaker.close();
                conversation.close(1000, "User stopped");
            }));

            System.out.println("[Starting real-time translation] Speak into the microphone. Press Ctrl+C to exit.");

            // Continuously capture and send microphone audio.
            byte[] buffer = new byte[INPUT_CHUNK_SIZE];
            while (running.get()) {
                int bytesRead = microphone.read(buffer, 0, buffer.length);
                if (bytesRead > 0) {
                    // Encode only the bytes actually read to avoid sending stale buffer data.
                    byte[] chunk = Arrays.copyOf(buffer, bytesRead);
                    conversation.appendAudio(Base64.getEncoder().encodeToString(chunk));
                }
            }

        } catch (NoApiKeyException e) {
            System.err.println("API Key error: " + e.getMessage());
        } catch (Exception e) {
            System.err.println("An exception occurred: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
