Text + image/audio input
Getting started
Prerequisites
- Get an API key and set it as an environment variable.
- Qwen-Omni supports only OpenAI-compatible calls. Install the SDK. The OpenAI Python SDK requires version 1.52.0+. The Node.js SDK requires version 4.68.0+.
Response
Response
After you run the Running the
Python or Node.js code, the text response is returned and an audio file named audio_assistant.wav is saved in the same directory as your code file.HTTP code returns text and Base64-encoded audio data directly in the audio field.Model selection
The Qwen3.5-Omni series models are currently in invitational preview. Model calls are free for a limited time. This does not include tool calling fees. For tool calling billing, see Pricing.
-
Qwen3.5-Omni series: Best for long video analysis, meeting summaries, caption generation, content moderation, and audio-video interaction.
- Input limits: Up to 3 hours of audio or up to 1 hour of video
- Audio control: Supports adjusting volume, speaking rate, and emotion via instructions
- Visual capability: Matches Qwen3.5's level. Understands images, speech, sound effects, and other multimodal information
-
Qwen3-Omni-Flash series: Best for short video analysis and cost-sensitive scenarios.
- Input limits: Audio-video input under 150 seconds
- Thinking mode: The only Qwen-Omni series that supports thinking mode
- Qwen-Omni-Turbo series: This series is no longer updated and has limited features. Migrate to the Qwen3.5-Omni or Qwen3-Omni-Flash series.
| Series | Audio-video description | Deep thinking | Web search | Input languages | Output audio languages | Voices |
|---|---|---|---|---|---|---|
| Qwen3.5-Omni | Strong | Not supported | Supported | 113 types (74 languages + 39 dialects) | 36 types (29 languages + 7 dialects) | 55 |
| Qwen3-Omni-Flash | Weaker | Supported | Not supported | 19 types (11 languages + 8 dialects) | 19 types (11 languages + 8 dialects) | 17-49 (varies by version) |
| Qwen-Omni-Turbo (No longer updated) | None | Not supported | Not supported | Chinese, English | Chinese, English | 4 |
Qwen3.5-Omni supported languages
Qwen3.5-Omni supported languages
Input languages (74 languages): Chinese, English, German, French, Italian, Czech, Indonesian, Thai, Korean, Polish, Japanese, Vietnamese, Finnish, Portuguese, Spanish, Dutch, Russian, Malay, Catalan, Swedish, Turkish, Ukrainian, Romanian, Slovak, Danish, Icelandic, Norwegian (Bokmal), Macedonian, Greek, Hungarian, Galician, Filipino, Croatian, Bosnian, Slovenian, Bulgarian, Kazakh, Belarusian, Latvian, Estonian, Azerbaijani, Uyghur, Swahili, Hindi, Esperanto, Kyrgyz, Tajik, Cebuano, Afrikaans, Arabic, Lithuanian, Javanese, Bengali, Persian, Hebrew, Punjabi, Gujarati, Mongolian, Asturian, Kannada, Marathi, Interlingua, Malayalam, Maltese, Norwegian Nynorsk, Telugu, Urdu, Georgian, Basque, Tamil, Odia, Serbian, MaoriInput dialects (39 dialects): Northeastern Mandarin, Guizhou dialect, Cantonese, Henan dialect, Hong Kong Cantonese, Shanghainese, Shaanxi dialect, Tianjin dialect, Taiwanese Hokkien, Yunnan dialect, Anhui dialect, Fujian dialect, Gansu dialect, Guangdong dialect, Hubei dialect, Hunan dialect, Jiangxi dialect, Shandong dialect, Shanxi dialect, Sichuan dialect, Guangxi dialect, Hainan dialect, Chongqing dialect, Changsha dialect, Hangzhou dialect, Hefei dialect, Yinchuan dialect, Zhengzhou dialect, Shenyang dialect, Wenzhou dialect, Wuhan dialect, Kunming dialect, Taiyuan dialect, Nanchang dialect, Jinan dialect, Lanzhou dialect, Nanjing dialect, Hakka, Southern MinOutput audio languages (29 languages): Chinese, English, German, Italian, Portuguese, Spanish, Japanese, Korean, French, Russian, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, PersianOutput audio dialects (7 dialects): Sichuan dialect, Beijing dialect, Tianjin dialect, Nanjing dialect, Shaanxi dialect, Cantonese, Southern Min
Qwen3-Omni-Flash supported languages
Qwen3-Omni-Flash supported languages
Input/Output languages (11 languages): Chinese, English, German, French, Italian, Thai, Korean, Japanese, Russian, Spanish, PortugueseInput/Output dialects (8 dialects): Sichuan dialect, Shanghainese, Cantonese, Southern Min, Shaanxi dialect, Nanjing dialect, Tianjin dialect, Beijing dialect
Model performance
Audio and video content analysis
| Input | Output |
|---|---|
| Generate a comprehensive, timestamped description of this video. | 00:00.000 - 00:02.500 A rain-soaked city street fills the widescreen frame. Long-exposure photography creates streaks of red and blue car lights on wet pavement. A solitary man in a dark knee-length trench coat and light shirt with a tie walks toward the camera along the right sidewalk.00:12.300 - 00:14.533 The camera tilts down. Inside a flashlight beam, a yellowed, aged single sheet lies in a shallow puddle, edges charred and corners curled. The man kneels, extending gloved fingers toward the document; ripples spread outward as his hand nears.Only partial timestamps are shown. |
Multidialect understanding
| Input | Output |
|---|---|
| Describe the vocal characteristics, emotional state, and full lyrics for singers in different Chinese dialects. | This audio features rap performances by singers from various regions of China. Hangzhou dialect (Wu) -- The male singer's voice is soft and gentle, with the distinctive smoothness and warmth of Wu dialect. His intonation rises and falls gently, and his rhythm is light and quick, creating a relaxed storytelling effect. Emotional state: Relaxed, comfortable, and full of everyday life. He describes Hangzhou's slow-paced lifestyle and urban changes, expressing affection for his hometown.Only partial results are shown. |
Lyric caption generation
| Input | Output |
|---|---|
| Transcribe the song lyrics and provide timestamps for each line. | [00:00:12,680 --> 00:00:16,960] Cat thread sways past moonlight on trees. [00:00:18,400 --> 00:00:22,800] Radiators hum 1998 chart hits. [00:00:24,160 --> 00:00:28,080] Time parts the mist-like heat waves. [00:00:28,920 --> 00:00:33,000] Neon from the screen shines on my nose bridge. ... [00:04:09,000 --> 00:04:10,020] (End)Only partial results are shown. |
Audio-video programming
| Demo 1 | Demo 2 |
|---|---|
Usage
Streaming output
All requests to Qwen-Omni must set stream=True.
Model configuration
Configure parameters, prompts, and audio-video lengths based on your use case to balance cost, speed, and quality.
- Audio-video understanding
- Audio understanding
| Use case | Recommended video length | Recommended prompt | Recommended max_pixels value |
|---|---|---|---|
| Fast review, low cost | ≤60 minutes | Simple prompt within 50 words | 230,400 |
| Content extraction (long video segmentation) | ≤60 minutes | Simple prompt within 50 words | 921,600 to 2,073,600 |
| Standard analysis (short video tagging) | ≤4 minutes | Use the structured prompt below | 921,600 to 2,073,600 |
| Fine-grained analysis (multiple speakers/complex scenes) | ≤2 minutes | Use the structured prompt below | 2,073,600 |
Recommended structured prompt for audio-video understanding
Recommended structured prompt for audio-video understanding
For fine-grained descriptions of long videos, segment them first.
Thinking mode
For enable/disable, streaming output, and
thinking_budget, see Thinking.enable_thinking defaults to false). Qwen-Omni-Turbo does not support thinking.
In thinking mode, set modalities: ["text"] — audio output is not supported when thinking is enabled.
Web search
The Qwen3.5-Omni series supports web search to retrieve real-time information and perform reasoning. Enable web search using the enable_search parameter and set search_strategy to agent.
- Web search is supported only in the Qwen3.5-Omni series. The
search_strategyparameter only acceptsagent. - See Pricing for billing information related to the
agentstrategy.
Multimodal input
Video and text input
You can input video as an image list or as a video file. If you input a video file, the model can also understand the audio in the video.
The following sample code uses a video URL from the internet as an example. To input a local video, see Input Base64-encoded local files. Streaming output is required for all calls.
Video file format (can understand audio in the video)
- Number of files:
- Qwen3.5-Omni series: Up to 512 files using public URLs; up to 250 files using Base64 encoding.
- Qwen3-Omni-Flash and Qwen-Omni-Turbo series: Only one file allowed.
- File size:
- Qwen3.5-Omni: Up to 2 GB, up to 1 hour duration.
- Qwen3-Omni-Flash: Up to 256 MB, up to 150 seconds duration.
- Qwen-Omni-Turbo: Up to 150 MB, up to 40 seconds duration.
- File formats: MP4, AVI, MKV, MOV, FLV, WMV, etc.
- Visual and audio information in video files are billed separately.
Image list format
Number of images
- Qwen3.5-Omni: Minimum 2 images, maximum 2048 images.
- Qwen3-Omni-Flash: Minimum 2 images, maximum 128 images.
- Qwen-Omni-Turbo: Minimum 4 images, maximum 80 images.
Audio and text input
- Number of files:
- Qwen3.5-Omni series: Up to 2048 files using public URLs; up to 250 files using Base64 encoding.
- Qwen3-Omni-Flash and Qwen-Omni-Turbo series: Only one file allowed.
- File size:
- Qwen3.5-Omni: Up to 2 GB, up to 3 hours duration.
- Qwen3-Omni-Flash: Up to 100 MB, up to 20 minutes duration.
- Qwen-Omni-Turbo: Up to 10 MB, up to 3 minutes duration.
- File formats: Supports major formats such as AMR, WAV, 3GP, 3GPP, AAC, and MP3.
Image and text input
Qwen-Omni models support multiple image inputs. The requirements for input images are as follows:
-
Number of images:
- When passed as a public URL: up to 2048 images per request.
- When passed as Base64-encoded strings: up to 250 images per request.
-
Image size:
- Qwen3.5 series: Each image file must be 20 MB or less.
- Qwen3-Omni-Flash and Qwen-Omni-Turbo series: Each image file must be 10 MB or less.
- The width and height of the image must both be greater than 10 pixels. The aspect ratio must not exceed 200:1 or 1:200.
- For supported image types, see Visual and video understanding.
Multi-turn conversation
When you use the multi-turn conversation feature of Qwen-Omni models, note the following:
- Assistant message: Assistant messages in the messages array support only text data.
- User message: A user message can contain text and data from only one other modality. In a multi-turn conversation, you can use different modalities in separate user messages.
Parse Base64-encoded audio data output
The audio output from Qwen-Omni models is Base64-encoded data delivered in a stream. You can use a string variable to accumulate the Base64 data from each fragment as it arrives. After the stream is complete, decode the final string to create the audio file. Alternatively, decode and play each fragment in real time as it is received.
Input Base64-encoded local files
- Images
- Audio
- Video file
- Image list (as video)
This example uses the locally saved file eagle.png.
API reference
For the input and output parameters of Qwen-Omni, see Chat completions API.
Billing and rate limits
Billing rules
Qwen-Omni is billed based on the number of tokens for different modalities, such as audio, image, and video. See Pricing for pricing details.
Rules for converting audio, images, and videos to tokens
Rules for converting audio, images, and videos to tokens
AudioVideoVideo files generate two types of tokens:
Qwen3.5-Omni: Total tokens = Audio duration (in seconds) x 7Qwen3-Omni-Flash: Total tokens = Audio duration (in seconds) x 12.5Qwen-Omni-Turbo: Total tokens = Audio duration (in seconds) x 25. If the audio duration is less than 1 second, it is calculated as 1 second.
Qwen3.5-OmniandQwen3-Omni-Flash: 1 token per32 x 32pixels.Qwen-Omni-Turbo: 1 token per28 x 28pixels.
video_tokens (visual) and audio_tokens (audio).video_tokens
audio_tokens- Qwen3.5-Omni:
Total tokens = Audio duration (in seconds) x 7 - Qwen3-Omni-Flash:
Total tokens = Audio duration (in seconds) x 12.5 - Qwen-Omni-Turbo:
Total tokens = Audio duration (in seconds) x 25 - If the audio duration is less than 1 second, it is calculated as 1 second.
- Qwen3.5-Omni:
Error codes
If a call fails, see Error messages.
Voice list
To use a voice, set the
voice request parameter to the corresponding value in the voice parameter column of the tables below.qwen3.5-omni
| Voice name | voice parameter | Description | Supported languages |
|---|---|---|---|
| Tina | Tina | A voice like warm milk tea -- sweet and cozy, yet sharp when solving problems | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Cindy | Cindy | A sweet-talking young woman from Taiwan | Chinese (Taiwanese accent), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Liora Mira | Liora Mira | A gentle voice that weaves warmth into everyday life | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Sunnybobi | Sunnybobi | A cheerful, socially awkward neighbor girl | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Raymond | Raymond | A clear-voiced, takeout-loving homebody | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Ethan | Ethan | Standard Mandarin with a slight northern accent. Bright, warm, energetic, and youthful | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Theo Calm | Theo Calm | Conveys understanding in silence and healing through words | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Serena | Serena | A gentle young woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Harvey | Harvey | A voice that carries the weight of time -- deep, mellow, and scented with coffee and old books | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Maia | Maia | A blend of intellect and gentleness | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Evan | Evan | A college student -- youthful and endearing | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Qiao | Qiao | Not just cute -- she's sweet on the surface and full of personality underneath | Chinese (Taiwanese accent), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Momo | Momo | Playful and mischievous -- here to cheer you up | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Wil | Wil | A young man from Shenzhen who speaks with a Hong Kong-Taiwan accent | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Angel | Angel | Slightly Taiwanese-accented -- and very sweet | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Li Cassian | Li Cassian | Speaks with restraint -- three parts silence, seven parts reading the room | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Mia | Mia | A lifestyle artist who shares slow-living aesthetics through a soothing voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Joyner | Joyner | Funny, exaggerated, and down-to-earth | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Gold | Gold | A West Coast Black rapper | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Katerina | Katerina | A mature, commanding voice with rich rhythm and resonance | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Ryan | Ryan | High-energy delivery with strong dramatic presence -- realism meets intensity | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Jennifer | Jennifer | A premium, cinematic-quality American female voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Aiden | Aiden | An American young man skilled in cooking | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Mione | Mione | A mature, intelligent British neighbor girl | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Sichuan - Sunny | Sunny | A sweet Sichuan girl who warms your heart | Chinese (Sichuan dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Beijing - Dylan | Dylan | A youth raised in Beijing's hutongs | Chinese (Beijing dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Sichuan - Eric | Eric | A lively Chengdu man from Sichuan | Chinese (Sichuan dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Tianjin - Peter | Peter | A Tianjin-style xiangsheng performer -- professional foil | Chinese (Tianjin dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Joseph Chen | Joseph Chen | A longtime overseas Chinese from Southeast Asia with a warm, nostalgic voice | Chinese (Hokkien), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Shaanxi - Marcus | Marcus | Broad face, few words, sincere heart, deep voice -- the true flavor of Shaanxi | Chinese (Shaanxi dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Nanjing - Li | Li | A grumpy uncle | Chinese (Nanjing dialect), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Cantonese - Rocky | Rocky | A witty and humorous online chat companion | Chinese (Cantonese), English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Sohee | Sohee | A warm, cheerful, emotionally expressive Korean unnie | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Lenn | Lenn | Rational at core, rebellious in detail -- a German youth who wears suits and listens to post-punk | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Ono Anna | Ono Anna | A clever, playful childhood friend | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Sonrisa | Sonrisa | A warm, outgoing Latin American woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Bodega | Bodega | A warm, enthusiastic Spanish man | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Emilien | Emilien | A romantic French big brother | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Andre | Andre | A magnetic, natural, and steady male voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Radio Gol | Radio Gol | A passionate football commentator who narrates games with poetic flair | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Alek | Alek | Cold like the Russian spirit -- yet warm as wool beneath a coat | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Rizky | Rizky | A young Indonesian man with a distinctive voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Roya | Roya | A sporty girl with a free-spirited heart | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Arda | Arda | Neither high nor low -- clean, crisp, and gently warm | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Hana | Hana | A mature Vietnamese woman who loves dogs | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Dolce | Dolce | A laid-back Italian man | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Jakub | Jakub | A charismatic, artistic young man from a Polish town | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Griet | Griet | A mature, artistic Dutch woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Eliska | Eliska | Every word carries Central European craftsmanship and warmth | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Marina | Marina | A girl raised in a multicultural city | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Siiri | Siiri | Reserved and gentle -- with a calm, lake-like speaking pace | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Ingrid | Ingrid | A woman from rural Norway | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Sigga | Sigga | An intellectual young woman from an Icelandic town | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Bea | Bea | A sweet Filipino woman who loves coffee | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
| Chloe | Chloe | A Malaysian office worker | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean, Thai, Indonesian, Arabic, Vietnamese, Turkish, Finnish, Polish, Hindi, Dutch, Czech, Urdu, Tagalog, Swedish, Danish, Hebrew, Icelandic, Malay, Norwegian, Persian |
qwen3-omni-flash-2025-12-01
| Voice name | voice parameter | Description | Supported languages |
|---|---|---|---|
| Cherry | Cherry | A sunny, positive, friendly, and natural young woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Serena | Serena | A gentle young woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Ethan | Ethan | Standard Mandarin with a slight northern accent. Sunny, warm, energetic, and vibrant | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Chelsie | Chelsie | A two-dimensional virtual girlfriend | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Momo | Momo | Playful and mischievous, cheering you up | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Vivian | Vivian | Confident, cute, and slightly feisty | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Moon | Moon | Effortlessly cool Moon White | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Maia | Maia | A blend of intellect and gentleness | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Kai | Kai | A soothing audio spa for your ears | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Nofish | Nofish | A designer who cannot pronounce retroflex sounds | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Bella | Bella | A little girl who drinks but never throws punches when drunk | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Jennifer | Jennifer | A premium, cinematic-quality American English female voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Ryan | Ryan | Full of rhythm, bursting with dramatic flair, balancing authenticity and tension | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Katerina | Katerina | A mature-woman voice with rich, memorable rhythm | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Aiden | Aiden | An American English young man skilled in cooking | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Mia | Mia | Gentle as spring water, obedient as fresh snow | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Mochi | Mochi | A clever, quick-witted young adult -- childlike innocence remains, yet wisdom shines through | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Bellona | Bellona | A powerful, clear voice that brings characters to life -- so stirring it makes your blood boil | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Vincent | Vincent | A uniquely raspy, smoky voice -- just one line evokes armies and heroic tales | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Bunny | Bunny | A little girl overflowing with cuteness | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Neil | Neil | A flat baseline intonation with precise, clear pronunciation -- the most professional news anchor | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Elias | Elias | Maintains academic rigor while using storytelling techniques to turn complex knowledge into digestible learning modules | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Arthur | Arthur | A simple, earthy voice steeped in time and tobacco smoke -- slowly unfolding village stories and curiosities | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Nini | Nini | A soft, clingy voice like sweet rice cakes | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Ebona | Ebona | Her whisper is like a rusty key slowly turning in the darkest corner of your mind | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Seren | Seren | A gentle, soothing voice to help you fall asleep faster. Good night, sweet dreams | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Pip | Pip | A playful, mischievous boy full of childlike wonder | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Stella | Stella | Normally a cloyingly sweet, dazed teenage-girl voice -- but when shouting battle cries, she instantly radiates unwavering love and justice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Bodega | Bodega | A passionate Spanish man | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Sonrisa | Sonrisa | A cheerful, outgoing Latin American woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Alek | Alek | Cold like the Russian spirit, yet warm like wool coat lining | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Dolce | Dolce | A laid-back Italian man | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Sohee | Sohee | A warm, cheerful, emotionally expressive Korean unnie | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Lenn | Lenn | Rational at heart, rebellious in detail -- a German youth who wears suits and listens to post-punk | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Emilien | Emilien | A romantic French big brother | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Andre | Andre | A magnetic, natural, and steady male voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Shanghai - Jada | Jada | A fast-paced, energetic Shanghai auntie | Shanghainese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Beijing - Dylan | Dylan | A young man raised in Beijing's hutongs | Beijing dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Sichuan - Sunny | Sunny | A Sichuan girl sweet enough to melt your heart | Sichuan dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Nanjing - Li | Li | A patient yoga teacher | Nanjing dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Shaanxi - Marcus | Marcus | A man with a broad face, few words, a sincere heart, and deep roots | Shaanxi dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Southern Min - Roy | Roy | A humorous, straightforward, lively Taiwanese man | Southern Min, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Tianjin - Peter | Peter | A Tianjin-style crosstalk performer and professional food critic | Tianjin dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Cantonese - Rocky | Rocky | A humorous, witty man providing live commentary | Cantonese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Cantonese - Kiki | Kiki | A sweet Hong Kong girl best friend | Cantonese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Sichuan - Eric | Eric | A Sichuanese man from Chengdu who stands out in every crowd | Sichuan dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
qwen3-omni-flash and qwen3-omni-flash-2025-09-15
| Voice name | voice parameter | Description | Supported languages |
|---|---|---|---|
| Cherry | Cherry | A sunny, positive, friendly, and natural young woman | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Ethan | Ethan | Standard Mandarin with a slight northern accent. Sunny, warm, energetic, and vibrant | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Nofish | Nofish | A designer who cannot pronounce retroflex sounds | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Jennifer | Jennifer | A premium, cinematic-quality American English female voice | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Ryan | Ryan | Full of rhythm, bursting with dramatic flair, balancing authenticity and tension | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Katerina | Katerina | A mature-woman voice with rich, memorable rhythm | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Elias | Elias | Maintains academic rigor while using storytelling techniques to turn complex knowledge into digestible learning modules | Chinese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Shanghai - Jada | Jada | A fast-paced, energetic Shanghai auntie | Shanghainese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Beijing - Dylan | Dylan | A young man raised in Beijing's hutongs | Beijing dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Sichuan - Sunny | Sunny | A Sichuan girl sweet enough to melt your heart | Sichuan dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Nanjing - Li | Li | A patient yoga teacher | Nanjing dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Shaanxi - Marcus | Marcus | A man with a broad face, few words, a sincere heart, and deep roots | Shaanxi dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Southern Min - Roy | Roy | A humorous, straightforward, lively Taiwanese man | Southern Min, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Tianjin - Peter | Peter | A Tianjin-style crosstalk performer and professional food critic | Tianjin dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Cantonese - Rocky | Rocky | A humorous, witty man providing live commentary | Cantonese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Cantonese - Kiki | Kiki | A sweet Hong Kong girl best friend | Cantonese, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
| Sichuan - Eric | Eric | A Sichuanese man from Chengdu who stands out in every crowd | Sichuan dialect, English, French, German, Russian, Italian, Spanish, Portuguese, Japanese, Korean |
Qwen-Omni-Turbo
| Voice name | voice parameter | Description | Supported languages |
|---|---|---|---|
| Cherry | Cherry | A sunny, positive, friendly, and natural young woman | Chinese, English |
| Serena | Serena | A gentle young woman | Chinese, English |
| Ethan | Ethan | Standard Mandarin with a slight northern accent. Sunny, warm, energetic, and vibrant | Chinese, English |
| Chelsie | Chelsie | A two-dimensional virtual girlfriend | Chinese, English |
Open-source Qwen-Omni models
| Voice name | voice parameter | Description | Supported languages |
|---|---|---|---|
| Ethan | Ethan | Standard Mandarin with a slight northern accent. Sunny, warm, energetic, and vibrant | Chinese, English |
| Chelsie | Chelsie | A two-dimensional virtual girlfriend | Chinese, English |