
Text-to-video

Generate video from text

The Wan text-to-video model accepts multimodal input (text and audio) and generates videos up to 15 seconds long at resolutions up to 1080P.
  • Core capabilities: Supports integer video durations (2-15 seconds), custom video resolutions (480P, 720P, or 1080P), prompt rewriting, and watermarking.
  • Audio capabilities: Supports automatic dubbing or custom audio files for synchronized audio and video. (Supported by wan2.5 and wan2.6)
  • Multi-shot narrative: Generates videos with multiple shots while keeping the main subject consistent across shot transitions. (Supported only by wan2.6)
Quick access: Try it online | API reference | Prompt guide

Getting started

Input prompt (the output is a multi-shot, audio-enabled video):
A thrilling detective chase story with cinematic storytelling. Shot 1 [0-3 s]: Wide shot of a rainy New York street at night, neon lights flickering, a detective in a black trench coat walking briskly. Shot 2 [3-6 s]: Medium shot of the detective entering an old building, rain soaking his coat, the door closing slowly behind him. Shot 3 [6-9 s]: Close-up of the detective's focused, determined eyes as distant sirens wail and he frowns slightly in thought. Shot 4 [9-12 s]: Medium shot of the detective moving carefully down a dim hallway, his flashlight illuminating the path ahead. Shot 5 [12-15 s]: Close-up of the detective discovering a key clue, his face lighting up with sudden realization.
Before calling the API, get an API key. Then set your API key as an environment variable. To use the SDK, install the DashScope SDK.
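As a small illustration of the environment-variable step, the sketch below reads the key and fails early when it is missing. The helper name require_api_key is hypothetical and not part of the DashScope SDK. Only the variable name DASHSCOPE_API_KEY comes from this page.

```python
import os


def require_api_key(env=os.environ, name="DASHSCOPE_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of the DashScope SDK.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before calling the API")
    return key
```

Failing early gives a clearer error than sending a request with an empty or placeholder key.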
Python SDK
Make sure your DashScope Python SDK version is at least 1.25.8 before running the code below. If your version is too low, you may see errors such as "url error, please check url!". To upgrade, see Install the SDK.
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import dashscope

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'
api_key = os.getenv("DASHSCOPE_API_KEY", "YOUR_API_KEY")

print('please wait...')
rsp = VideoSynthesis.call(api_key=api_key,
                            model='wan2.6-t2v',
                            prompt='A thrilling detective chase story with cinematic storytelling. Shot 1 [0–3 s]: Wide shot of a rainy New York street at night, neon lights flickering, a detective in a black trench coat walking briskly. Shot 2 [3–6 s]: Medium shot of the detective entering an old building, rain soaking his coat, the door closing slowly behind him. Shot 3 [6–9 s]: Close-up of the detective\'s focused, determined eyes as distant sirens wail and he frowns slightly in thought. Shot 4 [9–12 s]: Medium shot of the detective moving carefully down a dim hallway, his flashlight illuminating the path ahead. Shot 5 [12–15 s]: Close-up of the detective discovering a key clue, his face lighting up with sudden realization.',
                            size="1280*720",
                            duration=15,
                            shot_type="multi",
                            prompt_extend=True,
                            watermark=True)
print(rsp)
if rsp.status_code == HTTPStatus.OK:
  print("video_url:", rsp.output.video_url)
else:
  print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))
Sample output
video_url expires after 24 hours. Download the video promptly.
{
  "request_id": "c1209113-8437-424f-a386-xxxxxx",
  "output": {
    "task_id": "966cebcd-dedc-4962-af88-xxxxxx",
    "task_status": "SUCCEEDED",
    "video_url": "https://dashscope-result-sh.oss-accelerate.aliyuncs.com/xxx.mp4?Expires=xxx",
         ...
  },
  ...
}
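Because video_url expires after 24 hours, it can help to save the file right after the task succeeds. The following is a minimal sketch using only the Python standard library. The function name download_video and the destination path are assumptions for illustration, not part of the DashScope SDK.

```python
import shutil
import urllib.request
from pathlib import Path


def download_video(video_url, dest_path, timeout=60):
    """Stream the file at video_url to dest_path and return the local path.

    Hypothetical helper for illustration; it is not part of the DashScope SDK.
    """
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so the whole video is never held in memory at once.
    with urllib.request.urlopen(video_url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

With the sample above, a call such as download_video(rsp.output.video_url, "outputs/detective.mp4") would store the result locally.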

Availability

Supported models:
| Model | Features | Input modalities | Output video specifications |
| --- | --- | --- | --- |
| wan2.6-t2v (Recommended) | Video with audio. Multi-shot narrative, audio-video sync | Text, audio | Resolution options: 720P, 1080P. Video duration: 2 to 15 s (integer). Defined specifications: 30 fps, MP4 (H.264 encoding) |
| wan2.5-t2v-preview (Recommended) | Video with audio. Audio-video sync | Text, audio | Resolution options: 480P, 720P, 1080P. Video duration: 5 s, 10 s. Defined specifications: 30 fps, MP4 (H.264 encoding) |
| wan2.2-t2v-plus | Video without audio | Text | Resolution options: 480P, 1080P. Video duration: 5 s. Defined specifications: 30 fps, MP4 (H.264 encoding) |
| wan2.1-t2v-turbo | Video without audio | Text | Resolution options: 480P, 720P. Video duration: 5 s. Defined specifications: 30 fps, MP4 (H.264 encoding) |
| wan2.1-t2v-plus | Video without audio | Text | Resolution options: 720P. Video duration: 5 s. Defined specifications: 30 fps, MP4 (H.264 encoding) |
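The supported-model list can be turned into a client-side check that catches an unsupported resolution or duration before a request is sent. The sketch below copies the values listed above into a dictionary. The names MODEL_SPECS and check_request are hypothetical, and the helper takes a resolution tier label such as "720P" rather than the size string the API expects.

```python
# Values copied from the supported-model list on this page.
MODEL_SPECS = {
    "wan2.6-t2v": {"resolutions": {"720P", "1080P"}, "durations": set(range(2, 16))},
    "wan2.5-t2v-preview": {"resolutions": {"480P", "720P", "1080P"}, "durations": {5, 10}},
    "wan2.2-t2v-plus": {"resolutions": {"480P", "1080P"}, "durations": {5}},
    "wan2.1-t2v-turbo": {"resolutions": {"480P", "720P"}, "durations": {5}},
    "wan2.1-t2v-plus": {"resolutions": {"720P"}, "durations": {5}},
}


def check_request(model, resolution, duration):
    """Return a list of problems; an empty list means the request fits the list above."""
    spec = MODEL_SPECS.get(model)
    if spec is None:
        return [f"unknown model: {model}"]
    problems = []
    if resolution not in spec["resolutions"]:
        problems.append(f"{model} does not support {resolution}")
    if duration not in spec["durations"]:
        problems.append(f"{model} does not support a {duration} s duration")
    return problems
```

For example, check_request("wan2.2-t2v-plus", "720P", 10) reports two problems, because that model is listed with 480P and 1080P only and a fixed 5-second duration.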

Core capabilities

Create multi-shot videos

Supported models: wan2.6 series.
Description: The model automatically switches between shots (for example, from a wide shot to a close-up), which suits music videos and similar use cases.
Parameters:
  • shot_type: Set to "multi".
  • prompt_extend: Set to true (enables prompt rewriting to optimize shot descriptions).
Input prompt (the output is a multi-shot video):
A vision of harmony between future technology and nature. Shot 1 [0-2 s]: Wide shot of an aerial garden in a futuristic city, floating plants swaying gently in the breeze. Shot 2 [2-4 s]: A robot gardener carefully trims plants with precise, graceful movements. Shot 3 [4-7 s]: Sunlight streams through a transparent dome, illuminating the entire garden and showcasing perfect fusion of technology and nature. Shot 4 [7-10 s]: The camera pulls back to reveal the grand scale of the entire futuristic city, with the aerial garden just one part of it.
Python SDK
Make sure your DashScope Python SDK version is at least 1.25.8. To upgrade, see Install the SDK.
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import dashscope

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# If you have not set an environment variable, replace the line below with: api_key="sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")

def sample_async_call_t2v():
  # Asynchronous call returns a task_id
  rsp = VideoSynthesis.async_call(api_key=api_key,
                  model='wan2.6-t2v',
                  prompt='A vision of harmony between future technology and nature. Shot 1 [0–2 s]: Wide shot of an aerial garden in a futuristic city, floating plants swaying gently in the breeze. Shot 2 [2–4 s]: A robot gardener carefully trims plants with precise, graceful movements. Shot 3 [4–7 s]: Sunlight streams through a transparent dome, illuminating the entire garden and showcasing perfect fusion of technology and nature. Shot 4 [7–10 s]: The camera pulls back to reveal the grand scale of the entire futuristic city, with the aerial garden just one part of it.',
                  size='1280*720',
                  shot_type="multi",  # Multi-shot
                  duration=10,
                  prompt_extend=True,
                  watermark=True,
                  negative_prompt="",
                  seed=12345)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print("task_id: %s" % rsp.output.task_id)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))
    return  # No task was created, so there is nothing to wait for

  # Wait for asynchronous task to complete
  rsp = VideoSynthesis.wait(task=rsp, api_key=api_key)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print(rsp.output.video_url)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
  sample_async_call_t2v()
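The shot-by-shot prompt in the example above follows a "Shot N [start-end s]: description" pattern. The sketch below assembles such a prompt from a list of shot lengths so that the time ranges stay contiguous. The function name build_multishot_prompt is hypothetical, and the pattern is taken from the example prompts on this page rather than from a documented prompt grammar.

```python
def build_multishot_prompt(summary, shots):
    """Build a multi-shot prompt and return (prompt, total_duration_in_seconds).

    shots is a list of (seconds, description) pairs in playback order.
    """
    parts = [summary.strip()]
    start = 0
    for index, (seconds, description) in enumerate(shots, start=1):
        if seconds <= 0:
            raise ValueError("each shot must last at least one second")
        end = start + seconds
        parts.append(f"Shot {index} [{start}-{end} s]: {description.strip()}")
        start = end
    return " ".join(parts), start
```

The returned total can be passed as duration, which for wan2.6-t2v must be an integer from 2 to 15.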

Synchronize audio and video

Supported models: wan2.5 and wan2.6 series.
Description: Make characters in the generated video speak or sing, with mouth movements matching the audio. For more examples, see Video audio generation.
Parameters:
  • Provide an audio file: Pass an audio_url. The model aligns mouth movement to the audio.
  • Automatic dubbing: Audio-enabled video is generated by default. Do not pass audio_url. The model auto-generates background sound effects, music, or voice based on the scene.
Input example (the output is an audio-enabled video):
Input prompt: Shot from a low angle, in a medium close-up, with warm tones, mixed lighting (the practical light from the desk lamp blends with the overcast light from the window), side lighting, and a central composition. In a classic detective office, wooden bookshelves are filled with old case files and ashtrays. A green desk lamp illuminates a case file spread out in the center of the desk. A fox, wearing a dark brown trench coat and a light gray fedora, sits in a leather chair, its fur crimson, its tail resting lightly on the edge, its fingers slowly turning yellowed pages. Outside, a steady drizzle falls beneath a blue sky, streaking the glass with meandering streaks. It slowly raises its head, its ears twitching slightly, its amber eyes gazing directly at the camera, its mouth clearly moving as it speaks in a smooth, cynical voice: 'The case was cold, colder than a fish in winter. But every chicken has its secrets, and I, for one, intended to find them '.
Input audio: the MP3 file passed as audio_url in the code below.
Python SDK
Make sure your DashScope Python SDK version is at least 1.25.8. To upgrade, see Install the SDK.
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import dashscope

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# If you have not set an environment variable, replace the line below with: api_key="sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")

def sample_async_call_t2v():
  # Asynchronous call returns a task_id
  rsp = VideoSynthesis.async_call(api_key=api_key,
                  model='wan2.6-t2v',
                  prompt="Shot from a low angle, in a medium close-up, with warm tones, mixed lighting (the practical light from the desk lamp blends with the overcast light from the window), side lighting, and a central composition. In a classic detective office, wooden bookshelves are filled with old case files and ashtrays. A green desk lamp illuminates a case file spread out in the center of the desk. A fox, wearing a dark brown trench coat and a light gray fedora, sits in a leather chair, its fur crimson, its tail resting lightly on the edge, its fingers slowly turning yellowed pages. Outside, a steady drizzle falls beneath a blue sky, streaking the glass with meandering streaks. It slowly raises its head, its ears twitching slightly, its amber eyes gazing directly at the camera, its mouth clearly moving as it speaks in a smooth, cynical voice: 'The case was cold, colder than a fish in winter. But every chicken has its secrets, and I, for one, intended to find them '.",
                  audio_url='https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250929/stjqnq/%E7%8B%90%E7%8B%B8.mp3',
                  size='1280*720',
                  duration=10,
                  shot_type="multi",  # Multi-shot
                  prompt_extend=True,
                  watermark=True,
                  negative_prompt="",
                  seed=12345)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print("task_id: %s" % rsp.output.task_id)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))
    return  # No task was created, so there is nothing to wait for

  # Wait for asynchronous task to complete
  rsp = VideoSynthesis.wait(task=rsp, api_key=api_key)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print(rsp.output.video_url)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
  sample_async_call_t2v()
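The two audio modes differ only in whether audio_url is passed. The sketch below builds the keyword arguments for either mode. The helper name build_t2v_kwargs and its default values are assumptions for illustration; the parameter names match those used in the example above.

```python
def build_t2v_kwargs(prompt, size="1280*720", duration=10, audio_url=None):
    """Return keyword arguments for a wan2.6-t2v call.

    With audio_url set, the model aligns the video to that audio file.
    Without it, the key is omitted so the model generates audio on its own.
    """
    kwargs = {
        "model": "wan2.6-t2v",
        "prompt": prompt,
        "size": size,
        "duration": duration,
    }
    if audio_url:
        kwargs["audio_url"] = audio_url
    return kwargs
```

The result can then be unpacked into the call, for example VideoSynthesis.async_call(api_key=api_key, **kwargs).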

Generate silent videos

Supported models: wan2.2 series, wan2.1 series.
Description: Ideal for visual-only use cases like animated posters or silent short videos.
Parameters: Silent video is the default output for wan2.2 and earlier versions. No extra configuration is needed.
Input prompt (the output is a silent video):
Low contrast. A street musician performs in a retro 1970s-style subway station, bathed in dim colors and rough textures. He wears a vintage jacket and plays guitar with intense focus. Commuters rush past. A small crowd gradually gathers to listen. The camera pans slowly right, capturing the interplay of instrument sounds and city noise, with vintage subway signs and peeling walls in the background.
Python SDK
Make sure your DashScope Python SDK version is at least 1.25.8. To upgrade, see Install the SDK.
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import dashscope

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# If you have not set an environment variable, replace the line below with: api_key="sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")

def sample_async_call_t2v():
  # Asynchronous call returns a task_id
  rsp = VideoSynthesis.async_call(api_key=api_key,
                  model='wan2.2-t2v-plus',
                  prompt='Low contrast. A street musician performs in a retro 1970s-style subway station, bathed in dim colors and rough textures. He wears a vintage jacket and plays guitar with intense focus. Commuters rush past. A small crowd gradually gathers to listen. The camera pans slowly right, capturing the interplay of instrument sounds and city noise, with vintage subway signs and peeling walls in the background.',
                  prompt_extend=True,
                  size='832*480',
                  negative_prompt="",
                  watermark=True,
                  seed=12345)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print("task_id: %s" % rsp.output.task_id)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))
    return  # No task was created, so there is nothing to wait for

  # Wait for asynchronous task to complete
  rsp = VideoSynthesis.wait(task=rsp, api_key=api_key)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print(rsp.output.video_url)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' % (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
  sample_async_call_t2v()

Input audio

  • Number of files: One.
  • Input methods:
    • Public URL: Supports HTTP or HTTPS protocols.
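A quick local check can reject audio inputs that are not HTTP or HTTPS URLs before a request is made. This is a minimal sketch with a hypothetical helper name. It checks only the URL form and does not confirm that the file is publicly reachable.

```python
from urllib.parse import urlparse


def is_http_url(url):
    """Return True if url uses the http or https scheme and names a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Local paths and file:// URLs fail this check, which matches the requirement that the audio be supplied as a public URL.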

Output video

  • Number of files: One.
  • Format: MP4. See Video specifications for details.
  • URL expiration: 24 hours.
  • Dimensions: Determined by the size parameter. For example, when size is set to 1280*720, the output video has a 16:9 aspect ratio.

Billing and rate limits

  • For free quota and pricing details, see Model invocation pricing.
  • For model rate limits, see Rate limits.
  • Billing details:
    • Input is free. Output is billed per successfully generated second of video.
    • Failed model calls or processing errors incur no charge and do not consume your free quota.
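The billing rule above (per second of successfully generated video, nothing for failures) can be sketched as a small estimator. The per-second price is a placeholder you must supply from Model invocation pricing. The function name is hypothetical, and the sketch ignores any free quota.

```python
from decimal import Decimal


def estimate_cost(duration_seconds, price_per_second, succeeded=True):
    """Estimate the charge for one call.

    price_per_second is a caller-supplied placeholder, not a real price.
    Failed calls cost nothing.
    """
    if not succeeded:
        return Decimal("0")
    # Decimal avoids binary floating-point rounding in currency arithmetic.
    return Decimal(duration_seconds) * Decimal(str(price_per_second))
```

For instance, a 10-second video at a placeholder price of 0.10 per second comes to 1.00, while the same call failing comes to 0.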

API reference

Text-to-video API reference

FAQ

How do I set the video aspect ratio (for example, 16:9)?

Use the size parameter to specify the video resolution. The system calculates the aspect ratio automatically from that resolution. For example, setting size=1280*720 outputs a 16:9 video. Each size maps to a fixed aspect ratio. Choose the resolution that matches your target ratio.
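To see which aspect ratio a given size value yields, reduce the width and height by their greatest common divisor. The helper below is an illustrative sketch, not part of the SDK. It assumes the "width*height" string format used in the examples on this page.

```python
from math import gcd


def aspect_ratio(size):
    """Convert a size string such as '1280*720' into a ratio string such as '16:9'."""
    width_text, height_text = size.split("*")
    width, height = int(width_text), int(height_text)
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"
```

For example, aspect_ratio("1280*720") and aspect_ratio("1920*1080") both give "16:9".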

SDK error: "url error, please check url!"

Make sure:
  • Your DashScope Python SDK version is at least 1.25.8.
  • Your DashScope Java SDK version is at least 2.22.6.
If your version is too low, you may see the "url error, please check url!" error. Upgrade the SDK.
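One way to check the installed Python SDK version from code is to read the package metadata and compare it with the minimum. The sketch below uses only the standard library. The helper names are hypothetical, and the comparison looks only at the leading digits of each dot-separated part, so pre-release suffixes are not ordered correctly.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '1.25.8' into (1, 25, 8), keeping only each part's leading digits."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first part with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Return True if the installed version is greater than or equal to minimum."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.0' compares as (2, 0, 0).
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_dashscope_version():
    """Return the installed dashscope version string, or None if it is not installed."""
    try:
        return metadata.version("dashscope")
    except metadata.PackageNotFoundError:
        return None
```

A caller could combine the two, for example by warning when installed_dashscope_version() is None or is_at_least(version, "1.25.8") is False.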

Why does the call fail with "Model not exist"?

Check that the model name is spelled correctly. For a list of supported models, see Supported models.
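A misspelled model name can be caught locally by comparing it against the names listed under Supported models. The sketch below uses difflib from the standard library to suggest close matches. The function name and the similarity cutoff are assumptions for illustration.

```python
import difflib

# Model names copied from the Supported models section of this page.
SUPPORTED_MODELS = [
    "wan2.6-t2v",
    "wan2.5-t2v-preview",
    "wan2.2-t2v-plus",
    "wan2.1-t2v-turbo",
    "wan2.1-t2v-plus",
]


def suggest_models(name, limit=3):
    """Return close matches for a misspelled name, or [] if the name is already valid."""
    if name in SUPPORTED_MODELS:
        return []
    return difflib.get_close_matches(name, SUPPORTED_MODELS, n=limit, cutoff=0.6)
```

An empty result for a name that is not in the list means no listed model is similar enough, so the name is probably wrong in more than a character or two.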