The Wan text-to-video model accepts text, images, and audio as input and generates videos up to 15 seconds long at 1080P resolution.
- Core capabilities: Integer video durations (2–15 seconds), custom resolutions (480P, 720P, 1080P), prompt rewriting, and watermarking.
- Audio capabilities: Automatic dubbing or custom audio files for audio-video sync (supported by wan2.5 and wan2.6).
- Multi-shot narrative: Multiple shots with a consistent main subject across transitions (supported only by wan2.6).
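The duration and resolution limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `check_video_options` and the returned keys `duration` and `resolution` are hypothetical names chosen for illustration, and the actual request field names may differ.

```python
# Resolutions listed in the capability summary above.
ALLOWED_RESOLUTIONS = {"480P", "720P", "1080P"}


def check_video_options(duration, resolution):
    """Validate a duration/resolution pair against the documented limits.

    Hypothetical helper: the returned key names are illustrative only.
    """
    # Durations must be whole seconds; bool is rejected even though it subclasses int.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer number of seconds")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be between 2 and 15 seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be one of 480P, 720P, 1080P")
    return {"duration": duration, "resolution": resolution}
```

Rejecting invalid values locally avoids submitting an asynchronous task that would fail validation on the server.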
Authorizations
string, header, required
DashScope API Key. Get one from the Qwen Cloud console.
Header Parameters
enum<string>, required
Must be set to enable for asynchronous task submission.
Allowed value: enable
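As an illustration of how the two required headers might be assembled, here is a minimal sketch. The header names are not given in this section: `Authorization` with a `Bearer` prefix and `X-DashScope-Async` are assumptions, as is the helper name `build_headers`. Only the value `enable` comes from the text above.

```python
def build_headers(api_key):
    """Build request headers for an asynchronous task submission.

    Header names here are assumptions, not taken from this section.
    """
    if not api_key:
        raise ValueError("a DashScope API key is required")
    return {
        # Assumed name and scheme for the API key header.
        "Authorization": f"Bearer {api_key}",
        # Assumed name for the async header; "enable" is the only allowed value.
        "X-DashScope-Async": "enable",
        "Content-Type": "application/json",
    }
```

In practice the key would typically be read from an environment variable or a secrets store rather than hard-coded in source.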