Wan — Animate from first frame

POST /services/aigc/video-generation/video-synthesis

import base64
import mimetypes
import os
from http import HTTPStatus

import dashscope
from dashscope import VideoSynthesis

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


# If you haven't configured the environment variable, replace the next line with your API key: api_key="sk-xxx"
api_key = os.getenv("DASHSCOPE_API_KEY")

# --- Helper function: For Base64 encoding ---
# Format: data:{MIME_type};base64,{base64_data}
def encode_file(file_path):
  mime_type, _ = mimetypes.guess_type(file_path)
  if not mime_type or not mime_type.startswith("image/"):
    raise ValueError("Unsupported or unrecognized image format")
  with open(file_path, "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
  return f"data:{mime_type};base64,{encoded_string}"

"""
Image input methods:
Choose one of the following three methods:

1. Use a public URL - Suitable for publicly accessible images
2. Use a local file - Suitable for local development and testing
3. Use Base64 encoding - Suitable for private images or scenarios requiring encrypted transmission
"""

# [Method 1] Use a publicly accessible image URL
# Example: Use a public image URL
img_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"

# [Method 2] Use a local file (supports absolute and relative paths)
# Format requirement: file:// + file path
# Example (absolute path):
# img_url = "file://" + "/path/to/your/img.png"    # Linux/macOS
# img_url = "file://" + "/C:/path/to/your/img.png"  # Windows
# Example (relative path):
# img_url = "file://" + "./img.png"                # Relative to the current executable file's path

# [Method 3] Use a Base64-encoded image
# img_url = encode_file("./img.png")

# Set audio URL
audio_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"

def sample_call_i2v():
  # Synchronous SDK call: blocks until the task finishes, then returns the result
  print('please wait...')
  rsp = VideoSynthesis.call(api_key=api_key,
                              model='wan2.6-i2v-flash',
                              prompt='A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.',
                              img_url=img_url,
                              audio_url=audio_url,
                              resolution="720P",
                              duration=10,
                              prompt_extend=True,
                              watermark=False,
                              negative_prompt="",
                              seed=12345)
  print(rsp)
  if rsp.status_code == HTTPStatus.OK:
    print("video_url:", rsp.output.video_url)
  else:
    print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
  sample_call_i2v()

{
  "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx",
  "output": {
    "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx",
    "task_status": "PENDING"
  }
}

Generate a video from a first-frame image and text prompt.

Authorizations

Authorization

string

header

required

DashScope API Key. Create one in the Qwen Cloud console.

Header Parameters

X-DashScope-Async

enum<string>

required

Must be set to "enable". The HTTP API supports asynchronous processing only; omitting this header returns a "current user api does not support synchronous calls" error.

Available options: enable
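As a rough illustration of the header requirement, the sketch below assembles (without sending) an asynchronous task-creation request using only the standard library. Several details are assumptions that this page does not state: the header name X-DashScope-Async, the Bearer form of the Authorization header, the input nesting of the body, and the base URL, which is borrowed from the SDK sample at the top.

```python
import json

# Assumption: base URL borrowed from the SDK sample; path from the endpoint line.
BASE_URL = "https://dashscope-intl.aliyuncs.com/api/v1"
PATH = "/services/aigc/video-generation/video-synthesis"


def build_async_request(api_key, model, img_url, prompt):
    """Return (url, headers, body_bytes) for an async task-creation call."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed Bearer scheme
        "Content-Type": "application/json",
        "X-DashScope-Async": "enable",  # assumed header name; value must be "enable"
    }
    body = {
        "model": model,
        "input": {"img_url": img_url, "prompt": prompt},  # assumed nesting
    }
    return BASE_URL + PATH, headers, json.dumps(body).encode("utf-8")
```

The returned tuple could then be handed to urllib.request.Request(url, data=body, headers=headers, method="POST") to send the call.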

Body

application/json

model

enum<string>

required

Model name.

Available options: wan2.6-i2v-flash, wan2.6-i2v, wan2.5-i2v-preview, wan2.2-i2v-flash, wan2.2-i2v-plus, wan2.1-i2v-turbo, wan2.1-i2v-plus

Example: wan2.6-i2v-flash

input

object

required

Input data including the first-frame image, prompt, and optional audio.

img_url

string

required

URL or Base64 string of the first-frame image.

Image constraints:

Format: JPEG, JPG, PNG (no alpha channel), BMP, WEBP.
Resolution: Width and height must be between 360 and 2,000 pixels.
File size: Up to 10 MB.

Supported input formats:

Public URL (HTTP/HTTPS supported).
Base64-encoded image: data:{MIME_type};base64,{base64_data}.

Example: https://cdn.translate.alibaba.com/r/wanx-demo-1.png
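The image constraints above can be checked locally before a request is made. The following is a minimal sketch, not an official validator: the function name image_problems is hypothetical, width and height are assumed to be already known (for example from an image library), "10 MB" is read as 10 MiB, and the PNG alpha-channel rule is not checked.

```python
import os

ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".webp"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
MIN_SIDE, MAX_SIDE = 360, 2000


def image_problems(filename, width, height, size_bytes):
    """Return a list of constraint violations; an empty list means OK."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    for name, side in (("width", width), ("height", height)):
        if not MIN_SIDE <= side <= MAX_SIDE:
            problems.append(f"{name} {side}px outside {MIN_SIDE}-{MAX_SIDE}px")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the 10 MB limit")
    return problems
```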

prompt

string

Text prompt describing the desired content and visual characteristics for the generated video. Both Chinese and English are supported. Length limits vary by model:

wan2.6 and wan2.5 models: up to 1,500 characters.
wan2.2 and wan2.1 models: up to 800 characters.

Text exceeding the limit is truncated automatically. For prompt tips, see the Text-to-video/image-to-video prompt guide.

Example: A cat running on the grass
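Because over-long prompts are truncated without an error, it may help to apply the limit on the client side where the cut is visible. The sketch below assumes that "characters" corresponds to Python string length, which the page does not confirm; prompt_limit and clip_prompt are hypothetical helper names.

```python
def prompt_limit(model):
    """Return the documented prompt length limit for a model name."""
    if model.startswith(("wan2.6", "wan2.5")):
        return 1500
    if model.startswith(("wan2.2", "wan2.1")):
        return 800
    raise ValueError(f"unknown model: {model}")


def clip_prompt(model, prompt):
    """Trim the prompt locally to the model's documented limit."""
    # Assumption: one Python character counts as one character for the service.
    return prompt[:prompt_limit(model)]
```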

negative_prompt

string

Negative prompt describing content to exclude from the video. Both Chinese and English are supported. Maximum 500 characters; excess is truncated automatically.

Example: low resolution, error, worst quality, low quality

Required range: length <= 500

audio_url

string

URL of an audio file. The model synchronizes video generation with this audio. Supported by wan2.6 and wan2.5 models only.

Audio constraints:

Format: wav, mp3.
Duration: 3–30 seconds.
File size: Up to 15 MB.
If the audio is longer than the video, it is truncated to the video duration. If it is shorter, the rest of the video is silent.

Example: https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3
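The truncation and silence rule can be made concrete with a small calculation. This is only an illustrative sketch of the arithmetic described above; audio_alignment is a hypothetical helper, and both durations are assumed to be known in seconds.

```python
def audio_alignment(audio_seconds, video_seconds):
    """Return (seconds of audio used, seconds of silent tail in the video)."""
    if not 3 <= audio_seconds <= 30:
        raise ValueError("audio duration must be 3-30 seconds")
    used = min(audio_seconds, video_seconds)          # longer audio is cut off
    silent_tail = max(video_seconds - audio_seconds, 0)  # shorter audio leaves silence
    return used, silent_tail
```

For instance, a 12-second clip paired with a 10-second video uses 10 seconds of audio, while a 4-second clip leaves the last 6 seconds silent.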

parameters

object

Video generation parameters.

resolution

enum<string>

Resolution tier of the generated video. The model scales the output to roughly the total pixel count of the selected tier and keeps the aspect ratio close to that of the input image. Resolution directly affects cost (1080P > 720P > 480P).

Default values and options by model:

wan2.6-i2v-flash: 720P, 1080P (default: 1080P)
wan2.6-i2v: 720P, 1080P (default: 1080P)
wan2.5-i2v-preview: 480P, 720P, 1080P (default: 1080P)
wan2.2-i2v-flash: 480P, 720P, 1080P (default: 720P)
wan2.2-i2v-plus: 480P, 1080P (default: 1080P)
wan2.1-i2v-turbo: 480P, 720P (default: 720P)
wan2.1-i2v-plus: 720P (fixed)

Available options: 480P, 720P, 1080P

Example: 720P
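Since the allowed tiers and defaults differ by model, a lookup table is one way to catch an unsupported combination before submitting a task. The sketch below copies the values from the list above; pick_resolution is a hypothetical helper, not part of the SDK.

```python
# (allowed tiers, default tier) per model, copied from the list above.
RESOLUTIONS = {
    "wan2.6-i2v-flash": (("720P", "1080P"), "1080P"),
    "wan2.6-i2v": (("720P", "1080P"), "1080P"),
    "wan2.5-i2v-preview": (("480P", "720P", "1080P"), "1080P"),
    "wan2.2-i2v-flash": (("480P", "720P", "1080P"), "720P"),
    "wan2.2-i2v-plus": (("480P", "1080P"), "1080P"),
    "wan2.1-i2v-turbo": (("480P", "720P"), "720P"),
    "wan2.1-i2v-plus": (("720P",), "720P"),
}


def pick_resolution(model, requested=None):
    """Return the requested tier if the model allows it, else the default."""
    allowed, default = RESOLUTIONS[model]
    if requested is None:
        return default
    if requested not in allowed:
        raise ValueError(f"{model} does not support {requested}; use one of {allowed}")
    return requested
```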

duration

integer

Duration of the generated video in seconds. Longer durations cost more (billed per second).

Valid values by model:

wan2.6-i2v-flash: integer 2–15 (default: 5)
wan2.6-i2v: integer 2–15 (default: 5)
wan2.5-i2v-preview: 5, 10 (default: 5)
wan2.2-i2v-plus: fixed 5 (not configurable)
wan2.2-i2v-flash: fixed 5 (not configurable)
wan2.1-i2v-plus: fixed 5 (not configurable)
wan2.1-i2v-turbo: 3, 4, 5 (default: 5)

Example: 5
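The per-model duration rules can be encoded in the same way. This is a sketch under the assumption that the listed values are exhaustive; is_valid_duration is a hypothetical helper name.

```python
# Valid durations in seconds per model, copied from the list above.
DURATIONS = {
    "wan2.6-i2v-flash": range(2, 16),  # integers 2-15 inclusive
    "wan2.6-i2v": range(2, 16),
    "wan2.5-i2v-preview": (5, 10),
    "wan2.2-i2v-plus": (5,),
    "wan2.2-i2v-flash": (5,),
    "wan2.1-i2v-plus": (5,),
    "wan2.1-i2v-turbo": (3, 4, 5),
}


def is_valid_duration(model, seconds):
    """Return True if `seconds` is an integer the model accepts."""
    return isinstance(seconds, int) and seconds in DURATIONS[model]
```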

prompt_extend

boolean

default: true

Whether to enable prompt rewriting. When enabled, an LLM rewrites the input prompt, which can improve generation quality for shorter prompts but increases processing time.

Example: true

shot_type

enum<string>

default: "single"

Whether the video uses a single continuous shot or multiple switching shots. Supported by wan2.6 models only. Takes effect only when prompt_extend is true. When specified, overrides shot-related descriptions in the prompt.

Available options: single, multi

Example: single

audio

boolean

default: true

Whether to generate a video with sound. Supported by wan2.6-i2v-flash only. The audio parameter takes priority over audio_url: if audio is false, the output is silent even when audio_url is provided. The audio setting affects pricing.

Example: true
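The priority between the sound switch and audio_url can be summarized as a small decision function. This sketch applies to wan2.6-i2v-flash only. The "generated" branch is an assumption: the page says the model can generate a video with sound but does not spell out what happens when no audio_url is given. audio_source is a hypothetical helper name.

```python
def audio_source(audio=True, audio_url=None):
    """Classify the soundtrack as 'silent', 'provided', or 'generated'."""
    if not audio:
        return "silent"  # audio=false wins even if audio_url is set
    if audio_url:
        return "provided"  # video is synchronized with the supplied file
    return "generated"  # assumption: the model produces its own sound
```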

watermark

boolean

default: false

Whether to add an "AI Generated" watermark in the lower-right corner.

Example: false

seed

integer

Random seed. Range: [0, 2147483647]. If omitted, a random seed is used. Fixing the seed makes results more consistent across runs but does not guarantee identical output.

Example: 12345

Required range: 0 <= x <= 2147483647

Response

200 - application/json

request_id

string

Unique request identifier for tracing and troubleshooting.

Example: 4909100c-7b5a-9f92-bfe5-xxxxxx

output

object

task_id

string

Task ID for polling status. Use with GET /tasks/{task_id}. Valid for 24 hours.

Example: 0385dc79-5ff8-4d82-bcb6-xxxxxx

task_status

enum<string>

Initial task status, typically PENDING.

Available options: PENDING, RUNNING, SUCCEEDED, FAILED, CANCELED, UNKNOWN

Example: PENDING
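Because the creation call only returns a task ID and an initial status, a client typically polls until the task leaves PENDING and RUNNING. The loop below is a hedged sketch: fetch_status is a hypothetical callable that wraps GET /tasks/{task_id} and returns the status string, the interval and poll count are arbitrary, and treating UNKNOWN as a terminal state is an assumption.

```python
import time

# Assumption: every status other than PENDING and RUNNING ends the wait.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def wait_for_task(fetch_status, task_id, interval=15.0, max_polls=80,
                  sleep=time.sleep):
    """Poll until the task reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)  # still PENDING or RUNNING; wait before retrying
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

Passing sleep as a parameter keeps the loop easy to exercise in tests without real delays.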