The WanAnimateToVideo node generates video content by combining multiple conditioning inputs including pose references, facial expressions, and background elements. It processes various video inputs to create coherent animated sequences while maintaining temporal consistency across frames. The node handles latent space operations and can extend existing videos by continuing motion patterns.
Inputs
| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| positive | CONDITIONING | Yes | - | Positive conditioning for guiding the generation towards desired content |
| negative | CONDITIONING | Yes | - | Negative conditioning for steering the generation away from unwanted content |
| vae | VAE | Yes | - | VAE model used for encoding and decoding image data |
| width | INT | No | 16 to MAX_RESOLUTION | Output video width in pixels (default: 832, step: 16) |
| height | INT | No | 16 to MAX_RESOLUTION | Output video height in pixels (default: 480, step: 16) |
| length | INT | No | 1 to MAX_RESOLUTION | Number of frames to generate (default: 77, step: 4) |
| batch_size | INT | No | 1 to 4096 | Number of videos to generate simultaneously (default: 1) |
| clip_vision_output | CLIP_VISION_OUTPUT | No | - | Optional CLIP vision model output for additional conditioning |
| reference_image | IMAGE | No | - | Reference image used as starting point for generation |
| face_video | IMAGE | No | - | Video input providing facial expression guidance |
| pose_video | IMAGE | No | - | Video input providing pose and motion guidance |
| continue_motion_max_frames | INT | No | 1 to MAX_RESOLUTION | Maximum number of frames to continue from previous motion (default: 5, step: 4) |
| background_video | IMAGE | No | - | Background video to composite with generated content |
| character_mask | MASK | No | - | Mask defining character regions for selective processing |
| continue_motion | IMAGE | No | - | Previous motion sequence to continue from for temporal consistency |
| video_frame_offset | INT | No | 0 to MAX_RESOLUTION | The amount of frames to seek in all the input videos. Used for generating longer videos by chunk. Connect to the video_frame_offset output of the previous node for extending a video. (default: 0, step: 1) |
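
When driving the node from a script, the documented defaults and ranges can be gathered into a small pre-flight check. The following is only a sketch: `WanAnimateSettings` and `validate` are hypothetical helpers, not part of ComfyUI, and `max_resolution` is passed in by the caller because the table does not state the numeric value of `MAX_RESOLUTION`. Step alignment is deliberately not checked, since the table does not say how off-step values are handled.

```python
from dataclasses import dataclass


@dataclass
class WanAnimateSettings:
    """Hypothetical container for the documented integer inputs and their defaults."""
    width: int = 832
    height: int = 480
    length: int = 77
    batch_size: int = 1
    continue_motion_max_frames: int = 5
    video_frame_offset: int = 0


def validate(settings: WanAnimateSettings, max_resolution: int) -> list[str]:
    """Return a message for every value outside its documented range."""
    # (minimum, maximum) per field, copied from the Range column above.
    ranges = {
        "width": (16, max_resolution),
        "height": (16, max_resolution),
        "length": (1, max_resolution),
        "batch_size": (1, 4096),
        "continue_motion_max_frames": (1, max_resolution),
        "video_frame_offset": (0, max_resolution),
    }
    errors = []
    for name, (low, high) in ranges.items():
        value = getattr(settings, name)
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside {low}..{high}")
    return errors
```

Calling `validate(WanAnimateSettings(), max_resolution=...)` with the defaults returns an empty list, while a width of 8 would be reported as out of range.
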
- When `pose_video` is provided and `trim_to_pose_video` logic is active, the output length will be adjusted to match the pose video duration
- `face_video` is automatically resized to 512x512 resolution when processed
- `continue_motion` frames are limited by `continue_motion_max_frames` parameter
- Input videos (`face_video`, `pose_video`, `background_video`, `character_mask`) are offset by `video_frame_offset` before processing
- If `character_mask` contains only one frame, it will be repeated across all frames
- When `clip_vision_output` is provided, it’s applied to both positive and negative conditioning
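
Three of the behaviors above (frame offsetting, single-frame mask repetition, and the cap on `continue_motion` frames) can be illustrated with plain Python lists standing in for frame tensors. This is a rough sketch rather than the node's actual code: the helper names are hypothetical, capping the sought frames at `length` is an assumption, and so is the choice to keep the most recent `continue_motion` frames.

```python
def seek(frames, offset, length):
    """Skip `offset` frames, then keep at most `length` frames (cap is assumed)."""
    return frames[offset:offset + length]


def expand_mask(mask_frames, length):
    """Repeat a single-frame mask so that it covers `length` frames."""
    if len(mask_frames) == 1:
        return mask_frames * length
    return mask_frames


def limit_continue_motion(frames, max_frames):
    """Keep at most `max_frames` frames (assumed: the most recent ones)."""
    return frames[-max_frames:]


pose_video = [f"pose_{i}" for i in range(10)]
print(seek(pose_video, offset=4, length=3))   # ['pose_4', 'pose_5', 'pose_6']
print(expand_mask(["mask_0"], length=3))      # ['mask_0', 'mask_0', 'mask_0']
print(limit_continue_motion(pose_video, 5))   # the last five pose frames
```
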
Outputs
| Output Name | Data Type | Description |
|---|---|---|
| positive | CONDITIONING | Modified positive conditioning with additional video context |
| negative | CONDITIONING | Modified negative conditioning with additional video context |
| latent | LATENT | Generated video content in latent space format |
| trim_latent | INT | Latent space trimming information for downstream processing |
| trim_image | INT | Image space trimming information for reference motion frames |
| video_frame_offset | INT | Updated frame offset for continuing video generation in chunks |
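
To picture how the `video_frame_offset` output feeds the next node's `video_frame_offset` input when generating a long video in chunks, the sketch below plans the offsets for a sequence of chained nodes. `plan_chunks` is a hypothetical helper, and it assumes each node advances the offset by exactly the number of frames it generated; the real node may compute the updated offset differently, for example when `continue_motion` frames overlap between chunks.

```python
def plan_chunks(total_frames, length):
    """Return (video_frame_offset, frame_count) pairs covering `total_frames`."""
    if length < 1:
        raise ValueError("length must be at least 1")
    chunks = []
    offset = 0
    while offset < total_frames:
        count = min(length, total_frames - offset)
        chunks.append((offset, count))
        # Assumption: the offset output equals the input offset plus the
        # number of frames produced by this chunk.
        offset += count
    return chunks


# A 200-frame driving video split across nodes using the default length of 77.
print(plan_chunks(200, 77))  # [(0, 77), (77, 77), (154, 46)]
```
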