This documentation was AI-generated.
The WanMoveTrackToVideo node prepares conditioning and latent space data for video generation, incorporating optional motion tracking information. It encodes a starting image sequence into a latent representation and can blend in positional data from object tracks to guide the motion in the generated video. The node outputs modified positive and negative conditioning along with an empty latent tensor ready for a video model.

Inputs

| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| positive | CONDITIONING | Yes | - | The positive conditioning input to be modified. |
| negative | CONDITIONING | Yes | - | The negative conditioning input to be modified. |
| vae | VAE | Yes | - | The VAE model used to encode the starting image into the latent space. |
| tracks | TRACKS | No | - | Optional motion tracking data containing object paths. |
| strength | FLOAT | No | 0.0 - 100.0 | Strength of the track conditioning. (default: 1.0) |
| width | INT | No | 16 - MAX_RESOLUTION | The width of the output video. Must be divisible by 16. (default: 832) |
| height | INT | No | 16 - MAX_RESOLUTION | The height of the output video. Must be divisible by 16. (default: 480) |
| length | INT | No | 1 - MAX_RESOLUTION | The number of frames in the video sequence. (default: 81) |
| batch_size | INT | No | 1 - 4096 | The batch size for the latent output. (default: 1) |
| start_image | IMAGE | Yes | - | The starting image or image sequence to encode. |
| clip_vision_output | CLIPVISIONOUTPUT | No | - | Optional CLIP vision model output to add to the conditioning. |

Note: The strength parameter only has an effect when tracks are provided. If tracks are omitted or strength is 0.0, no track conditioning is applied. The start_image is used to create the latent image and mask for the conditioning; if it is not provided, the node passes the conditioning through unchanged and outputs an empty latent.
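
The rules in the note can be sketched in plain Python. This is only an illustration of the documented behavior, not the node's source code: `plan_conditioning` and the keys it returns are hypothetical names, and the arguments are stand-in objects rather than real tensors or track data.

```python
from typing import Any, Dict, Optional


def plan_conditioning(
    tracks: Optional[Any],
    strength: float,
    start_image: Optional[Any],
) -> Dict[str, bool]:
    """Report which optional steps would run, following the note above.

    Hypothetical helper: it mirrors the documented rules only and does not
    call into ComfyUI.
    """
    # Without a start image the conditioning is passed through as-is.
    encode_start_image = start_image is not None

    # Track conditioning needs track data and a non-zero strength. It is
    # assumed here that it also needs the encoded start image, since the
    # note says the node only passes conditioning through without one.
    apply_tracks = encode_start_image and tracks is not None and strength > 0.0

    return {
        "encode_start_image": encode_start_image,
        "apply_tracks": apply_tracks,
    }


if __name__ == "__main__":
    # Tracks supplied but strength is zero: tracks are ignored.
    print(plan_conditioning(tracks=object(), strength=0.0, start_image=object()))
```

In this sketch, setting strength to 0.0 and leaving tracks unconnected are treated as equivalent ways to disable motion guidance, which matches the wording of the note.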

Outputs

| Output Name | Data Type | Description |
|---|---|---|
| positive | CONDITIONING | The modified positive conditioning, potentially containing concat_latent_image, concat_mask, and clip_vision_output. |
| negative | CONDITIONING | The modified negative conditioning, potentially containing concat_latent_image, concat_mask, and clip_vision_output. |
| latent | LATENT | An empty latent tensor with dimensions shaped by the batch_size, length, height, and width inputs. |
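
To give a feel for how the four inputs might shape the latent output, here is a minimal sketch. The compression factors are assumptions that this page does not state: 16 latent channels, 8x spatial downsampling, and 4x temporal downsampling, which are values commonly associated with Wan-style video VAEs. The function name `empty_latent_shape` is hypothetical.

```python
from typing import Tuple


def empty_latent_shape(
    batch_size: int = 1,
    length: int = 81,
    height: int = 480,
    width: int = 832,
    channels: int = 16,        # assumed latent channel count
    spatial_factor: int = 8,   # assumed spatial downsampling
    temporal_factor: int = 4,  # assumed temporal downsampling
) -> Tuple[int, int, int, int, int]:
    """Return an assumed (batch, channels, frames, height, width) latent shape."""
    if batch_size < 1 or length < 1:
        raise ValueError("batch_size and length must be at least 1")
    # The inputs table requires width and height to be divisible by 16.
    if width % 16 != 0 or height % 16 != 0:
        raise ValueError("width and height must be divisible by 16")

    # Assumed temporal layout: the first frame maps to one latent frame and
    # each further group of `temporal_factor` frames adds one more.
    latent_frames = (length - 1) // temporal_factor + 1
    return (
        batch_size,
        channels,
        latent_frames,
        height // spatial_factor,
        width // spatial_factor,
    )


if __name__ == "__main__":
    # Defaults from the inputs table: 832x480, 81 frames, batch size 1.
    print(empty_latent_shape())  # (1, 16, 21, 60, 104)
```

Under these assumptions, frame counts of the form 4n + 1 (such as the default 81) map cleanly onto latent frames; other lengths are rounded down by the integer division.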