Skip to main content
This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute! Edit on GitHub
The Wan22ImageToVideoLatent node creates video latent representations from images. It generates a blank video latent space with specified dimensions and can optionally encode a starting image sequence into the beginning frames. When a start image is provided, it encodes the image into the latent space and creates a corresponding noise mask for the inpainted regions.

Inputs

ParameterData TypeRequiredRangeDescription
vaeVAEYes-The VAE model used for encoding images into latent space
widthINTNo32 to MAX_RESOLUTIONThe width of the output video in pixels (default: 1280, step: 32)
heightINTNo32 to MAX_RESOLUTIONThe height of the output video in pixels (default: 704, step: 32)
lengthINTNo1 to MAX_RESOLUTIONThe number of frames in the video sequence (default: 49, step: 4)
batch_sizeINTNo1 to 4096The number of batches to generate (default: 1)
start_imageIMAGENo-Optional starting image sequence to encode into the video latent
Note: When start_image is provided, the node encodes the image sequence into the beginning frames of the latent space and generates a corresponding noise mask. The width and height parameters must be divisible by 16 for proper latent space dimensions.

Outputs

Output NameData TypeDescription
samplesLATENTThe generated video latent representation
noise_maskLATENTThe noise mask indicating which regions should be denoised during generation