This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!
The LTXVImgToVideo node converts an input image into a video latent representation for video generation models. It encodes the image with the VAE, uses the result as the starting content of a latent frame sequence, and applies strength-controlled conditioning that determines how much of the original image is preserved versus modified during video generation.
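
The relationship between `strength` and the noise mask can be illustrated with a minimal sketch. It assumes the mask value for image-conditioned frames is the complement of `strength` (`1.0 - strength`) and that all other frames are fully open to generation. The helper name `frame_noise_mask` and its parameters are hypothetical and are not part of the node's API.

```python
def frame_noise_mask(num_latent_frames: int, num_image_frames: int, strength: float) -> list[float]:
    """Return one mask value per latent frame (hypothetical helper).

    Assumption: 1.0 means the frame may be fully regenerated,
    0.0 means the frame's encoded image content is kept as-is.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    if not 0 <= num_image_frames <= num_latent_frames:
        raise ValueError("num_image_frames must fit inside num_latent_frames")

    # Start with every frame fully open to generation.
    mask = [1.0] * num_latent_frames
    # Frames that hold the encoded image are protected in proportion to strength.
    for i in range(num_image_frames):
        mask[i] = 1.0 - strength
    return mask
```

Under this assumption, `strength=1.0` yields a mask value of 0.0 for the image frame (content preserved), while `strength=0.0` yields 1.0 (maximum modification), which matches the behavior described for the `strength` input.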

Inputs

| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| `positive` | CONDITIONING | Yes | - | Positive conditioning prompts for guiding the video generation |
| `negative` | CONDITIONING | Yes | - | Negative conditioning prompts for avoiding certain elements in the video |
| `vae` | VAE | Yes | - | VAE model used for encoding the input image into latent space |
| `image` | IMAGE | Yes | - | Input image to be converted into video frames |
| `width` | INT | No | 64 to MAX_RESOLUTION | Output video width in pixels (default: 768, step: 32) |
| `height` | INT | No | 64 to MAX_RESOLUTION | Output video height in pixels (default: 512, step: 32) |
| `length` | INT | No | 9 to MAX_RESOLUTION | Number of frames in the generated video (default: 97, step: 8) |
| `batch_size` | INT | No | 1 to 4096 | Number of videos to generate simultaneously (default: 1) |
| `strength` | FLOAT | No | 0.0 to 1.0 | Controls how much the original image is modified during video generation, where 1.0 preserves most of the original content and 0.0 allows maximum modification (default: 1.0) |
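
The minimum and step values listed above suggest that `width` and `height` move in increments of 32 starting at 64, and that `length` moves in increments of 8 starting at 9 (so 9, 17, ..., 97). The sketch below shows one way to snap an arbitrary value onto such a grid before passing it to the node. The helper `snap_to_step` is hypothetical, and the upper bound is left out because the value of MAX_RESOLUTION is not stated here.

```python
def snap_to_step(value: int, minimum: int, step: int) -> int:
    """Round value down to the nearest minimum + k * step (hypothetical helper).

    Values below the minimum are raised to the minimum.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if value < minimum:
        return minimum
    return minimum + ((value - minimum) // step) * step


# Example: snapping requested values to the grids from the table.
width = snap_to_step(770, minimum=64, step=32)    # width grid: 64, 96, 128, ...
height = snap_to_step(512, minimum=64, step=32)   # already on the grid
length = snap_to_step(100, minimum=9, step=8)     # length grid: 9, 17, 25, ...
```

Rounding down is a design choice for this sketch; rounding to the nearest grid value would work equally well if slightly larger outputs are acceptable.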

Outputs

| Output Name | Data Type | Description |
|---|---|---|
| `positive` | CONDITIONING | Processed positive conditioning with video frame masking applied |
| `negative` | CONDITIONING | Processed negative conditioning with video frame masking applied |
| `latent` | LATENT | Video latent representation containing the encoded frames and noise mask for video generation |
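
To give a rough sense of the size of the `latent` output, the sketch below estimates its tensor shape from the node inputs. It rests on assumptions that are not stated in this page: a VAE with 128 latent channels, 32x spatial compression, and 8x temporal compression with the first frame kept separately. The function `estimate_latent_shape` is hypothetical and only does the arithmetic; it does not call any ComfyUI code.

```python
def estimate_latent_shape(
    batch_size: int = 1,
    width: int = 768,
    height: int = 512,
    length: int = 97,
    channels: int = 128,       # assumed latent channel count
    spatial_factor: int = 32,  # assumed spatial compression
    temporal_factor: int = 8,  # assumed temporal compression
) -> tuple[int, int, int, int, int]:
    """Estimate (batch, channels, frames, height, width) of the latent (hypothetical helper)."""
    # The first frame maps to one latent frame; each further group of
    # temporal_factor frames adds one more.
    latent_frames = (length - 1) // temporal_factor + 1
    return (
        batch_size,
        channels,
        latent_frames,
        height // spatial_factor,
        width // spatial_factor,
    )
```

With the default inputs and the assumed factors, this gives 13 latent frames at a 16 x 24 latent resolution. If the VAE in use has different compression factors, the numbers change accordingly.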