The LTXVImgToVideoInplace node conditions a video latent representation by encoding an input image into its initial frames. It works by using a VAE to encode the image into the latent space and then blending it with the existing latent samples based on a specified strength. This allows an image to serve as a starting point or conditioning signal for video generation.

Inputs

| Parameter | Data Type | Required | Range | Description |
|-----------|-----------|----------|-------|-------------|
| `vae` | VAE | Yes | - | The VAE model used to encode the input image into the latent space. |
| `image` | IMAGE | Yes | - | The input image to be encoded and used to condition the video latent. |
| `latent` | LATENT | Yes | - | The target latent video representation to be modified. |
| `strength` | FLOAT | No | 0.0 - 1.0 | Controls how strongly the encoded image is blended into the latent. A value of 1.0 fully replaces the initial frames; lower values blend them. (default: 1.0) |
| `bypass` | BOOLEAN | No | - | Bypass the conditioning. When enabled, the node returns the input latent unchanged. (default: False) |
Note: The image is automatically resized to match the spatial dimensions required by the VAE for encoding, based on the width and height of the `latent` input.
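The blending described above can be sketched as a simple interpolation between the encoded image latent and the first frame(s) of the video latent. This is an illustrative approximation, not the node's actual implementation; the function name `blend_first_frames` and the tensor layout `(batch, channels, frames, height, width)` are assumptions for the example.

```python
import numpy as np

def blend_first_frames(latent, encoded, strength=1.0):
    """Hypothetical sketch of strength-based blending.

    latent:  video latent of shape (batch, channels, frames, height, width)
    encoded: image latent of shape (batch, channels, n_cond_frames, height, width)
    strength: 1.0 fully replaces the initial frames; lower values interpolate.
    """
    out = latent.copy()
    n = encoded.shape[2]  # number of frames conditioned by the image
    # Linear interpolation between the encoded image and the original frames.
    out[:, :, :n] = strength * encoded + (1.0 - strength) * latent[:, :, :n]
    return out
```

For example, with `strength=0.5` the first frame becomes an even mix of the encoded image latent and the original latent, while all later frames are left untouched.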

Outputs

| Output Name | Data Type | Description |
|-------------|-----------|-------------|
| `latent` | LATENT | The modified latent video representation, containing the updated samples and a `noise_mask` that applies the conditioning strength to the initial frames. |
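One plausible way to realize the `noise_mask` mentioned above is a per-frame mask where conditioned frames receive less noise during sampling. The sketch below assumes a convention where a mask value of 0 means "preserve this frame" and 1 means "denoise normally"; the function name `build_noise_mask` and the exact mask semantics are assumptions for illustration, not the node's confirmed behavior.

```python
import numpy as np

def build_noise_mask(latent_shape, strength=1.0, cond_frames=1):
    """Hypothetical noise mask for image-conditioned initial frames.

    latent_shape: (batch, channels, frames, height, width)
    Conditioned frames get 1.0 - strength, so strength=1.0 fully
    preserves them, while later frames keep a mask of 1.0 (fully denoised).
    """
    b, _, f, h, w = latent_shape
    mask = np.ones((b, 1, f, h, w), dtype=np.float32)
    mask[:, :, :cond_frames] = 1.0 - strength
    return mask
```

Under this convention, `strength=1.0` pins the initial frames to the encoded image entirely, while intermediate strengths let the sampler partially re-noise and re-synthesize them.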