LTXVConcatAVLatent - ComfyUI Built-in Node Documentation

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

The LTXVConcatAVLatent node combines a video latent representation and an audio latent representation into a single, concatenated latent output. It merges the `samples` tensors from both inputs and, if present, their `noise_mask` tensors as well, preparing them for further processing in a video generation pipeline.

Inputs

Parameter	Data Type	Required	Range	Description
`video_latent`	LATENT	Yes		The latent representation of the video data.
`audio_latent`	LATENT	Yes		The latent representation of the audio data.

Note: The `samples` tensors from the `video_latent` and `audio_latent` inputs are concatenated. A `noise_mask` is included in the output only if at least one input provides one. In that case, any input without a `noise_mask` gets a mask of ones with the same shape as its `samples`, and the two masks are then concatenated.
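The mask handling in the note above can be sketched in plain Python. This is an illustration only, not the node's source code: nested lists stand in for tensors, the `ones_like` helper and the `concat_av_latent` function are hypothetical names, and a tuple is used as a placeholder for however the node actually joins the video and audio tensors, which this page does not specify.

```python
def ones_like(nested):
    """Return a nested list with the same shape as `nested`, filled with 1.0."""
    if isinstance(nested, list):
        return [ones_like(item) for item in nested]
    return 1.0


def concat_av_latent(video_latent, audio_latent):
    """Combine two latent dicts following the rules in the note (illustrative)."""
    output = {}
    # Assumption: a tuple stands in for the node's real concatenated container.
    output["samples"] = (video_latent["samples"], audio_latent["samples"])

    video_mask = video_latent.get("noise_mask")
    audio_mask = audio_latent.get("noise_mask")

    # A combined mask is produced only when at least one input carries a mask.
    if video_mask is not None or audio_mask is not None:
        if video_mask is None:
            video_mask = ones_like(video_latent["samples"])
        if audio_mask is None:
            audio_mask = ones_like(audio_latent["samples"])
        output["noise_mask"] = (video_mask, audio_mask)
    return output


# Video latent has a mask; audio latent does not, so it receives a mask of ones.
video = {"samples": [[0.1, 0.2], [0.3, 0.4]], "noise_mask": [[1.0, 0.0], [1.0, 0.0]]}
audio = {"samples": [[0.5, 0.6, 0.7]]}
combined = concat_av_latent(video, audio)
```

Here `combined["noise_mask"]` pairs the original video mask with a generated all-ones audio mask. If neither input had a `noise_mask`, the output would contain only `samples`.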

Outputs

Output Name	Data Type	Description
`latent`	LATENT	A single latent dictionary containing the concatenated `samples` and, if applicable, the concatenated `noise_mask` from the video and audio inputs.
