This node prepares data for training by encoding images and text. It takes a list of images and a corresponding list of text captions, then uses a VAE model to convert the images into latent representations and a CLIP model to convert the text into conditioning data. The resulting paired latents and conditioning are output as lists, ready for use in training workflows.

Inputs

| Parameter | Data Type | Required | Range | Description |
|-----------|-----------|----------|-------|-------------|
| images | IMAGE | Yes | N/A | List of images to encode. |
| vae | VAE | Yes | N/A | VAE model for encoding images to latents. |
| clip | CLIP | Yes | N/A | CLIP model for encoding text to conditioning. |
| texts | STRING | No | N/A | List of text captions. Can be length n (matching images), 1 (repeated for all), or omitted (uses empty string). |
Parameter Constraints:

- The `texts` list must contain 0, 1, or exactly as many items as the `images` list. With 0 items, an empty string is used for every image; with 1 item, that single caption is repeated for all images.
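The caption-matching rule above can be sketched as a small helper. This is a hypothetical illustration of the constraint, not the node's actual implementation; the function name `broadcast_texts` is invented here:

```python
def broadcast_texts(texts, num_images):
    """Expand a caption list to match the image count.

    Hypothetical sketch of the rule described above:
    0 captions -> empty string for every image,
    1 caption  -> repeated for all images,
    n captions -> used as-is; any other length is an error.
    """
    if not texts:
        return [""] * num_images
    if len(texts) == 1:
        return texts * num_images
    if len(texts) == num_images:
        return list(texts)
    raise ValueError(
        f"Expected 0, 1, or {num_images} captions, got {len(texts)}"
    )


# Example: one caption is repeated across three images.
print(broadcast_texts(["a photo of a cat"], 3))
# Example: no captions yields empty strings.
print(broadcast_texts([], 2))
```

Passing any other list length (for example, two captions for three images) is rejected rather than silently truncated or padded.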

Outputs

| Output Name | Data Type | Description |
|-------------|-----------|-------------|
| latents | LATENT | List of latent dictionaries, one per input image. |
| conditioning | CONDITIONING | List of conditioning entries, paired one-to-one with the latents. |