The Kandinsky5ImageToVideo node prepares conditioning and latent space data for video generation using the Kandinsky model. It creates an empty video latent tensor and can optionally encode a starting image to guide the initial frames of the generated video, modifying the positive and negative conditioning accordingly.
Inputs
| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| positive | CONDITIONING | Yes | N/A | The positive conditioning prompts to guide the video generation. |
| negative | CONDITIONING | Yes | N/A | The negative conditioning prompts to steer the video generation away from certain concepts. |
| vae | VAE | Yes | N/A | The VAE model used to encode the optional starting image into the latent space. |
| width | INT | No | 16 to 8192 (step 16) | The width of the output video in pixels (default: 768). |
| height | INT | No | 16 to 8192 (step 16) | The height of the output video in pixels (default: 512). |
| length | INT | No | 1 to 8192 (step 4) | The number of frames in the video (default: 121). |
| batch_size | INT | No | 1 to 4096 | The number of video sequences to generate simultaneously (default: 1). |
| start_image | IMAGE | No | N/A | An optional starting image. If provided, it is encoded and used to replace the noisy start of the model’s output latents. |

When `start_image` is provided, it is automatically resized to match the specified `width` and `height` using bilinear interpolation. Only the first `length` frames of the image batch are encoded. The encoded latent is then injected into both the positive and negative conditioning to guide the video’s initial appearance.
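
The preprocessing applied to the start image (frame selection followed by a bilinear resize) can be sketched in plain Python. This is only an illustration under stated assumptions, not the node's implementation: `bilinear_resize` and `prepare_start_frames` are hypothetical helper names, frames are modeled as single-channel 2D lists, and the align-corners coordinate mapping is chosen for simplicity and may differ from what the node uses. VAE encoding and the injection into the conditioning are omitted.

```python
def bilinear_resize(img, new_h, new_w):
    """Resize a 2D grid (list of rows of floats) with bilinear interpolation.

    Assumption: align-corners sampling, picked for simplicity. The node's
    actual resize routine may map pixel coordinates differently.
    """
    old_h, old_w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output row back to a fractional source row.
        sy = y * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            # Map the output column back to a fractional source column.
            sx = x * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            # Blend horizontally on the two neighboring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def prepare_start_frames(frames, width, height, length):
    """Keep the first `length` frames and resize each to height x width."""
    return [bilinear_resize(frame, height, width) for frame in frames[:length]]


# Three 2x2 frames, but only the first two are kept and resized to 3x3.
frames = [[[0.0, 1.0], [2.0, 3.0]] for _ in range(3)]
resized = prepare_start_frames(frames, width=3, height=3, length=2)
print(len(resized), len(resized[0]), len(resized[0][0]))
```

If the image batch contains fewer than `length` frames, the slice simply keeps all of them, so a single still image yields a single conditioning frame in this sketch.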
Outputs
| Output Name | Data Type | Description |
|---|---|---|
| positive | CONDITIONING | The modified positive conditioning, potentially updated with encoded start image data. |
| negative | CONDITIONING | The modified negative conditioning, potentially updated with encoded start image data. |
| latent | LATENT | An empty video latent tensor with zeros, shaped for the specified dimensions. |
| cond_latent | LATENT | The clean, encoded latent representation of the provided start images. This is used internally to replace the noisy beginning of the generated video latents. |
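
To give a feel for how the `latent` output might be shaped from the inputs, here is a rough sketch. The page does not state the compression factors, so everything beyond the input defaults is an assumption: 16 latent channels, 8x spatial downscaling, and 4x temporal compression that keeps the first frame. `video_latent_shape` is a hypothetical helper, not part of the node.

```python
def video_latent_shape(width=768, height=512, length=121, batch_size=1,
                       channels=16, spatial_factor=8, temporal_factor=4):
    """Return an assumed (batch, channels, frames, height, width) latent shape.

    `channels`, `spatial_factor`, and `temporal_factor` are assumptions made
    for illustration; check the model's VAE for the real values.
    """
    # Assumed temporal rule: the first frame is kept, then every
    # `temporal_factor` further frames collapse into one latent frame.
    latent_frames = (length - 1) // temporal_factor + 1
    return (
        batch_size,
        channels,
        latent_frames,
        height // spatial_factor,
        width // spatial_factor,
    )


# Shape for the documented defaults, under the assumptions above.
print(video_latent_shape())
```

Under these assumptions, the `length` step of 4 starting from 1 means every allowed frame count maps to a whole number of latent frames.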