KlingLipSyncTextToVideoNode - ComfyUI Built-in Node Documentation

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

The Kling Lip Sync Text to Video node synchronizes the mouth movements in an input video with speech generated from a text prompt. It takes a video and produces a new video in which the character's lip movements match the provided text. The speech audio is created by voice synthesis, using the selected voice and speech rate.

Inputs

Parameter	Data Type	Required	Range	Description
`video`	VIDEO	Yes	-	Input video file for lip synchronization
`text`	STRING	Yes	-	Text content for lip-sync video generation. Required when mode is text2video. Maximum length is 120 characters.
`voice`	COMBO	No	"Melody", "Bella", "Aria", "Ethan", "Ryan", "Dorothy", "Nathan", "Lily", "Aaron", "Emma", "Grace", "Henry", "Isabella", "James", "Katherine", "Liam", "Mia", "Noah", "Olivia", "Sophia"	Voice used for the lip-sync audio (default: "Melody")
`voice_speed`	FLOAT	No	0.8-2.0	Speech rate, accurate to one decimal place (default: 1.0)
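
The input constraints in the table above can be checked before a request is queued. The following is a minimal sketch, not part of the node's implementation: the helper name `validate_lip_sync_inputs` and the constant names are hypothetical, and the limits are copied from the table.

```python
from typing import List

# Voice names and limits copied from the Inputs table.
VOICES = (
    "Melody", "Bella", "Aria", "Ethan", "Ryan", "Dorothy", "Nathan",
    "Lily", "Aaron", "Emma", "Grace", "Henry", "Isabella", "James",
    "Katherine", "Liam", "Mia", "Noah", "Olivia", "Sophia",
)
MAX_TEXT_LENGTH = 120
MIN_SPEED, MAX_SPEED = 0.8, 2.0


def validate_lip_sync_inputs(text: str, voice: str = "Melody",
                             voice_speed: float = 1.0) -> List[str]:
    """Hypothetical helper: return a list of problems (empty if none)."""
    errors = []
    if not text:
        errors.append("text is required")
    elif len(text) > MAX_TEXT_LENGTH:
        errors.append(f"text exceeds {MAX_TEXT_LENGTH} characters")
    if voice not in VOICES:
        errors.append(f"unknown voice: {voice!r}")
    if not MIN_SPEED <= voice_speed <= MAX_SPEED:
        errors.append(f"voice_speed must be within {MIN_SPEED}-{MAX_SPEED}")
    elif abs(round(voice_speed, 1) - voice_speed) > 1e-9:
        # The table states the rate is accurate to one decimal place.
        errors.append("voice_speed must have at most one decimal place")
    return errors


if __name__ == "__main__":
    print(validate_lip_sync_inputs("Hello there", voice="Ethan", voice_speed=1.2))
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.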

Video Requirements:

Video file should not be larger than 100MB
Height/width should be between 720px and 1920px
Duration should be between 2s and 10s
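
The video requirements above can be expressed as a simple pre-flight check. This is a sketch under stated assumptions: the `VideoInfo` container and `check_video_requirements` function are hypothetical, the metadata is assumed to have been read elsewhere (for example with a media probing tool), and 1MB is assumed to mean 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the Video Requirements list.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MIN_SIDE_PX, MAX_SIDE_PX = 720, 1920
MIN_DURATION_S, MAX_DURATION_S = 2.0, 10.0


@dataclass
class VideoInfo:
    """Hypothetical container for metadata obtained elsewhere."""
    size_bytes: int
    width: int
    height: int
    duration_s: float


def check_video_requirements(info: VideoInfo) -> List[str]:
    """Return a list of violated requirements (empty if the video is acceptable)."""
    problems = []
    if info.size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 100MB")
    for name, value in (("width", info.width), ("height", info.height)):
        if not MIN_SIDE_PX <= value <= MAX_SIDE_PX:
            problems.append(f"{name} must be between 720px and 1920px")
    if not MIN_DURATION_S <= info.duration_s <= MAX_DURATION_S:
        problems.append("duration must be between 2s and 10s")
    return problems


if __name__ == "__main__":
    clip = VideoInfo(size_bytes=50 * 1024 * 1024, width=1280, height=720, duration_s=5.0)
    print(check_video_requirements(clip))
```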

Outputs

Output Name	Data Type	Description
`output`	VIDEO	Generated video with lip-synchronized audio
`video_id`	STRING	Unique identifier for the generated video
`duration`	STRING	Duration information for the generated video
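
For workflows submitted in ComfyUI's API (JSON) format, the node could be described roughly as follows. This is an illustrative sketch only: the helper `build_lip_sync_node`, the node IDs "1" and "2", and the upstream video-loading node are hypothetical, and the output slot order is assumed to follow the order of the Outputs table.

```python
import json


def build_lip_sync_node(video_link, text, voice="Melody", voice_speed=1.0):
    """Hypothetical helper: build one API-format node entry.

    video_link is a [node_id, output_index] pair pointing at an
    upstream node that produces a VIDEO.
    """
    return {
        "class_type": "KlingLipSyncTextToVideoNode",
        "inputs": {
            "video": video_link,
            "text": text,
            "voice": voice,
            "voice_speed": voice_speed,
        },
    }


# Assumption: output slots are indexed in the order the Outputs table lists them.
OUTPUT_SLOTS = {"output": 0, "video_id": 1, "duration": 2}

# Node "1" is assumed to be some upstream node whose first output is a VIDEO.
prompt_fragment = {"2": build_lip_sync_node(["1", 0], "Hello, welcome to the show.")}

# A downstream node would reference the generated video as ["2", 0].
generated_video_link = ["2", OUTPUT_SLOTS["output"]]

if __name__ == "__main__":
    print(json.dumps(prompt_fragment, indent=2))
```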
