This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute.
The Kling Lip Sync Text to Video node synchronizes the mouth movements in a video to a text prompt. It takes an input video and generates a new video in which the character's lip movements are aligned with the provided text. The node uses voice synthesis to turn the text into speech, so the lip movements look natural.

Inputs

| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| video | VIDEO | Yes | - | Input video file for lip synchronization. |
| text | STRING | Yes | - | Text content for the lip-sync video. Required when mode is text2video. Maximum length is 120 characters. |
| voice | COMBO | No | "Melody", "Bella", "Aria", "Ethan", "Ryan", "Dorothy", "Nathan", "Lily", "Aaron", "Emma", "Grace", "Henry", "Isabella", "James", "Katherine", "Liam", "Mia", "Noah", "Olivia", "Sophia" | Voice used for the lip-sync audio (default: "Melody"). |
| voice_speed | FLOAT | No | 0.8-2.0 | Speech rate. Valid range: 0.8 to 2.0, accurate to one decimal place (default: 1). |

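
The text and voice_speed limits above can be checked before a request is queued. The following is a minimal sketch, not the node's own validation: the function name `check_lip_sync_inputs` is hypothetical, and it assumes that "120 characters" means 120 Python string characters and that both ends of the 0.8 to 2.0 range are allowed.

```python
from decimal import Decimal

# Limits taken from the Inputs table above.
MAX_TEXT_LENGTH = 120
VOICE_SPEED_MIN = Decimal("0.8")
VOICE_SPEED_MAX = Decimal("2.0")


def check_lip_sync_inputs(text: str, voice_speed: float = 1.0) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid.

    Hypothetical helper for illustration only. It is not part of the node.
    """
    problems = []

    # Assumption: the length limit counts Python string characters.
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_LENGTH:
        problems.append(
            f"text has {len(text)} characters; the maximum is {MAX_TEXT_LENGTH}"
        )

    # Go through str() so that 0.8 becomes Decimal("0.8") rather than the
    # long binary expansion of the float.
    speed = Decimal(str(voice_speed))
    if not (VOICE_SPEED_MIN <= speed <= VOICE_SPEED_MAX):
        problems.append("voice_speed must be between 0.8 and 2.0")
    elif speed != speed.quantize(Decimal("0.1")):
        problems.append("voice_speed must have at most one decimal place")

    return problems
```

Calling `check_lip_sync_inputs("Hello there", 1.2)` returns an empty list, while a 121-character text or a speed such as 1.25 each produce one message. Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.
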
Video Requirements:
  • Video file should not be larger than 100MB
  • Height/width should be between 720px and 1920px
  • Duration should be between 2s and 10s
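
The video requirements above can be expressed as a small pre-flight check. This is a sketch under stated assumptions: `VideoInfo` and `check_video_requirements` are hypothetical names, the metadata is supplied by the caller (reading it from a file is out of scope here), 100MB is taken to mean 100 × 1024 × 1024 bytes, and all range boundaries are treated as inclusive.

```python
from dataclasses import dataclass


@dataclass
class VideoInfo:
    """Hypothetical container for metadata the caller has already extracted."""
    size_bytes: int
    width: int
    height: int
    duration_s: float


# Assumption: "100MB" is interpreted as 100 MiB.
MAX_SIZE_BYTES = 100 * 1024 * 1024
MIN_DIMENSION_PX, MAX_DIMENSION_PX = 720, 1920
MIN_DURATION_S, MAX_DURATION_S = 2.0, 10.0


def check_video_requirements(info: VideoInfo) -> list[str]:
    """Return a list of violated requirements; empty means the video qualifies."""
    problems = []

    if info.size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 100MB")

    # Width and height are checked independently against the same range.
    for name, value in (("width", info.width), ("height", info.height)):
        if not (MIN_DIMENSION_PX <= value <= MAX_DIMENSION_PX):
            problems.append(f"{name} is {value}px; expected 720px to 1920px")

    if not (MIN_DURATION_S <= info.duration_s <= MAX_DURATION_S):
        problems.append(f"duration is {info.duration_s}s; expected 2s to 10s")

    return problems
```

For example, a 20MB clip at 1280×720 lasting 5 seconds passes, whereas a 640×480 clip fails on both width and height.
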

Outputs

| Output Name | Data Type | Description |
|---|---|---|
| output | VIDEO | Generated video with lip-synchronized audio |
| video_id | STRING | Unique identifier for the generated video |
| duration | STRING | Duration information for the generated video |