TextEncodeAceStepAudio1.5 - ComfyUI Built-in Node Documentation

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

The TextEncodeAceStepAudio1.5 node prepares text and audio-related metadata for use with the AceStepAudio 1.5 model. It takes descriptive tags, lyrics, and musical parameters, then uses a CLIP model to convert them into a conditioning format suitable for audio generation.

Inputs

Parameter	Data Type	Required	Range	Description
`clip`	CLIP	Yes	N/A	The CLIP model used to tokenize and encode the input text.
`tags`	STRING	Yes	N/A	Descriptive tags for the audio, such as genre, mood, or instruments. Supports multiline input and dynamic prompts.
`lyrics`	STRING	Yes	N/A	The lyrics for the audio track. Supports multiline input and dynamic prompts.
`seed`	INT	No	0 to 18446744073709551615	A random seed value for reproducible generation. Has a control_after_generate widget. Default: 0.
`bpm`	INT	No	10 to 300	The beats per minute (BPM) for the generated audio. Default: 120.
`duration`	FLOAT	No	0.0 to 2000.0	The desired duration of the audio in seconds. Default: 120.0.
`timesignature`	COMBO	No	`"2"` `"3"` `"4"` `"6"`	The musical time signature.
`language`	COMBO	No	`"en"` `"ja"` `"zh"` `"es"` `"de"` `"fr"` `"pt"` `"ru"` `"it"` `"nl"` `"pl"` `"tr"` `"vi"` `"cs"` `"fa"` `"id"` `"ko"` `"uk"` `"hu"` `"ar"` `"sv"` `"ro"` `"el"`	The language of the input text.
`keyscale`	COMBO	No	`"C major"` `"C minor"` `"C# major"` `"C# minor"` `"Db major"` `"Db minor"` `"D major"` `"D minor"` `"D# major"` `"D# minor"` `"Eb major"` `"Eb minor"` `"E major"` `"E minor"` `"F major"` `"F minor"` `"F# major"` `"F# minor"` `"Gb major"` `"Gb minor"` `"G major"` `"G minor"` `"G# major"` `"G# minor"` `"Ab major"` `"Ab minor"` `"A major"` `"A minor"` `"A# major"` `"A# minor"` `"Bb major"` `"Bb minor"` `"B major"` `"B minor"`	The musical key and scale (major or minor).
`generate_audio_codes`	BOOLEAN	No	N/A	Enable the LLM that generates audio codes. This can be slow but will increase the quality of the generated audio. Turn this off if you are giving the model an audio reference. Default: True.
`cfg_scale`	FLOAT	No	0.0 to 100.0	The classifier-free guidance scale. Higher values make the output more closely follow the prompt. Default: 2.0.
`temperature`	FLOAT	No	0.0 to 2.0	A sampling temperature. Lower values make the output more deterministic. Default: 0.85.
`top_p`	FLOAT	No	0.0 to 2000.0	The nucleus sampling probability (top-p). Default: 0.9.
`top_k`	INT	No	0 to 100	The number of highest probability tokens to consider (top-k). Default: 0.
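
As a rough illustration of how these inputs fit together, the sketch below builds a node entry in ComfyUI's API-style workflow shape (a `class_type` plus an `inputs` dictionary) and checks the numeric values against the ranges listed in the table. It assumes the node's `class_type` matches the name in the page title. The helper `build_node`, the constant names, the example prompt text, and the CLIP loader node id `"1"` are hypothetical and not part of ComfyUI.

```python
# Illustrative sketch only. Ranges and defaults are copied from the table above;
# build_node, the constant names, and the node id "1" are hypothetical.

NUMERIC_RANGES = {
    "seed": (0, 18446744073709551615),
    "bpm": (10, 300),
    "duration": (0.0, 2000.0),
    "cfg_scale": (0.0, 100.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 2000.0),
    "top_k": (0, 100),
}

DEFAULTS = {
    "seed": 0,
    "bpm": 120,
    "duration": 120.0,
    "generate_audio_codes": True,
    "cfg_scale": 2.0,
    "temperature": 0.85,
    "top_p": 0.9,
    "top_k": 0,
}

TIME_SIGNATURES = ("2", "3", "4", "6")


def build_node(tags, lyrics, clip_link, **overrides):
    """Return a node entry, rejecting values outside the documented ranges."""
    inputs = {**DEFAULTS, **overrides, "clip": clip_link, "tags": tags, "lyrics": lyrics}
    for name, (low, high) in NUMERIC_RANGES.items():
        if not low <= inputs[name] <= high:
            raise ValueError(f"{name}={inputs[name]} is outside {low}..{high}")
    if "timesignature" in inputs and inputs["timesignature"] not in TIME_SIGNATURES:
        raise ValueError(f"timesignature must be one of {TIME_SIGNATURES}")
    # Assumption: the class_type string matches the node name in the page title.
    return {"class_type": "TextEncodeAceStepAudio1.5", "inputs": inputs}


node = build_node(
    tags="lo-fi, mellow, piano",
    lyrics="Rain on the window\nStreetlights glow",
    clip_link=["1", 0],  # hypothetical: output 0 of a CLIP loader node with id "1"
    bpm=90,
    duration=60.0,
    timesignature="4",
    language="en",
    keyscale="A minor",
)
```

Inputs that are not overridden fall back to the documented defaults, so `node["inputs"]["temperature"]` is `0.85` here. A call such as `build_node("a", "b", ["1", 0], bpm=5)` raises `ValueError`, because 5 is below the documented minimum of 10.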

Outputs

Output Name	Data Type	Description
`CONDITIONING`	CONDITIONING	The conditioning data, which contains the encoded text and audio parameters for the AceStepAudio 1.5 model.
