The TextEncodeAceStepAudio1.5 node prepares text and audio-related metadata for use with the AceStepAudio 1.5 model. It takes descriptive tags, lyrics, and musical parameters, then uses a CLIP model to convert them into a conditioning format suitable for audio generation.

Inputs

| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| `clip` | CLIP | Yes | N/A | The CLIP model used to tokenize and encode the input text. |
| `tags` | STRING | Yes | N/A | Descriptive tags for the audio, such as genre, mood, or instruments. Supports multiline input and dynamic prompts. |
| `lyrics` | STRING | Yes | N/A | The lyrics for the audio track. Supports multiline input and dynamic prompts. |
| `seed` | INT | No | 0 to 18446744073709551615 | A random seed value for reproducible generation. Has a control_after_generate widget. Default: 0. |
| `bpm` | INT | No | 10 to 300 | The beats per minute (BPM) for the generated audio. Default: 120. |
| `duration` | FLOAT | No | 0.0 to 2000.0 | The desired duration of the audio in seconds. Default: 120.0. |
| `timesignature` | COMBO | No | "2", "3", "4", "6" | The musical time signature. |
| `language` | COMBO | No | "en", "ja", "zh", "es", "de", "fr", "pt", "ru", "it", "nl", "pl", "tr", "vi", "cs", "fa", "id", "ko", "uk", "hu", "ar", "sv", "ro", "el" | The language of the input text. |
| `keyscale` | COMBO | No | "C major", "C minor", "C# major", "C# minor", "Db major", "Db minor", "D major", "D minor", "D# major", "D# minor", "Eb major", "Eb minor", "E major", "E minor", "F major", "F minor", "F# major", "F# minor", "Gb major", "Gb minor", "G major", "G minor", "G# major", "G# minor", "Ab major", "Ab minor", "A major", "A minor", "A# major", "A# minor", "Bb major", "Bb minor", "B major", "B minor" | The musical key and scale (major or minor). |
| `generate_audio_codes` | BOOLEAN | No | N/A | Enable the LLM that generates audio codes. This can be slow but will increase the quality of the generated audio. Turn this off if you are giving the model an audio reference. Default: True. |
| `cfg_scale` | FLOAT | No | 0.0 to 100.0 | The classifier-free guidance scale. Higher values make the output more closely follow the prompt. Default: 2.0. |
| `temperature` | FLOAT | No | 0.0 to 2.0 | A sampling temperature. Lower values make the output more deterministic. Default: 0.85. |
| `top_p` | FLOAT | No | 0.0 to 2000.0 | The nucleus sampling probability (top-p). Default: 0.9. |
| `top_k` | INT | No | 0 to 100 | The number of highest probability tokens to consider (top-k). Default: 0. |
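
The `temperature`, `top_k`, and `top_p` inputs use the usual language-model sampling vocabulary. As a rough illustration of what those terms generally mean, here is a minimal pure-Python sketch over a toy distribution. It is not the node's implementation: the function name `filter_distribution`, the toy logits, the greedy handling of a zero temperature, and the convention that `top_k = 0` disables the top-k filter are all assumptions made for this example.

```python
import math

def filter_distribution(logits, temperature=0.85, top_k=0, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering.

    Hypothetical helper for illustration; not taken from ComfyUI or the model code.
    """
    if temperature <= 0:
        # Assumption: treat zero temperature as greedy decoding (keep the best token).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return {best: 1.0}

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank tokens from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (assumed: 0 means "no limit").
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]
print(filter_distribution(toy_logits))                      # defaults from the table
print(filter_distribution(toy_logits, top_k=2, top_p=1.0))  # only the two best tokens
print(filter_distribution(toy_logits, temperature=0.0))     # greedy
```

Lowering `temperature` sharpens the distribution before filtering, so fewer tokens are needed to reach the `top_p` threshold, which is one way to read "more deterministic" in the table.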

Outputs

| Output Name | Data Type | Description |
|---|---|---|
| `CONDITIONING` | CONDITIONING | The conditioning data, which contains the encoded text and audio parameters for the AceStepAudio 1.5 model. |
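
To show how the inputs and the output fit together, here is a minimal sketch of the node as an entry in a ComfyUI API-format workflow, written as a Python dict. Several things are assumptions for illustration: the node IDs (`"1"`, `"2"`), the upstream loader that supplies `clip`, the example tags and lyrics, the `check_ranges` helper, and the `class_type` string, which is assumed to match the node name shown on this page. The numeric ranges are copied from the Inputs table.

```python
import json

# Ranges copied from the Inputs table: name -> (min, max).
RANGES = {
    "seed": (0, 18446744073709551615),
    "bpm": (10, 300),
    "duration": (0.0, 2000.0),
    "cfg_scale": (0.0, 100.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 2000.0),
    "top_k": (0, 100),
}

def check_ranges(inputs):
    """Hypothetical helper: raise ValueError if a numeric input is out of range."""
    for name, (low, high) in RANGES.items():
        if name in inputs and not (low <= inputs[name] <= high):
            raise ValueError(f"{name}={inputs[name]} is outside {low} to {high}")
    return inputs

encode_inputs = check_ranges({
    "clip": ["1", 0],  # link: output 0 of hypothetical node "1" (a CLIP loader)
    "tags": "synthwave, nostalgic, analog synths",
    "lyrics": "Neon lights along the shore",
    "seed": 0,
    "bpm": 120,
    "duration": 120.0,
    "timesignature": "4",
    "language": "en",
    "keyscale": "C major",
    "generate_audio_codes": True,
    "cfg_scale": 2.0,
    "temperature": 0.85,
    "top_p": 0.9,
    "top_k": 0,
})

workflow_fragment = {
    "2": {
        # Assumed to match the node name shown on this page.
        "class_type": "TextEncodeAceStepAudio1.5",
        "inputs": encode_inputs,
    },
}

# A downstream node would consume the CONDITIONING output via the link ["2", 0].
conditioning_link = ["2", 0]

print(json.dumps(workflow_fragment, indent=2))
```

Validating ranges before submitting a workflow is optional; it simply surfaces out-of-range values such as a `bpm` of 5 on the client side, before the workflow is sent.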