This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

The TextEncodeAceStepAudio1.5 node prepares text and audio-related metadata for use with the AceStepAudio 1.5 model. It takes descriptive tags, lyrics, and musical parameters, then uses a CLIP model to convert them into a conditioning format suitable for audio generation.
Inputs
| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| clip | CLIP | Yes | N/A | The CLIP model used to tokenize and encode the input text. |
| tags | STRING | Yes | N/A | Descriptive tags for the audio, such as genre, mood, or instruments. Supports multiline input and dynamic prompts. |
| lyrics | STRING | Yes | N/A | The lyrics for the audio track. Supports multiline input and dynamic prompts. |
| seed | INT | No | 0 to 18446744073709551615 | A random seed value for reproducible generation. Has a control_after_generate widget. Default: 0. |
| bpm | INT | No | 10 to 300 | The beats per minute (BPM) for the generated audio. Default: 120. |
| duration | FLOAT | No | 0.0 to 2000.0 | The desired duration of the audio in seconds. Default: 120.0. |
| timesignature | COMBO | No | "2", "3", "4", "6" | The musical time signature. |
| language | COMBO | No | "en", "ja", "zh", "es", "de", "fr", "pt", "ru", "it", "nl", "pl", "tr", "vi", "cs", "fa", "id", "ko", "uk", "hu", "ar", "sv", "ro", "el" | The language of the input text. |
| keyscale | COMBO | No | "C major", "C minor", "C# major", "C# minor", "Db major", "Db minor", "D major", "D minor", "D# major", "D# minor", "Eb major", "Eb minor", "E major", "E minor", "F major", "F minor", "F# major", "F# minor", "Gb major", "Gb minor", "G major", "G minor", "G# major", "G# minor", "Ab major", "Ab minor", "A major", "A minor", "A# major", "A# minor", "Bb major", "Bb minor", "B major", "B minor" | The musical key and scale (major or minor). |
| generate_audio_codes | BOOLEAN | No | N/A | Enable the LLM that generates audio codes. This can be slow but will increase the quality of the generated audio. Turn this off if you are giving the model an audio reference. Default: True. |
| cfg_scale | FLOAT | No | 0.0 to 100.0 | The classifier-free guidance scale. Higher values make the output more closely follow the prompt. Default: 2.0. |
| temperature | FLOAT | No | 0.0 to 2.0 | A sampling temperature. Lower values make the output more deterministic. Default: 0.85. |
| top_p | FLOAT | No | 0.0 to 2000.0 | The nucleus sampling probability (top-p). Default: 0.9. |
| top_k | INT | No | 0 to 100 | The number of highest probability tokens to consider (top-k). Default: 0. |
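
The `temperature`, `top_p`, and `top_k` parameters name standard token-sampling controls. The sketch below illustrates what those terms conventionally mean, using plain Python on a toy list of logits. It is not the node's implementation: the function name `filter_probs`, the treatment of `top_k = 0` as "no limit", and the greedy fallback for `temperature = 0` are all assumptions made for illustration.

```python
import math


def filter_probs(logits, temperature=0.85, top_p=0.9, top_k=0):
    """Return renormalized probabilities after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only; not taken from the node's source.
    """
    # Assumption: a temperature of 0 means greedy selection of the best token.
    if temperature <= 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens (assumption: 0 disables the limit).
    if top_k > 0:
        order = order[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept = set()
    cumulative = 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return [probs[i] / kept_total if i in kept else 0.0 for i in range(len(probs))]
```

Calling `filter_probs([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_p=1.0, top_k=2)` zeroes out every token except the two most probable ones and rescales those two so they sum to 1. Lowering `top_p` or `temperature` narrows the candidate set further.
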
Outputs
| Output Name | Data Type | Description |
|---|---|---|
| CONDITIONING | CONDITIONING | The conditioning data, which contains the encoded text and audio parameters for the AceStepAudio 1.5 model. |
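
To show how the inputs and the single output fit together, here is a minimal sketch that builds an entry for this node as a Python dictionary, assuming it is used in a ComfyUI API-format workflow where each node is a `{"class_type": ..., "inputs": ...}` entry and links are written as `[node_id, output_index]`. The helper `build_node`, the node IDs `"1"` and `"2"`, and the assumption that a CLIP-providing node sits at ID `"1"` are hypothetical. The numeric ranges are copied from the Inputs table.

```python
import json

# Numeric ranges copied from the Inputs table.
NUMERIC_RANGES = {
    "seed": (0, 18446744073709551615),
    "bpm": (10, 300),
    "duration": (0.0, 2000.0),
    "cfg_scale": (0.0, 100.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 2000.0),
    "top_k": (0, 100),
}


def build_node(clip_link, tags, lyrics, **params):
    """Build an API-format entry for the node (hypothetical helper).

    clip_link is a [node_id, output_index] pair pointing at a CLIP output.
    """
    # Reject numeric values that fall outside the documented ranges.
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low} to {high}")
    inputs = {"clip": clip_link, "tags": tags, "lyrics": lyrics, **params}
    # Assumption: the class_type matches the node name used in this page.
    return {"class_type": "TextEncodeAceStepAudio1.5", "inputs": inputs}


# Node "1" is assumed to be some node whose output 0 is a CLIP model.
workflow = {
    "2": build_node(
        ["1", 0],
        tags="lo-fi, mellow, piano",
        lyrics="[verse]\nQuiet streets at dawn",
        bpm=90,
        duration=60.0,
        keyscale="A minor",
        generate_audio_codes=True,
    ),
}

print(json.dumps(workflow, indent=2))
```

Under the same assumption, a downstream node would consume the CONDITIONING output by referencing the link `["2", 0]`.
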