The ElevenLabs Text to Speech node converts written text into spoken audio using the ElevenLabs API. It lets you select a specific voice and fine-tune speech characteristics such as stability, speed, and style to generate customized audio output.
Inputs
| Parameter | Data Type | Required | Range | Description |
|---|---|---|---|---|
| voice | CUSTOM | Yes | N/A | Voice to use for speech synthesis. Connect from Voice Selector or Instant Voice Clone. |
| text | STRING | Yes | N/A | The text to convert to speech. |
| stability | FLOAT | No | 0.0 - 1.0 | Voice stability. Lower values give a broader emotional range; higher values produce more consistent but potentially monotonous speech (default: 0.5). |
| apply_text_normalization | COMBO | No | "auto", "on", "off" | Text normalization mode. ‘auto’ lets the system decide, ‘on’ always applies normalization, ‘off’ skips it. |
| model | DYNAMICCOMBO | No | "eleven_multilingual_v2", "eleven_v3" | Model to use for text-to-speech. Selecting a model reveals its specific parameters. |
| language_code | STRING | No | N/A | ISO-639-1 or ISO-639-3 language code (e.g., ‘en’, ‘es’, ‘fra’). Leave empty for automatic detection (default: ""). |
| seed | INT | No | 0 - 2147483647 | Seed for reproducibility; determinism is not guaranteed (default: 1). |
| output_format | COMBO | No | "mp3_44100_192", "opus_48000_192" | Audio output format. |

When the model parameter is set to "eleven_multilingual_v2", the following additional parameters become available:

- speed: Speech speed. 1.0 is normal, <1.0 is slower, >1.0 is faster (default: 1.0, range: 0.7 - 1.3).
- similarity_boost: Similarity boost. Higher values make the voice more similar to the original (default: 0.75, range: 0.0 - 1.0).
- use_speaker_boost: Boost similarity to the original speaker voice (default: False).
- style: Style exaggeration. Higher values increase stylistic expression but may reduce stability (default: 0.0, range: 0.0 - 0.2).

When the model parameter is set to "eleven_v3", the following additional parameters become available:

- speed: Speech speed. 1.0 is normal, <1.0 is slower, >1.0 is faster (default: 1.0, range: 0.7 - 1.3).
- similarity_boost: Similarity boost. Higher values make the voice more similar to the original (default: 0.75, range: 0.0 - 1.0).

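The model-dependent parameters can be easier to reason about as data. The following is a minimal sketch, not the node's actual implementation: `resolve_settings` and the lookup tables are hypothetical names introduced for illustration, and the ranges and defaults are copied from the lists and table on this page. The boolean use_speaker_boost parameter is left out to keep the example short.

```python
# Hypothetical helper, not part of the node: ranges and defaults are
# copied from this page as (minimum, maximum, default).
COMMON_PARAMS = {
    "stability": (0.0, 1.0, 0.5),
}

MODEL_PARAMS = {
    "eleven_multilingual_v2": {
        "speed": (0.7, 1.3, 1.0),
        "similarity_boost": (0.0, 1.0, 0.75),
        "style": (0.0, 0.2, 0.0),
    },
    "eleven_v3": {
        "speed": (0.7, 1.3, 1.0),
        "similarity_boost": (0.0, 1.0, 0.75),
    },
}


def resolve_settings(model, **overrides):
    """Return the numeric settings for `model`, filling in defaults.

    Raises ValueError for an unknown model, a parameter the model does
    not expose, or a value outside the documented range.
    """
    if model not in MODEL_PARAMS:
        raise ValueError(f"unknown model: {model!r}")

    specs = {**COMMON_PARAMS, **MODEL_PARAMS[model]}

    unknown = sorted(set(overrides) - set(specs))
    if unknown:
        raise ValueError(f"{model} does not expose: {', '.join(unknown)}")

    settings = {}
    for name, (low, high, default) in specs.items():
        value = overrides.get(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} - {high}")
        settings[name] = value
    return settings
```

For example, `resolve_settings("eleven_multilingual_v2", style=0.1)` returns the defaults with style overridden, while `resolve_settings("eleven_v3", style=0.1)` raises `ValueError`, because style is only listed for "eleven_multilingual_v2".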
Outputs
| Output Name | Data Type | Description |
|---|---|---|
| audio | AUDIO | The generated audio from the text-to-speech conversion. |
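The encoding of the returned audio is governed by the output_format input. This page does not spell out how those identifiers are built; the sketch below assumes they follow a codec_samplerate_bitrate pattern (sample rate in Hz, bitrate in kbps), which matches both listed options. `AudioFormat` and `parse_output_format` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: int


def parse_output_format(value: str) -> AudioFormat:
    """Split an identifier such as "mp3_44100_192" into its parts.

    Assumes the codec_samplerate_bitrate naming pattern; raises
    ValueError if the identifier does not have exactly three parts
    or the numeric parts are not integers.
    """
    parts = value.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected output format: {value!r}")
    codec, sample_rate, bitrate = parts
    return AudioFormat(codec, int(sample_rate), int(bitrate))
```

Under that assumption, `parse_output_format("opus_48000_192")` describes Opus audio at 48000 Hz and 192 kbps, which can help when choosing a file extension or configuring a downstream audio node.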