ElevenLabsSpeechToSpeech - ComfyUI Built-in Node Documentation

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

The ElevenLabs Speech to Speech node transforms an input audio file from one voice to another. It uses the ElevenLabs API to convert speech while preserving the original content and emotional tone of the audio.

Inputs

| Parameter | Data Type | Required | Range / Options | Description |
| --- | --- | --- | --- | --- |
| `voice` | CUSTOM | Yes | - | Target voice for the transformation. Connect from Voice Selector or Instant Voice Clone. |
| `audio` | AUDIO | Yes | - | Source audio to transform. |
| `stability` | FLOAT | No | 0.0 - 1.0 | Voice stability. Lower values give a broader emotional range; higher values produce more consistent but potentially monotonous speech (default: 0.5). |
| `model` | DYNAMICCOMBO | No | `eleven_multilingual_sts_v2`, `eleven_english_sts_v2` | Model to use for speech-to-speech transformation. Each option exposes its own set of voice settings (`similarity_boost`, `style`, `use_speaker_boost`, `speed`). |
| `output_format` | COMBO | No | `"mp3_44100_192"`, `"opus_48000_192"` | Audio output format (default: `"mp3_44100_192"`). |
| `seed` | INT | No | 0 - 4294967295 | Seed for reproducibility (default: 0). |
| `remove_background_noise` | BOOLEAN | No | - | Remove background noise from the input audio using audio isolation (default: False). |
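
The ranges and defaults listed above can be checked before a workflow is queued. The following is a minimal sketch, not the node's actual implementation: `SpeechToSpeechOptions` and its `validate` method are hypothetical names introduced for illustration, and only the limits and option strings are taken from the table. No default is assumed for `model`, since the table does not state one.

```python
from dataclasses import dataclass

# Option strings and limits copied from the Inputs table.
# Everything else in this sketch (class and method names) is hypothetical.
MODELS = ("eleven_multilingual_sts_v2", "eleven_english_sts_v2")
OUTPUT_FORMATS = ("mp3_44100_192", "opus_48000_192")
MAX_SEED = 4294967295  # 2**32 - 1, the documented upper bound for `seed`


@dataclass
class SpeechToSpeechOptions:
    """Hypothetical container for the node's non-connection inputs."""

    model: str  # no default is documented, so it must be given explicitly
    stability: float = 0.5
    output_format: str = "mp3_44100_192"
    seed: int = 0
    remove_background_noise: bool = False

    def validate(self) -> None:
        """Raise ValueError if any value falls outside the documented range."""
        if self.model not in MODELS:
            raise ValueError(f"model must be one of {MODELS}")
        if not 0.0 <= self.stability <= 1.0:
            raise ValueError("stability must be between 0.0 and 1.0")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
        if not 0 <= self.seed <= MAX_SEED:
            raise ValueError(f"seed must be between 0 and {MAX_SEED}")


# Usage: defaults pass validation, an out-of-range stability does not.
options = SpeechToSpeechOptions(model="eleven_multilingual_sts_v2")
options.validate()
```

The `voice` and `audio` inputs are left out of the sketch because they are supplied by connecting other nodes rather than by entering values.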

Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `audio` | AUDIO | The transformed audio file in the specified output format. |
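
The two `output_format` options appear to follow a `codec_samplerate_bitrate` naming pattern. As an illustration only, the sketch below splits such a name into its parts. It assumes that pattern holds, with the sample rate in Hz and the bitrate in kbps; `AudioFormat` and `parse_output_format` are hypothetical helpers and not part of ComfyUI or the ElevenLabs API.

```python
from typing import NamedTuple


class AudioFormat(NamedTuple):
    """Hypothetical breakdown of an output_format name."""

    codec: str
    sample_rate_hz: int
    bitrate_kbps: int


def parse_output_format(name: str) -> AudioFormat:
    """Split a name such as 'mp3_44100_192' into codec, sample rate, bitrate.

    Assumes exactly three underscore-separated fields; names with a
    different shape raise ValueError.
    """
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected output format name: {name!r}")
    codec, sample_rate, bitrate = parts
    return AudioFormat(codec, int(sample_rate), int(bitrate))


fmt = parse_output_format("mp3_44100_192")
print(fmt.codec, fmt.sample_rate_hz, fmt.bitrate_kbps)  # prints: mp3 44100 192
```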
