Remix A Voice. · ElevenLabs API Documentation

provider text-to-voice POST /v1/text-to-voice/{voice_id}/remix

@utdk/elevenlabs /v1/text-to-voice/{voice_id}/remix

Remix A Voice.

Remix an existing voice via a prompt. This method returns a list of voice previews. Each preview has a generated_voice_id and a sample of the voice as base64-encoded MP3 audio. To create a voice, use the generated_voice_id of the preferred preview with the /v1/text-to-voice endpoint.
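As a minimal sketch of handling the response described above, the snippet below picks one preview and decodes its audio sample into bytes. Only generated_voice_id is named in the description; the field names previews and audio_base_64 are assumptions about the response shape and should be checked against a real response.

```typescript
// Assumed response shape: only generated_voice_id is named in the description;
// `previews` and `audio_base_64` are illustrative field names.
interface VoicePreview {
  generated_voice_id: string;
  audio_base_64: string;
}

interface RemixResponse {
  previews: VoicePreview[];
}

// Decode a base64 string into raw bytes (the MP3 sample of a preview).
function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Return the ID and decoded audio of the preview at `index`.
function pickPreview(response: RemixResponse, index: number): { id: string; audio: Uint8Array } {
  const preview = response.previews[index];
  if (preview === undefined) {
    throw new RangeError(`No preview at index ${index}`);
  }
  return { id: preview.generated_voice_id, audio: decodeBase64(preview.audio_base_64) };
}
```

The returned id is the value to pass on when creating the voice, and the bytes can be written to a file or played back to compare previews.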

voice_id path required: Voice ID to be used. You can use https://api.elevenlabs.io/v1/voices to list all the available voices; string
output_format query: Output format of the generated audio, formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05 kHz sample rate at 32 kbps is represented as mp3_22050_32. MP3 with 192 kbps bitrate requires a subscription to the Creator tier or above. PCM with 44.1 kHz sample rate requires a subscription to the Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs; enum: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64…
xi-api-key header: Your API key. This is required by most endpoints to access our API programmatically. You can view your xi-api-key using the 'Profile' tab on the website.
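The three parameters above map onto different parts of the HTTP request: the path, the query string, and a header. The following is a sketch of how they could be assembled for a direct call. The base URL is taken from the voices URL quoted above, and the JSON Content-Type header and the helper name buildRemixRequest are assumptions made for illustration.

```typescript
// Base URL taken from the voices URL mentioned above; adjust it if you call through a gateway.
const API_BASE = "https://api.elevenlabs.io";

interface RemixRequestTarget {
  url: string;
  headers: { [name: string]: string };
}

// Hypothetical helper: places voice_id in the path, output_format in the
// query string, and the API key in the xi-api-key header.
function buildRemixRequest(voiceId: string, apiKey: string, outputFormat?: string): RemixRequestTarget {
  let url = `${API_BASE}/v1/text-to-voice/${encodeURIComponent(voiceId)}/remix`;
  if (outputFormat !== undefined) {
    url += `?output_format=${encodeURIComponent(outputFormat)}`;
  }
  return {
    url,
    // Assumption: the request body is sent as JSON.
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
  };
}
```

The result can be handed to any HTTP client as the target of a POST request whose body carries the input fields.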

textToVoiceRemix

POST /v1/text-to-voice/{voice_id}/remix

Parameters

voice_id (required)

Voice ID to be used. You can use https://api.elevenlabs.io/v1/voices to list all the available voices.

output_format

Output format of the generated audio, formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05 kHz sample rate at 32 kbps is represented as mp3_22050_32. MP3 with 192 kbps bitrate requires a subscription to the Creator tier or above. PCM with 44.1 kHz sample rate requires a subscription to the Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs.
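To illustrate the codec_sample_rate_bitrate naming, here is a small sketch that splits an output_format value into its parts. The helper name parseOutputFormat is hypothetical, and treating the bitrate as optional is an assumption meant to cover values that carry no third component.

```typescript
interface OutputFormat {
  codec: string;
  sampleRateHz: number;
  // Assumption: some values may omit the bitrate component.
  bitrateKbps?: number;
}

// Hypothetical helper: split "codec_sample_rate_bitrate" into typed fields.
function parseOutputFormat(value: string): OutputFormat {
  const parts = value.split("_");
  const digits = /^\d+$/;
  const valid =
    (parts.length === 2 || parts.length === 3) &&
    parts[0] !== "" &&
    digits.test(parts[1]) &&
    (parts.length === 2 || digits.test(parts[2]));
  if (!valid) {
    throw new Error(`Unexpected output_format: ${value}`);
  }
  const result: OutputFormat = { codec: parts[0], sampleRateHz: Number(parts[1]) };
  if (parts.length === 3) {
    result.bitrateKbps = Number(parts[2]);
  }
  return result;
}
```

For mp3_22050_32 this yields the codec mp3, a sample rate of 22050 Hz, and a bitrate of 32 kbps, matching the example in the description.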

Input

voice_description (required)

Description of the changes to make to the voice.

text

Text to generate. The text length must be between 100 and 1000 characters.

auto_generate_text

Whether to automatically generate a text suitable for the voice description.

Default: false
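The text and auto_generate_text fields interact, so a client may want to settle them in one place before sending. The sketch below checks the length limit quoted above and falls back to auto-generation when no text is supplied. That fallback and the helper name resolveTextFields are assumptions for illustration, not documented behavior.

```typescript
// Length limits quoted from the text field description above.
const MIN_TEXT_LENGTH = 100;
const MAX_TEXT_LENGTH = 1000;

interface TextFields {
  text?: string;
  auto_generate_text: boolean;
}

// Assumption: when no text is supplied we ask the API to generate one,
// and when text is supplied we send it with auto_generate_text set to false.
function resolveTextFields(text?: string): TextFields {
  if (text === undefined) {
    return { auto_generate_text: true };
  }
  // Assumption: the limit is counted the same way as JavaScript's string length.
  if (text.length < MIN_TEXT_LENGTH || text.length > MAX_TEXT_LENGTH) {
    throw new RangeError(
      `text must be between ${MIN_TEXT_LENGTH} and ${MAX_TEXT_LENGTH} characters, got ${text.length}`
    );
  }
  return { text, auto_generate_text: false };
}
```

Validating locally avoids a round trip for input that falls outside the stated length range.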

loudness

Controls the volume level of the generated voice. -1 is quietest, 1 is loudest, 0 corresponds to roughly -24 LUFS.

seed

Random number that controls the voice generation. The same seed with the same inputs produces the same voice.

guidance_scale

Controls how closely the AI follows the prompt. Lower numbers give the AI more freedom to be creative, while higher numbers force it to stick more closely to the prompt. High numbers can cause the voice to sound artificial or robotic. We recommend using longer, more detailed prompts at a lower guidance scale.

stream_previews

Determines whether the Text to Voice previews should be included in the response. If true, only the generated IDs will be returned, which can then be streamed via the /v1/text-to-voice/:generated_voice_id/stream endpoint.

Default: false
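When stream_previews is true, each returned ID has to be turned into a stream URL. The sketch below builds those URLs from the path given above. The host is taken from the voices URL quoted earlier on this page, and the previews field name in the response is an assumption.

```typescript
// Host taken from the voices URL quoted on this page; adjust it if you call through a gateway.
const API_BASE = "https://api.elevenlabs.io";

// Build the URL for streaming one preview, following the path given above.
function previewStreamUrl(generatedVoiceId: string): string {
  return `${API_BASE}/v1/text-to-voice/${encodeURIComponent(generatedVoiceId)}/stream`;
}

// Assumed response shape when stream_previews is true: a `previews` array whose
// items carry generated_voice_id. The `previews` field name is illustrative.
function streamUrlsFor(response: { previews: { generated_voice_id: string }[] }): string[] {
  const urls: string[] = [];
  for (let i = 0; i < response.previews.length; i++) {
    urls.push(previewStreamUrl(response.previews[i].generated_voice_id));
  }
  return urls;
}
```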

remixing_session_id

The remixing session id.

remixing_session_iteration_id

The ID of the remixing session iteration to which these generations should be attached. If not provided, a new iteration will be created.

prompt_strength

Controls the balance of prompt versus reference audio when generating voice samples. 0 means almost no prompt influence, 1 means almost no reference audio influence. Only supported when using the eleven_ttv_v3 model.
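The loudness and prompt_strength fields both have stated ranges (-1 to 1 and 0 to 1 respectively). As a sketch, a client could clamp user-supplied values into those ranges before sending them. Clamping rather than rejecting is a design choice made here for illustration, and the helper names are hypothetical.

```typescript
// Restrict a number to the inclusive range [min, max].
function clamp(value: number, min: number, max: number): number {
  if (isNaN(value)) {
    throw new TypeError("value must be a number");
  }
  return Math.min(Math.max(value, min), max);
}

interface RemixTuning {
  loudness?: number;
  prompt_strength?: number;
}

// Hypothetical helper: clamp only the fields that were supplied,
// using the ranges stated in the field descriptions.
function normalizeTuning(input: RemixTuning): RemixTuning {
  const out: RemixTuning = {};
  if (input.loudness !== undefined) {
    out.loudness = clamp(input.loudness, -1, 1);
  }
  if (input.prompt_strength !== undefined) {
    out.prompt_strength = clamp(input.prompt_strength, 0, 1);
  }
  return out;
}
```

Fields that are left out stay absent from the result, so the service's own defaults still apply to them.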

Code snippet

TypeScript

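import elevenlabs from '@utdk/elevenlabs';

// Only the pre-filled optional fields are shown here. The required voice_id
// and voice_description values still have to be supplied before sending.
await elevenlabs.textToVoiceRemix({
  "auto_generate_text": false,
  "loudness": 0.5,
  "guidance_scale": 2,
  "stream_previews": false
});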