Sound Generation · ElevenLabs API Documentation

provider sound-generation POST /v1/sound-generation

Sound Generation

Turn text into sound effects for your videos, voice-overs or video games using the most advanced sound effects models in the world.

output_format query: Output format of the generated audio, formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05 kHz sample rate at 32 kbps is represented as mp3_22050_32. MP3 at 192 kbps requires a Creator tier subscription or above. PCM at 44.1 kHz requires a Pro tier subscription or above. The μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs. enum: mp3_22050_32, mp3_24000_48, mp3_44100_32, mp3_44100_64…
xi-api-key header: Your API key. This is required by most endpoints to access our API programmatically. You can view your xi-api-key using the 'Profile' tab on the website.
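
The codec_sample_rate_bitrate naming described for output_format can be unpacked mechanically. The following is a minimal sketch, not part of the ElevenLabs API or the @utdk/elevenlabs package: the names OutputFormat and parseOutputFormat are hypothetical, and treating the bitrate part as optional is an assumption, since the enum shown here is truncated.

```typescript
// Hypothetical helper for illustration; not provided by ElevenLabs or @utdk/elevenlabs.
interface OutputFormat {
  codec: string;
  sampleRateHz: number;
  bitrateKbps: number | null; // null when the name carries no bitrate part (assumption)
}

// codec, then sample rate in Hz, then an optional bitrate in kbps, joined by underscores.
const OUTPUT_FORMAT_PATTERN = /^([a-z0-9]+)_(\d+)(?:_(\d+))?$/;

function parseOutputFormat(value: string): OutputFormat {
  const match = OUTPUT_FORMAT_PATTERN.exec(value);
  if (match === null) {
    throw new Error(`Unrecognized output_format: ${value}`);
  }
  const codec = match[1] ?? "";
  const sampleRate = match[2] ?? "";
  const bitrate = match[3];
  return {
    codec,
    sampleRateHz: Number(sampleRate),
    bitrateKbps: bitrate === undefined ? null : Number(bitrate),
  };
}

// parseOutputFormat("mp3_22050_32")
// → { codec: "mp3", sampleRateHz: 22050, bitrateKbps: 32 }
```

A check like this only validates the shape of the name. Whether a given format is accepted, and on which subscription tier, is decided by the API.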

soundGeneration

Input

text (required)

The text that will get converted into a sound effect.

loop

Whether to create a sound effect that loops smoothly. Only available for the 'eleven_text_to_sound_v2' model. Defaults to false.

duration_seconds

The duration of the generated sound, in seconds. Must be at least 0.5 and at most 30. If set to None, the optimal duration is guessed from the prompt. Defaults to None.

prompt_influence

A higher prompt influence makes your generation follow the prompt more closely while also making generations less variable. Must be a value between 0 and 1. Defaults to 0.3.

model_id

The model ID to use for the sound generation.
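
The constraints listed for the input fields can be checked on the client before a request is sent. This is a sketch under stated assumptions rather than the API's own validation: SoundGenerationInput and validateSoundGenerationInput are hypothetical names, the 0 to 1 range is treated as inclusive, a whitespace-only text is treated as missing, and the loop rule is only enforced when model_id is set explicitly, because the default model is not stated here.

```typescript
// Hypothetical types and helper for illustration; the server remains the source of truth.
interface SoundGenerationInput {
  text: string;
  loop?: boolean;
  duration_seconds?: number | null; // null or undefined lets the service pick a duration
  prompt_influence?: number;
  model_id?: string;
}

// Returns a list of human-readable problems; an empty list means no rule was violated.
function validateSoundGenerationInput(input: SoundGenerationInput): string[] {
  const problems: string[] = [];

  if (input.text.trim().length === 0) {
    problems.push("text is required");
  }

  const duration = input.duration_seconds;
  if (duration !== undefined && duration !== null) {
    if (!Number.isFinite(duration) || duration < 0.5 || duration > 30) {
      problems.push("duration_seconds must be at least 0.5 and at most 30");
    }
  }

  const influence = input.prompt_influence;
  if (influence !== undefined) {
    if (!Number.isFinite(influence) || influence < 0 || influence > 1) {
      problems.push("prompt_influence must be between 0 and 1");
    }
  }

  // Assumption: only flag loop when a different model is named explicitly.
  if (
    input.loop === true &&
    input.model_id !== undefined &&
    input.model_id !== "eleven_text_to_sound_v2"
  ) {
    problems.push("loop is only available for eleven_text_to_sound_v2");
  }

  return problems;
}
```

Returning every problem at once, instead of throwing on the first, lets a form report all invalid fields in one pass.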

Code snippet

TypeScript

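import elevenlabs from '@utdk/elevenlabs';

// `text` is required; the prompt below is only an example value.
await elevenlabs.soundGeneration({
  "text": "Heavy rain on a tin roof",
  "loop": false,
  // Numeric value between 0 and 1, not a string.
  "prompt_influence": 0.3,
  "model_id": "eleven_text_to_sound_v2"
})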
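
For readers not using the wrapper, the call presumably corresponds to a plain HTTP request: a POST to /v1/sound-generation with the xi-api-key header and output_format as a query parameter. The sketch below only assembles such a request without sending it. Several details are assumptions not stated on this page: the host https://api.elevenlabs.io, the JSON request body, and the hypothetical names RequestPlan and buildSoundGenerationRequest.

```typescript
// Assumption: public API host; this page only documents the path.
const BASE_URL = "https://api.elevenlabs.io";

// Hypothetical shape holding everything needed for a later fetch(url, init) call.
interface RequestPlan {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

function buildSoundGenerationRequest(
  apiKey: string,
  body: Record<string, unknown>,
  outputFormat?: string,
): RequestPlan {
  // output_format is a query parameter, so it goes in the URL, not the body.
  const query =
    outputFormat === undefined
      ? ""
      : `?output_format=${encodeURIComponent(outputFormat)}`;
  return {
    url: `${BASE_URL}/v1/sound-generation${query}`,
    init: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        // Assumption: the endpoint accepts a JSON body.
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

Passing plan.url and plan.init to fetch would send the request. Keeping construction separate from sending makes the request easy to inspect or log, with the API key redacted, before any network call is made.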