Eleven Labs Integration

What the Integration Does

The Eleven Labs integration enables AI-based text-to-speech and sound effect generation, producing high-quality audio from text input. It is primarily used to create dynamic audio content and personalized voiceovers, and to add audio capabilities to interactive applications.

Example Scenarios

Convert customer support text into friendly, human-like audio responses that improve user engagement.
Generate sound effects dynamically based on user input in gaming applications.
Create personalized voice messages or audio articles from written content.

Capabilities

What the Integration Enables

List Available Voices
- Retrieve a detailed list of voice options including IDs, names, and descriptions.
Text-to-Speech Conversion
- Convert text into speech audio using specified voice settings.
Generate Sound Effects
- Create unique sound effects based on provided text prompts and optional duration.

Input/Output Schemas

Text-to-Speech Conversion
- Input: text (string), voice_id (string), model_id (string), output_format (string)
- Output: base64_audio (string)
Generate Sound Effects
- Input: prompt (string), duration_seconds (optional float between 0.5 and 22)
- Output: base64_audio (string)
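Both operations return audio as a base64 string, so a caller usually has to decode it before playing or storing it. The following is a minimal sketch of that step using only the Python standard library. The helper name save_base64_audio and the output path are illustrative assumptions, not part of the integration.

```python
import base64
from pathlib import Path


def save_base64_audio(base64_audio: str, path: str) -> int:
    """Decode a base64 audio string and write the raw bytes to `path`.

    Hypothetical helper: returns the number of bytes written.
    """
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    audio_bytes = base64.b64decode(base64_audio, validate=True)
    out = Path(path)
    # Create the target directory if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(audio_bytes)
    return len(audio_bytes)
```

The file extension should match the output_format that was requested, since the decoded bytes are written unchanged.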

Limitations

Sound effect duration must be between 0.5 and 22 seconds.
Some output formats may require a specific account tier or configuration.
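The duration limit can be checked on the client before a request is sent, which gives a clearer error than a rejected API call. This is a small sketch under the limits stated above. The function name validate_duration and the constant names are hypothetical.

```python
from typing import Optional

# Limits taken from this page; the names are illustrative.
MIN_DURATION_SECONDS = 0.5
MAX_DURATION_SECONDS = 22.0


def validate_duration(duration_seconds: Optional[float]) -> Optional[float]:
    """Return the duration as a float if it is within range, else raise.

    None is passed through, because duration_seconds is optional.
    """
    if duration_seconds is None:
        return None
    if not MIN_DURATION_SECONDS <= duration_seconds <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"duration_seconds must be between {MIN_DURATION_SECONDS} "
            f"and {MAX_DURATION_SECONDS}, got {duration_seconds}"
        )
    return float(duration_seconds)
```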

Setup & Configuration

Prerequisites

An Eleven Labs account is required.
Obtain an API key from the Eleven Labs platform.

Authentication

The API key must be set in the ELEVEN_LABS_API_KEY environment variable.
- Example: export ELEVEN_LABS_API_KEY="YOUR_API_KEY"
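In Python code, the key can be read from the environment at startup so that a missing value fails early with a readable message. A minimal sketch follows. The helper name load_api_key is an assumption for illustration.

```python
import os


def load_api_key(env_var: str = "ELEVEN_LABS_API_KEY") -> str:
    """Read the API key from the environment, or raise if it is missing.

    Hypothetical helper; only the variable name comes from this page.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"{env_var} is not set; export it before starting the agent."
        )
    return api_key
```

Reading the key this way keeps it out of source files, which is the main reason to prefer the environment variable over a hard-coded string.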

Step-by-Step Guide

1. Acquire your API key by logging into Eleven Labs and navigating to API settings.
2. Configure your environment with the API key: export ELEVEN_LABS_API_KEY="YOUR_API_KEY".
3. Optionally, set the target directory for storing audio outputs.

Testing Connection

Verify the connection by calling the get_voices function. A successful response confirms that the API key is valid and that the API is reachable.
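If the toolkit is not yet wired into an agent, the same check can be approximated against the Eleven Labs REST API directly. The sketch below assumes the public voices endpoint at https://api.elevenlabs.io/v1/voices, authenticated with the xi-api-key header, and a JSON response containing a "voices" list. It is an alternative to get_voices, not a description of how that function is implemented, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumed public endpoint for listing voices.
VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def build_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the voice list."""
    return urllib.request.Request(
        VOICES_URL,
        headers={"xi-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


def check_connection(api_key: str, timeout: float = 10.0) -> int:
    """Send the request and return the number of voices in the response.

    Raises urllib.error.HTTPError on a rejected key (for example HTTP 401).
    """
    request = build_voices_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumes the response body is an object with a "voices" list.
    return len(payload.get("voices", []))
```

Calling check_connection with the key from ELEVEN_LABS_API_KEY performs a real network request, so it is best run once during setup rather than on every agent start.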

How to Use in Agents

Example code snippet to add to your agent's toolkit:

    eleven_labs_tools = ElevenLabsTools(voice_id="your_voice_id", api_key="your_api_key")
    agent.register_toolkit(eleven_labs_tools)

Use the text_to_speech or generate_sound_effect methods to process text inputs.

Best Practices

Keep text inputs short enough to limit processing time while preserving audio quality.
Consider audio format and size based on application needs and tier limitations.
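One way to manage input length is to split long text into smaller pieces at sentence boundaries and convert each piece separately. The sketch below illustrates that idea. The function name chunk_text and the default limit of 500 characters are arbitrary assumptions, not values documented by Eleven Labs.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 500) -> List[str]:
    """Group sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-word, so such a chunk can exceed the limit.
    """
    # Split after ., ! or ? followed by whitespace; drop empty pieces.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to text_to_speech in turn, and the resulting audio segments joined in order.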

Reference Section

See the official Eleven Labs API documentation for endpoint details.
The API defines the set of supported audio formats, and output_format must be one of those values.
