Text-to-speech, voice cloning, and conversational agents — everything you need to ship voice features
ElevenLabs is an AI audio platform that turns text into lifelike speech, clones voices, generates music and sound effects, and powers real-time conversational voice agents. If you're building anything that needs to talk, listen, or process audio — this is the platform to know.
The platform is organized into three product families:
Rule of thumb: Use Flash for anything real-time. Use v3 or Multilingual v2 for pre-rendered content. Use Turbo for throwaway prototypes.
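The rule of thumb above can be written down as a small lookup so the choice lives in one place. This is a minimal sketch: `pick_model` and the use-case labels are hypothetical names, and the Flash and Turbo model IDs are assumptions based on ElevenLabs' naming scheme (only `eleven_multilingual_v2` appears in this guide's examples). Check the current model list before relying on them.

```python
# Hypothetical helper encoding the rule of thumb as a lookup table.
# Only eleven_multilingual_v2 is confirmed by this guide; the Flash and
# Turbo IDs below are assumptions and should be verified against the docs.
MODEL_BY_USE_CASE = {
    "realtime": "eleven_flash_v2_5",          # assumed ID for Flash
    "prerendered": "eleven_multilingual_v2",  # v3 is the other pre-rendered option
    "prototype": "eleven_turbo_v2_5",         # assumed ID for Turbo
}


def pick_model(use_case: str) -> str:
    """Return the model ID suggested for a use case, or raise ValueError."""
    try:
        return MODEL_BY_USE_CASE[use_case]
    except KeyError:
        known = ", ".join(sorted(MODEL_BY_USE_CASE))
        raise ValueError(f"unknown use case {use_case!r}; expected one of: {known}") from None
```

The returned string can be passed as `model_id` in the SDK calls shown later in this guide.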
| Plan | Price/mo | Credits | What You Unlock |
|---|---|---|---|
| Free | $0 | 10,000 chars | 3 custom voices, non-commercial only |
| Starter | $5 | 30,000 chars | Commercial rights, Instant Voice Cloning |
| Creator | $22 | 100,000 chars | Professional Voice Cloning |
| Pro | $99 | 500,000 chars | Higher concurrency, priority queue |
| Scale | $330 | 2M chars | Enterprise features |
| Business | $1,320 | 11M credits | Custom SLAs, dedicated support |
Annual billing saves ~17%. Flash/Turbo models cost fewer credits per character than standard models.
The free tier is enough to prototype and test. You'll hit Starter territory once you're generating audio for production use.
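To compare plans on unit cost, it can help to turn the table into dollars per 1,000 characters. The sketch below copies the monthly prices and allowances from the table and makes two simplifying assumptions: one credit equals one character (the standard-model rate), and there is no overage billing. The helper names `usd_per_1k` and `cheapest_plan` are hypothetical, and the Free plan is left out because it is non-commercial.

```python
from typing import Optional

# (monthly price in USD, monthly allowance) copied from the pricing table.
# Assumption: 1 credit == 1 character, i.e. standard-model pricing.
PLANS = {
    "Starter": (5, 30_000),
    "Creator": (22, 100_000),
    "Pro": (99, 500_000),
    "Scale": (330, 2_000_000),
    "Business": (1_320, 11_000_000),
}


def usd_per_1k(price_usd: float, allowance: int) -> float:
    """Effective cost of 1,000 characters if the whole allowance is used."""
    return price_usd / allowance * 1_000


def cheapest_plan(chars_per_month: int) -> Optional[str]:
    """Lowest-priced paid plan whose allowance covers the volume, else None."""
    for name, (_price, allowance) in sorted(PLANS.items(), key=lambda item: item[1][0]):
        if allowance >= chars_per_month:
            return name
    return None


if __name__ == "__main__":
    for name, (price, allowance) in PLANS.items():
        print(f"{name:<9} ${usd_per_1k(price, allowance):.3f} per 1k chars")
```

Under these assumptions the per-character cost does not fall monotonically with plan size, so it is worth running the numbers for your actual volume rather than assuming the next tier up is cheaper per character.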
export ELEVENLABS_API_KEY="your_key_here"
JavaScript / TypeScript:
npm install @elevenlabs/elevenlabs-js
Python:
pip install elevenlabs
TypeScript:
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
const client = new ElevenLabsClient();
const audio = await client.textToSpeech.convert("JBFqnCBsd6RMkjVDRZzb", {
text: "Welcome to AI Bazaar. Discover the best AI tools in under 60 seconds.",
modelId: "eleven_multilingual_v2",
outputFormat: "mp3_44100_128",
});
Python:
from elevenlabs import ElevenLabs
client = ElevenLabs()
audio = client.text_to_speech.convert(
voice_id="JBFqnCBsd6RMkjVDRZzb",
text="Welcome to AI Bazaar. Discover the best AI tools in under 60 seconds.",
model_id="eleven_multilingual_v2",
output_format="mp3_44100_128"
)
That's it. JBFqnCBsd6RMkjVDRZzb is "George," one of the default voices. You'll get back a binary audio stream you can save to a file or stream directly.
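Saving the result to disk could look like the sketch below. It assumes the Python SDK's `convert` call yields the audio as an iterable of `bytes` chunks; `save_audio` is a hypothetical helper, not part of the SDK.

```python
from typing import Iterable


def save_audio(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to `path` and return the number of bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total
```

With the Python example above, `save_audio(audio, "welcome.mp3")` would write the MP3 to the working directory. Writing chunk by chunk avoids holding the whole clip in memory.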
If you prefer raw HTTP:
POST https://api.elevenlabs.io/v1/text-to-speech/{voice_id}
Headers:
Content-Type: application/json
xi-api-key: your_api_key
Body:
{
"text": "Your text here",
"model_id": "eleven_multilingual_v2",
"voice_settings": {
"stability": 0.5,
"similarity_boost": 0.75,
"speed": 1.0
}
}
Response: Binary audio stream (application/octet-stream).
Output format options (passed as the output_format query parameter): mp3_44100_128, mp3_22050_32, pcm_16000, pcm_22050, pcm_24000, pcm_44100, ulaw_8000
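The raw HTTP call can be assembled with nothing but the Python standard library. This is a sketch under stated assumptions: `build_tts_request` is a hypothetical helper, and it assumes `output_format` is sent as a query-string parameter rather than in the JSON body. It only builds the request, so it can be inspected without spending credits.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(voice_id: str, text: str, api_key: str,
                      output_format: str = "mp3_44100_128") -> urllib.request.Request:
    """Build (but do not send) the POST request described above."""
    # Assumption: output_format travels in the query string.
    query = urllib.parse.urlencode({"output_format": output_format})
    url = f"{API_BASE}/text-to-speech/{voice_id}?{query}"
    body = {
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75, "speed": 1.0},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "xi-api-key": api_key},
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and read the response body, which carries the audio bytes.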
Keep voice-cloning sample recordings under 3 minutes: more audio doesn't improve quality and can actually hurt it.
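A quick pre-upload check can enforce the 3-minute guidance above. The sketch uses the standard-library `wave` module, so it only handles uncompressed WAV files; the function names and the `MAX_SAMPLE_SECONDS` constant are hypothetical, and MP3 or other formats would need a different reader.

```python
import wave

MAX_SAMPLE_SECONDS = 180  # the 3-minute ceiling from the guidance above


def wav_duration_seconds(path: str) -> float:
    """Duration of an uncompressed WAV file, in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_within_limit(path: str, limit: float = MAX_SAMPLE_SECONDS) -> bool:
    """True if the recording is no longer than `limit` seconds."""
    return wav_duration_seconds(path) <= limit
```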
Both are available via API or the dashboard UI.
ElevenAgents lets you build voice agents that handle real-time phone calls and conversations. The system chains four components:
You can have a working voice agent in about 5 minutes using the ElevenAgents dashboard.
| Resource | Link |
|---|---|
| Platform | elevenlabs.io |
| API Docs | elevenlabs.io/docs/api-reference/introduction |
| Developer Portal | elevenlabs.io/developers |
| Voice Agents | elevenlabs.io/conversational-ai |
| Pricing | elevenlabs.io/pricing |
| JS/TS SDK | npm install @elevenlabs/elevenlabs-js |
| Python SDK | pip install elevenlabs |
ElevenLabs is the fastest path to shipping voice features. The free tier lets you prototype, the API is clean, and the model quality is best-in-class. If your project needs to speak, listen, clone voices, or handle phone calls — start here.