MonoClaw

ElevenLabs (Premium TTS)

ElevenLabs provides state-of-the-art neural text-to-speech with natural-sounding voices, voice cloning, and multilingual support.

When to use ElevenLabs

  • You want the most natural-sounding voice output
  • You need voice cloning (replicate a specific voice)
  • You want multilingual TTS (Cantonese, Mandarin, English, etc.)
  • The free Edge TTS voice quality is insufficient

Installation

The ElevenLabs extra is included in the [all] install. If you installed the minimal bundle, add it manually:

cd ~/.monoclaw/monoclaw-runtime
uv pip install -e ".[tts-premium]"

Setup

  1. Sign up at elevenlabs.io
  2. Copy your API key from the dashboard
  3. Configure MonoClaw:

monoclaw config set ELEVENLABS_API_KEY "your-key"
monoclaw config set tts.provider elevenlabs

Or in config.yaml:

tts:
  provider: elevenlabs
  elevenlabs:
    voice_id: "21m00Tcm4TlvDq8ikWAM"  # Default: Rachel
    model: "eleven_multilingual_v2"
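
To make the two settings above more concrete, here is a minimal Python sketch of how a voice ID and model name map onto a request to the public ElevenLabs REST API. It is not MonoClaw's own implementation, which is not shown in this guide; the helper names build_tts_request and synthesize are hypothetical, and the output file name is an arbitrary choice. The sketch only assumes the documented ElevenLabs endpoint /v1/text-to-speech/{voice_id} with the xi-api-key header.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text, api_key,
                      voice_id="21m00Tcm4TlvDq8ikWAM",   # Rachel, as in config.yaml
                      model="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request.

    Hypothetical helper: voice_id and model mirror the config.yaml keys.
    """
    body = json.dumps({"text": text, "model_id": model}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{voice_id}",
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, api_key, out_path="reply.mp3"):
    """Send the request and save the returned audio bytes to out_path."""
    req = build_tts_request(text, api_key)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as out:
        out.write(resp.read())
    return out_path
```

Calling synthesize("Hello", "your-key") would spend characters from your quota, so the request builder is kept separate and can be inspected without any network traffic.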

Voice selection

Browse available voices at elevenlabs.io/voice-library. Popular choices:

Voice ID              Name    Style
21m00Tcm4TlvDq8ikWAM  Rachel  Natural, professional
AZnzlk1XvdvUeBnXmlld  Domi    Energetic
EXAVITQu4vr4xnSDxMaL  Bella   Warm, friendly

Set a voice:

monoclaw config set tts.elevenlabs.voice_id "AZnzlk1XvdvUeBnXmlld"
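
If you know a voice by name rather than by ID, the ElevenLabs API can list the voices on your account (GET /v1/voices), and the response contains a "voices" array whose entries carry "voice_id" and "name" fields. The sketch below looks up an ID in such a response. The function name find_voice_id is hypothetical and the sample payload is hand-written from the table above, not real API output.

```python
def find_voice_id(voices_payload, name):
    """Return the voice_id whose name matches (case-insensitive), or None."""
    wanted = name.strip().lower()
    for voice in voices_payload.get("voices", []):
        if voice.get("name", "").strip().lower() == wanted:
            return voice.get("voice_id")
    return None


# Hand-written sample in the shape of a /v1/voices response (illustrative only).
sample_payload = {
    "voices": [
        {"voice_id": "21m00Tcm4TlvDq8ikWAM", "name": "Rachel"},
        {"voice_id": "AZnzlk1XvdvUeBnXmlld", "name": "Domi"},
        {"voice_id": "EXAVITQu4vr4xnSDxMaL", "name": "Bella"},
    ]
}

domi_id = find_voice_id(sample_payload, "domi")
```

The resulting ID is the value you would pass to monoclaw config set tts.elevenlabs.voice_id.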

Voice cloning

You can clone your own voice:

  1. Go to elevenlabs.io/voice-lab
  2. Upload 1–30 minutes of clean audio
  3. Copy the new voice ID into your config

Using TTS

In the CLI or any messaging platform:

/voice on

Mona will speak her replies. Press Ctrl+B in the CLI to record a voice message.

Cost

ElevenLabs charges per character:

  • ~$0.003 per 1K characters on the Starter plan
  • Free tier: 10K characters/month

Monitor usage in the ElevenLabs dashboard.
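
For a rough budget before you turn voice on, the arithmetic can be sketched in a few lines of Python. The default rate and quota below are simply the figures quoted in this section and may not match current ElevenLabs pricing; the function names are hypothetical, and counting characters with len() is an approximation of how ElevenLabs meters usage.

```python
def billable_characters(replies):
    """Total characters across the replies that would be spoken."""
    return sum(len(reply) for reply in replies)


def estimate_cost_usd(characters, usd_per_1k=0.003):
    """Cost at a flat per-1K-character rate (default: the rate quoted above)."""
    return characters / 1000 * usd_per_1k


def free_tier_remaining(characters_used, monthly_quota=10_000):
    """Characters left in the free tier this month, never below zero."""
    return max(monthly_quota - characters_used, 0)


replies = ["Good morning!", "Your meeting starts at 10:00."]
used = billable_characters(replies)
print(f"{used} characters, about ${estimate_cost_usd(used):.5f}, "
      f"{free_tier_remaining(used)} free characters left")
```

The ElevenLabs dashboard remains the authoritative source for actual usage; treat this only as an estimate.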

Comparison with Edge TTS

Feature        Edge TTS (free)  ElevenLabs
Cost           Free             Paid per character
Quality        Good             Excellent
Voice cloning  No               Yes
Cantonese      Limited          Yes (multilingual v2)
Latency        ~1s              ~2–4s

Troubleshooting

Problem                       Fix
"ElevenLabs API key not set"  Run monoclaw config set ELEVENLABS_API_KEY "your-key"
Slow voice generation         Lower the stability setting or use a simpler voice
Garbled output                Check that tts.elevenlabs.model is set to eleven_multilingual_v2 for non-English text