Voice Mode

Voice mode lets you talk to Mona instead of typing. It works in the CLI, Telegram, and Discord.

Prerequisites

Install the voice extra:

cd ~/.monoclaw/monoclaw-runtime
uv pip install -e ".[voice]"

This installs:

In the CLI, turn voice mode on with:

/voice on

Press Ctrl+B to start recording. Press again to stop.

Mona will:

Turn voice mode off with:

/voice off

Send a voice message to Mona on Telegram. She will:

No special setup is needed: voice messages work out of the box.

Mona can join Discord voice channels:

Voice mode uses your configured TTS provider for responses:

# ~/.monoclaw/config.yaml
tts:
  provider: edge-tts    # edge-tts | elevenlabs
  edge-tts:
    voice: "zh-HK-HiuMaanNeural"
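
To audition a voice before committing it to the config, a small script outside Mona can help. The sketch below assumes the `edge-tts` Python package is importable in your environment and that the config has already been parsed into a dict. `resolve_voice` and `preview` are hypothetical helper names, not part of Mona.

```python
import asyncio

# Voice taken from the config example above; used when the config sets none.
DEFAULT_VOICE = "zh-HK-HiuMaanNeural"


def resolve_voice(config: dict) -> str:
    """Return the edge-tts voice from a parsed config dict, or the default."""
    tts = config.get("tts", {})
    return tts.get("edge-tts", {}).get("voice", DEFAULT_VOICE)


async def preview(text: str, voice: str, out_path: str) -> None:
    """Synthesize `text` with the given voice and save the audio to `out_path`."""
    import edge_tts  # imported lazily so resolve_voice works without the package

    await edge_tts.Communicate(text, voice).save(out_path)


# Example (needs network access and the edge-tts package):
# cfg = {"tts": {"provider": "edge-tts", "edge-tts": {"voice": "zh-HK-HiuMaanNeural"}}}
# asyncio.run(preview("Hello from Mona.", resolve_voice(cfg), "preview.mp3"))
```

Play the resulting file with any audio player to compare voices before editing `config.yaml`.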

For higher quality, use ElevenLabs:

uv pip install -e ".[tts-premium]"
monoclaw config set tts.provider elevenlabs

Speech-to-text uses faster-whisper locally:

voice:
  stt:
    model: "base"        # tiny | base | small | medium | large
    language: "auto"     # auto-detect or specify (en, zh, etc.)

Larger models are more accurate but slower and use more memory.
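
To compare model sizes on your own recordings, you can call faster-whisper directly. The following is a minimal sketch, not Mona's actual implementation. It assumes that `language: "auto"` corresponds to passing `language=None` to faster-whisper (which makes it detect the language) and that CPU inference with `int8` is acceptable. `transcribe_kwargs` and `transcribe` are hypothetical helper names.

```python
def transcribe_kwargs(stt: dict) -> dict:
    """Translate the `voice.stt` settings into faster-whisper transcribe arguments.

    Assumption: "auto" means "let faster-whisper detect the language",
    which the library does when language is None.
    """
    language = stt.get("language", "auto")
    return {"language": None if language == "auto" else language}


def transcribe(path: str, stt: dict) -> str:
    """Transcribe an audio file using the model size named in the settings."""
    from faster_whisper import WhisperModel  # lazy import: needs the voice extra

    # device and compute_type are illustrative choices, not Mona's settings.
    model = WhisperModel(stt.get("model", "base"), device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path, **transcribe_kwargs(stt))
    # `segments` is a generator; decoding happens while it is consumed.
    return "".join(segment.text for segment in segments).strip()


# Example: print(transcribe("sample.wav", {"model": "small", "language": "zh"}))
```

Running the same clip through `base` and `small` gives a direct feel for the accuracy and speed trade-off on your hardware.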

Problem	Fix
"Voice mode not available"	Install the voice extra
"No audio input detected"	Check microphone permissions
Transcription is poor	Try a larger Whisper model, such as `small` or `medium`
High latency	Use the `base` model instead of `large`
Discord voice not working	Ensure Mona has voice channel permissions
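
If "Voice mode not available" persists after installing the extra, it can help to check which voice-related packages Python can actually see. This is a rough diagnostic sketch: the module names `faster_whisper` and `edge_tts` are assumptions about what the extras provide, and `missing_modules` is a hypothetical helper. Run it with the same Python environment the runtime uses.

```python
import importlib.util

# Assumed import names for the voice-related dependencies; adjust as needed.
EXPECTED_MODULES = ("faster_whisper", "edge_tts")


def missing_modules(names=EXPECTED_MODULES) -> list:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example:
# missing = missing_modules()
# print("All voice modules found" if not missing else f"Missing: {missing}")
```

An empty result only shows that the packages are importable; microphone and Discord permission problems still need to be checked separately.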