MonoClaw

Configuring Models

The model you choose determines Mona's capabilities, cost, and latency. MonoClaw supports 20+ providers and any OpenAI-compatible endpoint.

Quick setup

monoclaw model

This interactive wizard walks you through provider selection, model choice, and API key configuration.

Supported providers

Provider          Setup                 Notes
OpenRouter        API key               Multi-provider routing, free tier available
Anthropic         API key               Claude models, high quality
OpenAI            API key               GPT models, Codex
DeepSeek          API key               Cost-effective, strong reasoning
Kimi / Moonshot   API key               Coding specialist
Alibaba Cloud     API key               Qwen models
Hugging Face      HF_TOKEN              20+ open models
AWS Bedrock       IAM / aws configure   Enterprise-grade
NVIDIA NIM        API key               Nemotron models
GitHub Copilot    OAuth / token         Copilot subscription models
Custom Endpoint   Base URL + key        vLLM, SGLang, Ollama, etc.

Setting API keys

monoclaw config set OPENROUTER_API_KEY sk-or-...
monoclaw config set ANTHROPIC_API_KEY sk-ant-...
monoclaw config set OPENAI_API_KEY sk-...

Secrets are stored in ~/.monoclaw/.env.

Minimum context requirement

Mona requires a context window of at least 64,000 tokens. Models with smaller windows cannot hold enough working memory for multi-step tool-calling workflows and are rejected at startup.

Most hosted models meet this easily:

  • Claude: 200K–1M tokens
  • GPT-5.5: 1M tokens
  • Gemini: 1M tokens
  • Qwen3: 262K tokens
  • DeepSeek V4: 1M tokens
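To make the minimum-context rule concrete, here is a minimal Python sketch of the kind of check it implies. This is not MonoClaw's actual startup code: the function name check_context_window and the error wording are hypothetical, and only the 64,000-token threshold comes from this page.

```python
MIN_CONTEXT_TOKENS = 64_000  # minimum stated in this guide


def check_context_window(model_name: str, context_length: int) -> None:
    """Raise ValueError if the window is below the documented minimum.

    Hypothetical helper for illustration; not part of MonoClaw.
    """
    if context_length < MIN_CONTEXT_TOKENS:
        raise ValueError(
            f"{model_name}: context window of {context_length} tokens is "
            f"below the required {MIN_CONTEXT_TOKENS}"
        )


# A 65,536-token local model passes silently.
check_context_window("local-llama", 65_536)

# An 8,192-token model is rejected.
try:
    check_context_window("small-model", 8_192)
except ValueError as err:
    print(err)
```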

If you're running a local model, set its context size to at least 64K:

# llama.cpp (recent builds name the server binary llama-server)
./server --ctx-size 65536

# Ollama (ollama run has no context-size flag; set it on the server instead)
OLLAMA_CONTEXT_LENGTH=65536 ollama serve

Custom endpoints

For self-hosted models served by vLLM, SGLang, Ollama, or llama.cpp:

monoclaw config set model.custom.endpoint "http://localhost:8000/v1"
monoclaw config set model.custom.api_key "sk-local"
monoclaw config set model.custom.name "local-llama"

Or in config.yaml:

model:
  default: "custom"
  custom:
    endpoint: "http://localhost:8000/v1"
    api_key: "${LOCAL_API_KEY}"
    name: "local-llama"
    context_length: 65536
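The "${LOCAL_API_KEY}" value above looks like an environment-variable placeholder. As a rough illustration of how such placeholders are commonly expanded, here is a small Python sketch. It assumes the ${NAME} syntax maps to environment variables, which this page does not spell out, and expand_placeholders is a hypothetical helper rather than a MonoClaw function.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a conventional environment-variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(
    value: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Replace each ${NAME} in value with env[NAME].

    Hypothetical helper; raises KeyError when a variable is missing
    so that a typo does not silently become an empty API key.
    """
    source = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        if name not in source:
            raise KeyError(f"environment variable {name} is not set")
        return source[name]

    return _PLACEHOLDER.sub(_substitute, value)


print(expand_placeholders("${LOCAL_API_KEY}", {"LOCAL_API_KEY": "sk-local"}))
# sk-local
```

Failing loudly on a missing variable is a design choice in this sketch; a real loader might fall back to a default instead.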

Provider routing

Configure fallback providers for Mona to use when your primary is unavailable:

model:
  default: "anthropic/claude-sonnet-4"
  fallback:
    - "openai/gpt-5.5"
    - "openrouter/openai/gpt-4o"

If the primary fails, Mona automatically retries the request with a fallback provider.
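The fallback behaviour can be pictured with a short Python sketch. It assumes fallbacks are tried in the order they are listed and that any exception counts as a failure; neither detail is confirmed on this page. complete_with_fallback and fake_send are hypothetical names used only for illustration.

```python
from typing import Callable, List, Sequence


def complete_with_fallback(
    models: Sequence[str], send: Callable[[str], str]
) -> str:
    """Try each model in order and return the first successful response.

    Assumption: list order is retry order. A real client would catch
    narrower error types than Exception.
    """
    errors: List[str] = []
    for model in models:
        try:
            return send(model)
        except Exception as err:
            errors.append(f"{model}: {err}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_send(model: str) -> str:
    # Stand-in for a real API call: the primary provider is "down".
    if model.startswith("anthropic/"):
        raise ConnectionError("provider unavailable")
    return f"response from {model}"


chain = [
    "anthropic/claude-sonnet-4",
    "openai/gpt-5.5",
    "openrouter/openai/gpt-4o",
]
print(complete_with_fallback(chain, fake_send))
# response from openai/gpt-5.5
```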

Credential pools

For high-volume deployments, distribute requests across multiple API keys:

model:
  provider: openrouter
  credential_pool:
    - key: "${OPENROUTER_KEY_1}"
    - key: "${OPENROUTER_KEY_2}"
    - key: "${OPENROUTER_KEY_3}"

Keys are rotated round-robin.
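Round-robin rotation means each request takes the next key in the list and wraps back to the first after the last. A minimal Python sketch of that idea follows; the CredentialPool class is hypothetical and says nothing about how MonoClaw handles rate-limited or revoked keys.

```python
from itertools import cycle
from typing import Iterable


class CredentialPool:
    """Hand out API keys in a fixed, repeating order (illustrative only)."""

    def __init__(self, keys: Iterable[str]) -> None:
        key_list = list(keys)
        if not key_list:
            raise ValueError("credential pool must contain at least one key")
        self._keys = cycle(key_list)

    def next_key(self) -> str:
        # Each call advances to the following key, wrapping at the end.
        return next(self._keys)


pool = CredentialPool(["key-1", "key-2", "key-3"])
print([pool.next_key() for _ in range(4)])
# ['key-1', 'key-2', 'key-3', 'key-1']
```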

Model aliases

Create short aliases for frequently used models:

model:
  aliases:
    fast: "openai/gpt-5.4-mini"
    smart: "anthropic/claude-opus-4.7"
    cheap: "deepseek/deepseek-v4-flash"

Use them in chat:

/model fast
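Conceptually, an alias is a lookup from a short name to a full model identifier. The Python sketch below shows that lookup using the aliases from the example config. It assumes a name that is not an alias is passed through unchanged, which this page does not state, and resolve_model is a hypothetical helper.

```python
from typing import Mapping

# Aliases copied from the example config above.
ALIASES = {
    "fast": "openai/gpt-5.4-mini",
    "smart": "anthropic/claude-opus-4.7",
    "cheap": "deepseek/deepseek-v4-flash",
}


def resolve_model(name: str, aliases: Mapping[str, str] = ALIASES) -> str:
    """Return the full model ID for an alias, or the name itself.

    Assumption: unknown names are treated as full model IDs.
    """
    return aliases.get(name, name)


print(resolve_model("fast"))
# openai/gpt-5.4-mini
print(resolve_model("openai/gpt-5.5"))
# openai/gpt-5.5
```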

Verifying your setup

monoclaw doctor

This checks:

  • API key validity
  • Model availability
  • Context length compatibility
  • Provider endpoint reachability