Configuring Models

The model you choose determines Mona's capabilities, cost, and latency. MonoClaw supports 20+ providers and any OpenAI-compatible endpoint.

Quick setup

monoclaw model

This interactive wizard walks you through provider selection, model choice, and API key configuration.

Supported providers

Provider	Setup	Notes
OpenRouter	API key	Multi-provider routing, free tier available
Anthropic	API key	Claude models, high quality
OpenAI	API key	GPT models, Codex
DeepSeek	API key	Cost-effective, strong reasoning
Kimi / Moonshot	API key	Coding specialist
Alibaba Cloud	API key	Qwen models
Hugging Face	HF_TOKEN	20+ open models
AWS Bedrock	IAM / `aws configure`	Enterprise-grade
NVIDIA NIM	API key	Nemotron models
GitHub Copilot	OAuth / token	Copilot subscription models
Custom Endpoint	Base URL + key	VLLM, SGLang, Ollama, etc.

Setting API keys

monoclaw config set OPENROUTER_API_KEY sk-or-...
monoclaw config set ANTHROPIC_API_KEY sk-ant-...
monoclaw config set OPENAI_API_KEY sk-...

Secrets are stored in ~/.monoclaw/.env.

Minimum context requirement

Mona requires 64,000 tokens of context minimum. Models with smaller windows cannot maintain enough working memory for multi-step tool-calling workflows and will be rejected at startup.

Most hosted models meet this easily:

Claude: 200K–1M tokens
GPT-5.5: 1M tokens
Gemini: 1M tokens
Qwen3: 262K tokens
DeepSeek V4: 1M tokens

If you're running a local model, set its context size to at least 64K:

# llama.cpp
./server --ctx-size 65536

# Ollama
ollama run llama3 --ctx-size 65536

Custom endpoints

For self-hosted models (VLLM, SGLang, Ollama, llama.cpp):

monoclaw config set model.custom.endpoint "http://localhost:8000/v1"
monoclaw config set model.custom.api_key "sk-local"
monoclaw config set model.custom.name "local-llama"

Or in config.yaml:

model:
  default: "custom"
  custom:
    endpoint: "http://localhost:8000/v1"
    api_key: "${LOCAL_API_KEY}"
    name: "local-llama"
    context_length: 65536

Provider routing

Configure fallback providers if your primary is unavailable:

model:
  default: "anthropic/claude-sonnet-4"
  fallback:
    - "openai/gpt-5.5"
    - "openrouter/openai/gpt-4o"

Mona will automatically retry with the fallback provider if the primary fails.

Credential pools

For high-volume deployments, distribute requests across multiple API keys:

model:
  provider: openrouter
  credential_pool:
    - key: "${OPENROUTER_KEY_1}"
    - key: "${OPENROUTER_KEY_2}"
    - key: "${OPENROUTER_KEY_3}"

Keys are rotated round-robin.

Model aliases

Create short aliases for frequently used models:

model:
  aliases:
    fast: "openai/gpt-5.4-mini"
    smart: "anthropic/claude-opus-4.7"
    cheap: "deepseek/deepseek-v4-flash"

Use them in chat:

/model fast

Verifying your setup

monoclaw doctor

This checks:

API key validity
Model availability
Context length compatibility
Provider endpoint reachability