MonoClaw

Modal

Modal is a serverless compute platform that lets you run Mona on GPUs and CPUs in the cloud, scaling from zero to thousands of workers.

When to use Modal

  • You need GPU acceleration for local model inference
  • You want ephemeral, isolated execution environments
  • You want to scale concurrent agent sessions horizontally
  • You don't want to manage servers

Installation

The Modal extra is included in the [all] install. If you installed the minimal bundle:

cd ~/.monoclaw/monoclaw-runtime
uv pip install -e ".[modal]"

Authentication

  1. Sign up at modal.com
  2. Install the Modal CLI: pip install modal
  3. Run modal token new to authenticate

Configure MonoClaw

Set Modal as your terminal backend:

monoclaw config set terminal.backend modal

Or configure in config.yaml:

terminal:
  backend: modal
  modal:
    app_name: "monoclaw-agent"
    gpu: "a10g"        # or "t4", "a100", "h100"
    cpu: 4
    memory: 16384      # MB
    timeout: 3600      # seconds

How it works

When Mona needs to run a command:

  1. MonoClaw spins up a Modal sandbox
  2. The command executes inside the sandbox
  3. Output streams back to Mona in real time
  4. The sandbox shuts down when idle (or stays warm if configured)

Cost optimization

Modal charges only for compute time. Tips to minimize costs:

  • Use keep_warm: 1 to keep one sandbox warm for fast responses
  • Use cheaper GPUs (t4) for light workloads
  • Set timeout low to prevent runaway processes
  • Use spot instances for non-critical tasks

Example: GPU-accelerated local model

Run a local model inside Modal:

model:
  default: "custom"
  custom:
    endpoint: "https://your-modal-app.modal.run/v1"
    api_key: "${MODAL_API_KEY}"

Deploy an OpenAI-compatible endpoint on Modal using vLLM or TGI, then point Mona at it.

Limitations

  • Cold start latency: 5–30 seconds for the first command if no sandbox is warm
  • Network: Sandboxes have internet access but no persistent local storage
  • Secrets: Pass secrets via environment variables, not command arguments