Batch Processing

Batch processing lets you run Mona on multiple inputs in parallel. This is useful for data processing, evaluation, and generating training datasets.

Basic batch job

monoclaw batch run \
  --input data/questions.jsonl \
  --prompt "Answer the following question concisely:" \
  --output results/answers.jsonl

Input format (JSONL, one JSON object per line):

{"id": 1, "question": "What is the capital of France?"}
{"id": 2, "question": "Who wrote 1984?"}
{"id": 3, "question": "What is the speed of light?"}

Output format (JSONL):

{"id": 1, "answer": "Paris", "tokens_used": 45}
{"id": 2, "answer": "George Orwell", "tokens_used": 38}
{"id": 3, "answer": "299,792,458 m/s", "tokens_used": 52}
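Once a batch finishes, it can help to total up what was produced. The following is a minimal Python sketch, not part of monoclaw itself, that counts records and sums the tokens_used field from output lines in the format shown above. The function name summarize_results is hypothetical.

```python
import json
from typing import Iterable


def summarize_results(lines: Iterable[str]) -> dict:
    """Count records and total tokens_used across JSONL output lines."""
    count = 0
    total_tokens = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        count += 1
        # Records without tokens_used contribute zero.
        total_tokens += record.get("tokens_used", 0)
    return {"records": count, "total_tokens": total_tokens}


# Sample lines copied from the output format above.
sample = [
    '{"id": 1, "answer": "Paris", "tokens_used": 45}',
    '{"id": 2, "answer": "George Orwell", "tokens_used": 38}',
    '{"id": 3, "answer": "299,792,458 m/s", "tokens_used": 52}',
]
print(summarize_results(sample))  # {'records': 3, 'total_tokens': 135}
```

To run it on a real file, pass an open file handle, for example `with open("results/answers.jsonl") as f: summarize_results(f)`.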

Parallel processing

Set the number of concurrent workers with --workers:

monoclaw batch run \
  --input data/questions.jsonl \
  --prompt "Answer concisely" \
  --output results/answers.jsonl \
  --workers 10
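For readers unfamiliar with worker pools, the idea behind a worker count can be illustrated in plain Python. This is a generic sketch of bounded concurrency, not a description of how monoclaw schedules work internally. Both run_with_workers and fake_answer are hypothetical names, and fake_answer stands in for a real model call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_workers(items, handler, workers=10):
    """Apply handler to every item using at most `workers` threads.

    Results are returned in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handler, items))


def fake_answer(question: str) -> str:
    # Placeholder for a model call; it only upper-cases the text.
    return question.upper()


questions = ["what is the capital of france?", "who wrote 1984?"]
print(run_with_workers(questions, fake_answer, workers=2))
```

At most `workers` items are in flight at any moment, which is why a higher worker count shortens wall-clock time without changing the amount of work done.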

Checkpointing

Pass --checkpoint so that an interrupted batch can be resumed:

monoclaw batch run \
  --input data/questions.jsonl \
  --prompt "Answer concisely" \
  --output results/answers.jsonl \
  --checkpoint checkpoints/batch-001.json

If the batch is interrupted, rerun with the same checkpoint file to resume.
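After a resumed run, you may want to confirm that every input received an answer. The sketch below compares ids in the input and output files. It assumes each output record carries the id of its input, as in the formats shown earlier. It does not read the checkpoint file, whose format is not described here, and pending_ids is a hypothetical helper name.

```python
import json


def pending_ids(input_lines, output_lines):
    """Return ids from the input, in order, that have no record in the output."""
    done = set()
    for line in output_lines:
        if line.strip():
            done.add(json.loads(line)["id"])
    pending = []
    for line in input_lines:
        if line.strip():
            record_id = json.loads(line)["id"]
            if record_id not in done:
                pending.append(record_id)
    return pending


inputs = [
    '{"id": 1, "question": "What is the capital of France?"}',
    '{"id": 2, "question": "Who wrote 1984?"}',
    '{"id": 3, "question": "What is the speed of light?"}',
]
# Pretend the run stopped after the first two answers were written.
partial_output = [
    '{"id": 1, "answer": "Paris", "tokens_used": 45}',
    '{"id": 2, "answer": "George Orwell", "tokens_used": 38}',
]
print(pending_ids(inputs, partial_output))  # [3]
```

An empty list means the output covers every input id.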

Toolset distributions

Define named toolsets in a config file so that different kinds of input get different tools:

# batch-config.yaml
toolsets:
  default: [core, web]
  code_questions: [core, code_execution]
  research_questions: [core, web, browser]

Batch evaluation

Score model output against a ground-truth file using a chosen metric:

monoclaw batch eval \
  --input data/test-set.jsonl \
  --prompt "Answer the question" \
  --metric exact_match \
  --ground-truth data/answers.jsonl
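To make the idea of an exact-match metric concrete, here is a small Python sketch. It is an illustration under stated assumptions, not monoclaw's implementation: it assumes the ground-truth file uses the same "id" and "answer" fields as the output format, and it treats two answers as matching only when the strings are identical after trimming surrounding whitespace. The actual normalisation rules of the exact_match metric are not documented here. The names load_answers and exact_match_score are hypothetical.

```python
import json


def load_answers(lines):
    """Map id -> answer from JSONL lines (assumed fields: "id", "answer")."""
    answers = {}
    for line in lines:
        if line.strip():
            record = json.loads(line)
            answers[record["id"]] = record["answer"]
    return answers


def exact_match_score(predictions, references):
    """Fraction of references whose prediction is the same string after trimming."""
    if not references:
        return 0.0  # avoid dividing by zero on an empty reference set
    hits = 0
    for record_id, expected in references.items():
        predicted = predictions.get(record_id)
        # A missing prediction counts as a miss.
        if predicted is not None and predicted.strip() == expected.strip():
            hits += 1
    return hits / len(references)


predictions = load_answers([
    '{"id": 1, "answer": "Paris"}',
    '{"id": 2, "answer": "Orwell"}',
])
references = load_answers([
    '{"id": 1, "answer": "Paris"}',
    '{"id": 2, "answer": "George Orwell"}',
])
print(exact_match_score(predictions, references))  # 0.5
```

Strict string equality is the harshest reasonable choice: "Orwell" does not match "George Orwell", so this metric rewards answers that follow the expected format exactly.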

Generating training data

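Batch processing is well suited to generating reinforcement learning (RL) training datasets. Pass --save-tool-calls true to include tool calls in the saved output: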

monoclaw batch run \
  --input data/prompts.jsonl \
  --prompt "Solve the following problem step by step" \
  --output data/trajectories.jsonl \
  --save-tool-calls true

Configuration

# ~/.monoclaw/config.yaml
batch:
  default_workers: 5
  max_workers: 50
  checkpoint_interval: 100
  timeout: 300

Best practices

Start small: test with 10 inputs before running 10,000.
Use checkpointing: batches can take hours, so protect against crashes.
Monitor costs: parallel workers increase the rate of API usage, so costs accumulate faster.
Validate output: check a sample of results before using the full batch.
Use appropriate toolsets: don't give web access to code-only tasks.
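The first and fourth practices above can be scripted. The sketch below shows one way to carve out a small trial input and to draw a random sample of results for manual review. It uses only the Python standard library; first_n and sample_for_review are hypothetical helper names, and the seed is fixed only so that the same sample can be reproduced.

```python
import json
import random
from itertools import islice


def first_n(lines, n=10):
    """Return the first n non-empty lines, for a small trial run."""
    non_empty = (line for line in lines if line.strip())
    return list(islice(non_empty, n))


def sample_for_review(lines, k=5, seed=0):
    """Return up to k randomly chosen non-empty lines for manual inspection."""
    candidates = [line for line in lines if line.strip()]
    rng = random.Random(seed)  # seeded so the sample is repeatable
    return rng.sample(candidates, min(k, len(candidates)))


# Build 25 illustrative records in memory instead of reading a real file.
records = [json.dumps({"id": i, "question": f"Question {i}"}) for i in range(1, 26)]

trial = first_n(records, 10)
print(len(trial))  # 10

review = sample_for_review(records, k=5)
print(len(review))  # 5
```

Writing the result of first_n to a separate file gives a trial input for --input, and sample_for_review can be pointed at the lines of a finished output file.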