Which LLM to use as your agent's brain — free options, paid upgrades, and cost-saving tricks
Turn what you learned into a concrete stack decision.
01AI Agent ToolboxBuild autonomous AI agents that can read, write, and actWant the shortlist in your inbox?
Subscribe for the weekly brief that turns new AI noise into the few tools and workflows worth testing.
Curated bundles that help you move from this guide into a working stack.
Guide
Installing OpenClaw: Zero to Running Agent
Get your own AI agent with personality, memory, and Telegram access in under 30 minutes
Guide
The .md Files That Define Your Agent
IDENTITY, SOUL, MEMORY, and the other files that make OpenClaw agents unique
Guide
OpenClaw Infrastructure: Dedicated Machine vs. Cloud VPS (Advanced)
LaunchAgents, PM2, systemd, SSH tunnels — build a production-grade OpenClaw deployment that survives reboots and network drops
Your OpenClaw agent needs a "brain" — a large language model (LLM) that processes messages and generates responses. The gateway routes every incoming message (from Telegram, Discord, or web) to your configured model provider, gets a response, and sends it back.
You choose which model to use, and you can change it anytime. Different models have different capabilities, speeds, and costs.
OpenClaw defaults to Moonshot's Kimi models. Here's why:
| Model | Context Window | Max Output | Best For | |-------|---------------|------------|----------| | Kimi K2.5 | 256K tokens | 8,192 tokens | General tasks, coding, conversation | | Kimi K2 (preview) | 256K tokens | 8,192 tokens | Fallback option |
openclaw setup or manually in your agent configOpenClaw isn't locked to one provider. You can configure any OpenAI-compatible API:
The most capable coding and reasoning model. Significantly better than Kimi for complex tasks.
| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | Claude Haiku 4.5 | 200K | $0.80 in / $4 out | | Claude Sonnet 4.6 | 200K | $3 in / $15 out | | Claude Opus 4.6 | 200K | $15 in / $75 out |
When to use: Complex coding tasks, multi-step reasoning, working with large codebases.
Setup:
{
"providers": {
"anthropic": {
"baseUrl": "https://api.anthropic.com/v1",
"apiKey": "sk-ant-...",
"models": [
{
"id": "claude-sonnet-4-6",
"name": "Claude Sonnet",
"contextWindow": 200000,
"maxTokens": 8192
}
]
}
}
}
Broad ecosystem, good at creative tasks and general knowledge.
| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | GPT-4.1 | 1M | $2 in / $8 out | | GPT-4.1 mini | 1M | $0.40 in / $1.60 out | | GPT-4.1 nano | 1M | $0.10 in / $0.40 out |
When to use: Creative writing, broad knowledge queries, cost-sensitive applications.
Strong open-weight models with excellent coding performance at a fraction of the cost.
| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | DeepSeek V3 | 128K | $0.27 in / $1.10 out | | DeepSeek R1 | 128K | $0.55 in / $2.19 out |
When to use: Cost-sensitive coding tasks, reasoning-heavy problems (R1), or as a cheap daily-driver model.
Setup: Use any OpenAI-compatible config with baseUrl: "https://api.deepseek.com/v1".
Another strong open-weight family. Available via API or run locally.
| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | Qwen 2.5 72B (via API) | 128K | ~$0.90 in / $0.90 out | | Qwen 2.5 (local via Ollama) | 32-128K | Free (hardware cost) |
When to use: Budget-friendly alternative, privacy-sensitive tasks (local), or if you want a non-US provider.
Run models on your own hardware. Zero API costs, complete privacy.
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull qwen2.5:32b # or llama3.1, deepseek-r1, etc.
# Configure in OpenClaw
# baseUrl: http://localhost:11434/v1
When to use: Privacy-sensitive tasks, offline operation, experimenting without API costs.
Trade-off: Local models need serious hardware (32GB+ RAM for good models). Good for simple tasks on modest hardware, competitive with cloud APIs on high-end machines.
Your model configuration lives in ~/.openclaw/openclaw.json under the models section:
{
"models": {
"providers": {
"moonshot": {
"baseUrl": "https://api.moonshot.ai/v1",
"models": [
{
"id": "kimi-k2.5",
"name": "Kimi K2.5",
"contextWindow": 256000,
"maxTokens": 8192
}
]
}
}
}
}
Each agent can use a different model. This is configured in the agent's directory:
~/.openclaw/agents/main/agent/models.json # Main agent's model config
~/.openclaw/agents/kimi-coding/agent/models.json # Secondary agent
This means you can have:
In openclaw.json, under agents.defaults:
{
"agents": {
"defaults": {
"model": {
"primary": "moonshot/kimi-k2.5",
"fallbacks": [
"moonshot/kimi-k2-0905-preview"
]
}
}
}
}
The fallback chain means if the primary model is down or rate-limited, OpenClaw automatically tries the next one.
Your agent doesn't need its best model for everything. Configure cheaper models for:
Check your API provider's dashboard regularly. Most providers show:
Start with Moonshot's free tier. It's enough for:
Upgrade to a paid provider only when you need:
You can change models at any time without losing memory or configuration. Your agent's personality (SOUL.md, IDENTITY.md) is stored as files, not in the model. Switching from Kimi to Claude just means your agent gets a smarter brain — same personality, same memory.
| Factor | DeepSeek V3 | Claude Sonnet 4.6 | GPT-4.1 | Qwen 2.5 (local) | |--------|------------|-------------------|---------|------------------| | Cost | ~$0.27-1.10/M | ~$3-15/M tokens | ~$2-8/M tokens | Free (hardware cost) | | Quality | Very Good | Excellent | Very Good | Good-Very Good | | Speed | Fast | Fast | Fast | Depends on hardware | | Context | 128K | 200K | 1M | 32-128K | | Privacy | Cloud | Cloud | Cloud | Local only | | Best for | Budget coding | Complex coding | Large context | Offline/privacy |
Our recommendation: Start with DeepSeek V3 for an affordable daily driver, upgrade to Claude Sonnet when you're serious about complex coding tasks, and keep a local Ollama model as an offline fallback.
Next: Learn what each workspace file does in The .md Files That Define Your Agent.