Choosing Models for OpenClaw

How OpenClaw Uses Models

Your OpenClaw agent needs a "brain" — a large language model (LLM) that processes messages and generates responses. The gateway routes every incoming message (from Telegram, Discord, or web) to your configured model provider, gets a response, and sends it back.

You choose which model to use, and you can change it anytime. Different models have different capabilities, speeds, and costs.

The Default: Moonshot / Kimi

OpenClaw defaults to Moonshot's Kimi models. Here's why:

Why Moonshot?

Free tier available — enough for personal use and experimenting
Large context window — Kimi K2.5 supports 256K tokens (most models cap at 128K)
Good general performance — handles conversation, coding, analysis, and creative tasks
Low latency — fast response times for interactive chat

Available Kimi Models

| Model | Context Window | Max Output | Best For | |-------|---------------|------------|----------| | Kimi K2.5 | 256K tokens | 8,192 tokens | General tasks, coding, conversation | | Kimi K2 (preview) | 256K tokens | 8,192 tokens | Fallback option |

Getting a Moonshot API Key

Go to platform.moonshot.ai
Sign up for a free account
Navigate to API Keys
Generate a new key
Add it during openclaw setup or manually in your agent config

Other Supported Providers

OpenClaw isn't locked to one provider. You can configure any OpenAI-compatible API:

Claude (Anthropic)

The most capable coding and reasoning model. Significantly better than Kimi for complex tasks.

| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | Claude Haiku 4.5 | 200K | $0.80 in / $4 out | | Claude Sonnet 4.6 | 200K | $3 in / $15 out | | Claude Opus 4.6 | 200K | $15 in / $75 out |

When to use: Complex coding tasks, multi-step reasoning, working with large codebases.

Setup:

{
  "providers": {
    "anthropic": {
      "baseUrl": "https://api.anthropic.com/v1",
      "apiKey": "sk-ant-...",
      "models": [
        {
          "id": "claude-sonnet-4-6",
          "name": "Claude Sonnet",
          "contextWindow": 200000,
          "maxTokens": 8192
        }
      ]
    }
  }
}

GPT (OpenAI)

Broad ecosystem, good at creative tasks and general knowledge.

| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | GPT-4.1 | 1M | $2 in / $8 out | | GPT-4.1 mini | 1M | $0.40 in / $1.60 out | | GPT-4.1 nano | 1M | $0.10 in / $0.40 out |

When to use: Creative writing, broad knowledge queries, cost-sensitive applications.

DeepSeek

Strong open-weight models with excellent coding performance at a fraction of the cost.

| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | DeepSeek V3 | 128K | $0.27 in / $1.10 out | | DeepSeek R1 | 128K | $0.55 in / $2.19 out |

When to use: Cost-sensitive coding tasks, reasoning-heavy problems (R1), or as a cheap daily-driver model.

Setup: Use any OpenAI-compatible config with baseUrl: "https://api.deepseek.com/v1".

Qwen (Alibaba)

Another strong open-weight family. Available via API or run locally.

| Model | Context | Cost (per 1M tokens) | |-------|---------|---------------------| | Qwen 2.5 72B (via API) | 128K | ~$0.90 in / $0.90 out | | Qwen 2.5 (local via Ollama) | 32-128K | Free (hardware cost) |

When to use: Budget-friendly alternative, privacy-sensitive tasks (local), or if you want a non-US provider.

Local Models (Ollama)

Run models on your own hardware. Zero API costs, complete privacy.

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull qwen2.5:32b    # or llama3.1, deepseek-r1, etc.

# Configure in OpenClaw
# baseUrl: http://localhost:11434/v1

When to use: Privacy-sensitive tasks, offline operation, experimenting without API costs.

Trade-off: Local models need serious hardware (32GB+ RAM for good models). Good for simple tasks on modest hardware, competitive with cloud APIs on high-end machines.

Configuring Models

Main config: openclaw.json

Your model configuration lives in ~/.openclaw/openclaw.json under the models section:

{
  "models": {
    "providers": {
      "moonshot": {
        "baseUrl": "https://api.moonshot.ai/v1",
        "models": [
          {
            "id": "kimi-k2.5",
            "name": "Kimi K2.5",
            "contextWindow": 256000,
            "maxTokens": 8192
          }
        ]
      }
    }
  }
}

Per-agent model assignment

Each agent can use a different model. This is configured in the agent's directory:

~/.openclaw/agents/main/agent/models.json    # Main agent's model config
~/.openclaw/agents/kimi-coding/agent/models.json  # Secondary agent

This means you can have:

A main agent running Claude (for complex tasks)
A coding agent running Kimi (for quick code generation)
A research agent running GPT-4o mini (for cheap, broad queries)

Setting the default model

In openclaw.json, under agents.defaults:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "moonshot/kimi-k2.5",
        "fallbacks": [
          "moonshot/kimi-k2-0905-preview"
        ]
      }
    }
  }
}

The fallback chain means if the primary model is down or rate-limited, OpenClaw automatically tries the next one.

Cost-Saving Tips

Use cheap models for background tasks

Your agent doesn't need its best model for everything. Configure cheaper models for:

Heartbeat checks — periodic wake-ups that just scan for updates
Cron jobs — scheduled tasks like daily summaries
Group chats — where responses don't need to be as sophisticated

Monitor your usage

Check your API provider's dashboard regularly. Most providers show:

Total tokens used
Cost per day/week/month
Rate limit status

The free tier strategy

Start with Moonshot's free tier. It's enough for:

Personal conversations
Simple automation
Learning the system

Upgrade to a paid provider only when you need:

Better code generation (→ Claude)
Longer responses (→ increase maxTokens)
Higher rate limits (→ paid tier of any provider)

Model switching

You can change models at any time without losing memory or configuration. Your agent's personality (SOUL.md, IDENTITY.md) is stored as files, not in the model. Switching from Kimi to Claude just means your agent gets a smarter brain — same personality, same memory.

Quick Comparison

| Factor | DeepSeek V3 | Claude Sonnet 4.6 | GPT-4.1 | Qwen 2.5 (local) | |--------|------------|-------------------|---------|------------------| | Cost | ~$0.27-1.10/M | ~$3-15/M tokens | ~$2-8/M tokens | Free (hardware cost) | | Quality | Very Good | Excellent | Very Good | Good-Very Good | | Speed | Fast | Fast | Fast | Depends on hardware | | Context | 128K | 200K | 1M | 32-128K | | Privacy | Cloud | Cloud | Cloud | Local only | | Best for | Budget coding | Complex coding | Large context | Offline/privacy |

Our recommendation: Start with DeepSeek V3 for an affordable daily driver, upgrade to Claude Sonnet when you're serious about complex coding tasks, and keep a local Ollama model as an offline fallback.

Next: Learn what each workspace file does in The .md Files That Define Your Agent.

→ Ask the index what to build your openclaw stack

→ Free credits for these tools

Written by McKlaud AI. Want to know which AI tools actually fit your business? Get a free AI audit.