Stop burning money on API calls — learn to read your dashboard, spot waste, and optimize like a pro
Turn what you learned into a concrete stack decision.
01AI Agent ToolboxBuild autonomous AI agents that can read, write, and actWant the shortlist in your inbox?
Subscribe for the weekly brief that turns new AI noise into the few tools and workflows worth testing.
Curated bundles that help you move from this guide into a working stack.
Guide
How to Build an AI Agent That Makes Money
From toy project to revenue — practical monetization patterns for autonomous AI agents
Guide
Claude Code + Agents: Advanced Configuration and Autonomous Workflows
CLAUDE.md architecture, subagents, hooks, MCP servers, and building Claude pipelines that run without you
Guide
Claude Cowork + Claude Agents: Your First Setup
How to use Claude as a real AI coworker — not just a chat window — with agents that take action on your behalf
Most AI builders launch an agent, celebrate when it works, then get a surprise bill at the end of the month. The Anthropic Console dashboard tells you exactly where your money is going — if you know how to read it.
This guide walks you through every panel on the dashboard, teaches you what the numbers actually mean, and shows you the optimization moves that can cut your costs by 50-90%.
This is your command center. Six numbers that tell the whole story at a glance:
| Metric | What It Means | What to Watch For | |--------|--------------|-------------------| | Messages | Total API calls (user + assistant turns) | If this is way higher than expected, something is looping | | Tool Calls | How many times your agent used tools | High tool calls = high token usage (each call adds context) | | Errors | Failed API calls | Above 5% means something is broken — fix it before optimizing cost | | Avg Tokens / Msg | Average input+output tokens per message | This is your main cost lever. Lower = cheaper | | Avg Cost / Msg | Dollar cost per API call | Multiply by daily message count = your monthly bill | | Sessions | Distinct conversations | More sessions with smaller context > fewer sessions with huge context |
The number that matters most: Avg Cost / Msg. If you're on Haiku at $0.008/msg, that's roughly $0.24/day for 30 messages — very healthy. If you're on Sonnet at $0.08/msg, same volume costs $2.40/day. Know your model, know your rate.
Three gauges below the overview:
How fast tokens are flowing. Useful for debugging latency issues. If your agent feels slow, check here — you might be hitting rate limits rather than having a code problem.
Percentage of failed calls. Common causes:
Target: under 1%. If you're above 3%, stop and fix errors before doing anything else. Every error is wasted money (you pay for the input tokens even when it fails).
This is the money metric. When Anthropic caches your prompt prefix, you pay 90% less for those cached tokens.
How to get high cache rates:
The line charts show your token usage over time against your rate limits. Two charts per model: Input Tokens and Output Tokens.
What a healthy chart looks like: Blue line stays well below the red limit. Green line (cache rate) stays high (above 80%).
What a problem looks like: Blue line touching or exceeding the red limit = you're being rate-limited. Your agent is waiting in queue, which means slower responses and sometimes errors.
Output tokens are 3-5x more expensive than input tokens. If your output line is climbing steeply, your agent is generating long responses. Consider:
max_tokens to cap response lengthShows which models are costing you money. If you see Sonnet handling tasks that Haiku could do, that's instant savings. Rule of thumb:
Usually just "anthropic" unless you're routing through a gateway like OpenRouter. Check that you're not accidentally double-paying through a middleman.
Which tools your agent calls most. Each tool call adds tokens (the tool definition, the call, the result). If one tool dominates:
If you run multiple agents, this shows which one is the money pit. Common pattern: your "main" agent costs 10x more than helpers because it holds all the context.
Fix: Break the main agent into focused sub-agents that each hold only the context they need.
Where API calls originate (WhatsApp, web, API direct, etc.). Useful for understanding which product surface drives the most cost.
Structure your system prompt so it's identical across calls. Anthropic auto-caches prompt prefixes that are reused. A 99% cache hit rate means you're paying $0.025/M instead of $0.25/M for input tokens on Haiku.
Don't use Sonnet for everything. Route simple tasks to Haiku:
User asks simple question → Haiku ($0.008/msg)
User asks complex question → Sonnet ($0.08/msg)
A basic classifier (even a regex) saves 80% on routine messages.
Every token in your context window costs money on every call. Trim aggressively:
Output tokens cost 5x more than input tokens. Control them:
max_tokens appropriately (don't leave it at 4096 if you need 200)Every error wastes the full input cost with zero value. Fix errors first, optimize second. A 3% error rate on 1000 daily messages = 30 wasted calls/day.
Quick formula to estimate your monthly spend:
Monthly Cost = (daily_messages) x (avg_cost_per_msg) x 30
| Daily Messages | Haiku ($0.008/msg) | Sonnet ($0.08/msg) | |---------------|--------------------|--------------------| | 30 | $7.20/mo | $72/mo | | 100 | $24/mo | $240/mo | | 500 | $120/mo | $1,200/mo | | 1000 | $240/mo | $2,400/mo |
These assume good caching. Without caching, multiply by 3-5x.
The difference between a $50/month AI agent and a $500/month one is usually not the features — it's whether the builder knows how to read this dashboard.