Klaus
by Robbie

Best AI Models for OpenClaw: What to Use and What It Costs

We see every OpenRouter invoice across our customer base. The difference between someone paying $8/month in AI costs and someone paying $80 isn’t how much they use their agent. It’s which model they picked.

The pricing here comes from Anthropic and OpenRouter, not from us, and the usage patterns come from watching real customers figure this out the hard way.

The short answer: Claude Sonnet 4.6 ($3/$15 per million tokens) handles most tasks well. Pair it with a budget model like Gemini 3.1 Flash Lite or DeepSeek V3.1 for heartbeats and simple queries. Most users spend $5-30/month on AI tokens.

How OpenClaw Uses AI Models

OpenClaw is free and open source. The framework itself costs nothing to download or run. But every time your agent reads a file, drafts an email, searches the web, or checks whether it has pending tasks, it sends a request to an AI model and pays for tokens.

You pick which model your agent uses. OpenClaw connects to model providers through OpenRouter, which gives you access to 300+ models through a single API key. You can also connect directly to providers like Anthropic or OpenAI.

Here’s what consumes tokens on a running OpenClaw instance:

  • Conversation turns. Every message you send and every response your agent generates.
  • Tool calls. Each time the agent uses a tool (web search, file read, API call), the tool description, input, and output all cost tokens.
  • System prompt. OpenClaw assembles a system prompt on every request that includes your bootstrap files (AGENTS.md, TOOLS.md, MEMORY.md), skill descriptions, and workspace context. This can be 10,000-50,000 tokens per request.
  • Heartbeats. OpenClaw agents send periodic “still alive” checks every 30 minutes. Each one costs tokens.
  • Sub-agents. When your agent spawns parallel workers, each one runs on your primary model by default.

Model choice matters more than server size. Hosting is a fixed $5-50/month; AI tokens scale with usage and with which model you pick.

Model Pricing Comparison

All prices are per million tokens via OpenRouter or direct provider APIs. Prices change frequently. Check OpenRouter’s model catalog for current numbers.

Premium Tier

These models handle complex reasoning, multi-step planning, and nuanced writing. You pay for it.

| Model | Input (per MTok) | Output (per MTok) | Context Window |
| --- | --- | --- | --- |
| Claude Opus 4.6 | $5.00 | $25.00 | 1M tokens |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M tokens |
| GPT-5.2 | $1.75 | $14.00 | 400K tokens |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M tokens |

Source: Anthropic pricing, OpenRouter models

Claude Sonnet 4.6 is where most of our customers land. It handles email automation, calendar management, research, and web browsing at roughly 60% of Opus’s cost.

Opus is for when you need it: complex multi-file coding, deep synthesis across long documents, or tasks where getting it slightly wrong costs more than the token difference. Running Opus for everything, including heartbeats, is like hiring a lawyer to check your mailbox. It works, but you notice when the bill arrives. Still, I use Opus for everything, as I have credits to spend and appreciate the reliability.

Mid-Tier

Strong performance at a fraction of premium pricing. These models handle daily work well.

| Model | Input (per MTok) | Output (per MTok) | Context Window |
| --- | --- | --- | --- |
| GPT-5.4 Mini | $0.75 | $4.50 | 400K tokens |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K tokens |
| Kimi K2.5 | $0.38 | $1.72 | 262K tokens |
| DeepSeek R1 | $0.70 | $2.50 | 64K tokens |
| GLM 5.1 | $1.00 | $3.20 | 202K tokens |

Source: OpenRouter models, Price Per Token community rankings

Kimi K2.5 and GLM 5.1 are the models the OpenClaw community is most excited about, judging by community voting data. Both offer strong coding and reasoning at a fraction of Sonnet's cost. The trade-off: context windows are smaller, and reliability on complex agentic workflows varies.

Claude Haiku 4.5 sits in a useful middle ground. Fast, cheap enough for high-volume routing, and capable enough that it doesn’t break on real tasks. Good for sorting, classification, and quick lookups.

Budget Tier

For heartbeats, simple queries, and any task where you need a pulse check, not a thesis.

| Model | Input (per MTok) | Output (per MTok) |
| --- | --- | --- |
| Step 3.5 Flash | $0.10 | $0.30 |
| Gemini 3.1 Flash Lite | $0.25 | $1.50 |
| DeepSeek V3.1 | $0.14 | $0.50 |

Source: OpenRouter models

DeepSeek V3.1 at $0.14/$0.50 per million tokens is a fraction of Claude Opus’s $5/$25. For a heartbeat check that returns “nothing pending,” the quality difference is negligible. The cost difference is not.

What Does Each Task Actually Cost?

Per-token pricing tables don't tell you much in isolation. What matters is what a task costs end to end: common OpenClaw tasks range from fractions of a cent to about $0.14 per interaction, depending on the model. Here's what typical tasks cost across three models, based on token counts from the OpenClaw documentation.

| Task | Typical Tokens (in / out) | Sonnet 4.6 | Haiku 4.5 | DeepSeek V3.1 |
| --- | --- | --- | --- | --- |
| Email draft | ~2,000 / ~500 | ~$0.014 | ~$0.005 | ~$0.0008 |
| Web scrape + summary | ~10,000 / ~1,000 | ~$0.045 | ~$0.015 | ~$0.003 |
| Research query (multi-step) | ~20,000 / ~5,000 | ~$0.14 | ~$0.045 | ~$0.008 |
| Calendar lookup | ~1,000 / ~200 | ~$0.006 | ~$0.002 | ~$0.0004 |
| Heartbeat (per check) | ~500 / ~100 | ~$0.003 | ~$0.001 | ~$0.0002 |
| Web browsing session | ~75,000 / ~3,000 | ~$0.27 | ~$0.09 | ~$0.02 |
| Heartbeat (monthly, 48/day) | | ~$4.32/mo | ~$1.44/mo | ~$0.29/mo |

Estimates based on typical token counts. Actual costs vary with context length, tool outputs, and conversation history.

The heartbeat row is where model routing pays for itself. An agent running Sonnet for heartbeats spends ~$4.32/month on checks that produce no useful output. Switch heartbeats to a budget model and that drops to pennies.
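The numbers in that heartbeat row are easy to reproduce. A sketch of the monthly math, assuming the ~500 in / ~100 out token counts from the table and the published per-MTok prices for Sonnet 4.6 and Haiku 4.5:

```python
def monthly_heartbeat_cost(in_tok: int, out_tok: int,
                           in_price: float, out_price: float,
                           checks_per_day: int = 48, days: int = 30) -> float:
    """Monthly cost of heartbeat checks, given per-MTok prices."""
    per_check = (in_tok * in_price + out_tok * out_price) / 1_000_000
    return per_check * checks_per_day * days

# ~500 input / ~100 output tokens per check, 48 checks a day:
print(round(monthly_heartbeat_cost(500, 100, 3.00, 15.00), 2))  # Sonnet 4.6 → 4.32
print(round(monthly_heartbeat_cost(500, 100, 1.00, 5.00), 2))   # Haiku 4.5 → 1.44
```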

A typical user sending 30 messages a day on Sonnet 4.6 spends about $15-25/month on AI tokens. The same usage pattern on Haiku 4.5 runs $5-8. On DeepSeek V3.1, around $1-2, though the quality difference shows on complex tasks.

The practical approach is not picking one model. It’s routing different tasks to different models.

How to Set Up Model Routing in OpenClaw

Model routing assigns different models to different task types, so your agent uses premium reasoning where it matters and cheap tokens where it doesn’t.

OpenClaw supports this through its configuration system. Here’s a simplified example of a tiered setup:

{
  "model": {
    "primary": "openrouter/anthropic/claude-sonnet-4-6",
    "fallbacks": ["openrouter/openai/gpt-5", "openrouter/google/gemini-3-flash"]
  }
}

This gives your agent a primary model with automatic fallback if the provider hits a rate limit or goes down. Without fallbacks, your agent stops working when Anthropic’s API has an outage.

For task-specific routing, you can assign different models to heartbeats, sub-agents, and other functions through the OpenClaw configuration: the cheapest capable model for heartbeats, a mid-tier model for sub-agents, and your best model for primary conversations.
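Sketching what that tiering might look like as a config, building on the fallback example above. OpenClaw's exact option names for per-task assignment vary by version, so the "heartbeat" and "subagents" keys, and any model slugs not shown earlier, are hypothetical placeholders rather than documented settings:

```python
import json

# Hypothetical tiered routing config. "primary" and "fallbacks" mirror the
# earlier example; "heartbeat" and "subagents" are illustrative placeholders
# for whatever per-task keys your OpenClaw version documents.
config = {
    "model": {
        "primary": "openrouter/anthropic/claude-sonnet-4-6",
        "fallbacks": ["openrouter/openai/gpt-5"],
        "heartbeat": "openrouter/deepseek/deepseek-chat",       # budget tier
        "subagents": "openrouter/anthropic/claude-haiku-4-5",   # mid tier
    }
}
print(json.dumps(config, indent=2))
```

The shape is the point, not the key names: every high-volume, low-stakes function gets pinned to a cheaper model so the primary model only sees real conversations.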

OpenRouter also offers an auto-routing option: set your model to openrouter/openrouter/auto and it routes based on prompt complexity. Simple prompts go to cheap models, complex ones go to capable models. Less control than manual configuration, but zero setup.

The pattern we see with Klaus customers: they start on a single model (usually Sonnet), get their first monthly bill, then set up tiered routing. The ones who configure routing from the start spend 50-80% less.

Saving Money on OpenClaw AI Costs

Beyond model routing, a few other things make a real difference.

Prompt caching. Anthropic’s prompt caching lets you cache frequently used context (system prompts, large documents) so subsequent requests read from cache at 10% of the normal input price. If your agent has a large system prompt (and most do), caching pays for itself after a single cache hit.
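The break-even math is straightforward. A sketch assuming Sonnet 4.6's $3/MTok input price and the 10% cache-read rate; it ignores the one-time cache-write surcharge Anthropic bills when the cache is first populated:

```python
SONNET_INPUT_PER_MTOK = 3.00
CACHE_READ_MULTIPLIER = 0.10  # cached reads bill at ~10% of the input price

def system_prompt_cost(tokens: int, cached: bool) -> float:
    """Per-request cost of resending a system prompt, cached or not."""
    rate = SONNET_INPUT_PER_MTOK * (CACHE_READ_MULTIPLIER if cached else 1.0)
    return tokens * rate / 1_000_000

# A 30,000-token system prompt, billed on every request:
print(round(system_prompt_cost(30_000, cached=False), 4))  # → 0.09
print(round(system_prompt_cost(30_000, cached=True), 4))   # → 0.009
```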

Heartbeat timing. The OpenClaw docs recommend setting your heartbeat interval just under the cache TTL window. If caching uses a 1-hour window, set heartbeats to 55 minutes. This keeps the cache warm between conversations, so your next real interaction loads cached context instead of paying full input price. I think heartbeats are overdone, and I put everything into cron jobs instead.

Context management. Long conversations accumulate context, and you pay for the full context window on every request. Use /compact to summarize and reduce context length when conversations get long, and keep skill descriptions concise, since they get injected into every request. Klaus exposes how many tokens your conversation has used directly in the chat UI. Once a conversation passes 150k tokens, I compact or start a fresh session.
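Why compacting matters so much: every new turn resends all previous turns, so total input cost grows roughly quadratically with conversation length. A toy model, assuming ~2,000 tokens of new context per turn on Sonnet 4.6's $3/MTok input rate (output tokens ignored for simplicity):

```python
def conversation_input_cost(turns: int, tokens_per_turn: int = 2_000,
                            input_per_mtok: float = 3.00) -> float:
    """Total input cost when turn N resends all N previous turns of context."""
    total_tokens = sum(n * tokens_per_turn for n in range(1, turns + 1))
    return total_tokens * input_per_mtok / 1_000_000

print(round(conversation_input_cost(10), 2))  # 10 turns → 0.33
print(round(conversation_input_cost(50), 2))  # 50 turns → 7.65, ~23x, not 5x
```

Five times the turns costs about twenty-three times as much, which is why compacting (or starting fresh) beats letting a session run indefinitely.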

Image resolution. If your agent processes screenshots, the default image resolution (1200px max dimension) might be higher than you need. Lowering imageMaxDimensionPx reduces vision token consumption for screenshot-heavy workflows.

At Klaus, the Starter plan includes $15 in AI credits (one-time) and Plus includes $30 (one-time). Pro includes $100/month in recurring credits. Plus and Pro also support BYOK (Bring Your Own Key) for OpenRouter, so you can manage your own billing directly.

Frequently Asked Questions

What is the cheapest AI model for OpenClaw?

Among the paid models covered here, Step 3.5 Flash ($0.10 input / $0.30 output per MTok) is the cheapest. Free models also exist, but the lineup changes frequently; a model is often free for a couple of weeks just after release.

Can I use free AI models with OpenClaw?

Yes. OpenRouter offers 25+ free models with rate limits (roughly 20 requests per minute, 200 per day). Good for testing your setup and experimenting with workflows. I don’t think free models are reliable enough for production use where your agent needs to respond consistently.

How much do OpenClaw AI tokens cost per month?

$5-30/month for most users. Light use (a few messages a day) on a budget model: under $5. Regular daily use on Claude Sonnet 4.6: $15-25. Heavy automation with a premium model and no routing: $50-150+. Model choice is the biggest variable, not usage volume. A single unmonitored automation running Opus can hit $100+ in a month.

Does Klaus include AI credits?

Starter ($19/mo) includes $15 in AI credits (one-time). Plus ($49/mo) includes $30 (one-time). Pro ($200/mo) includes $100 per month in recurring credits. Plus and Pro support BYOK (Bring Your Own Key) for OpenRouter, so you can manage model selection and billing independently.

Which model does Klaus use by default?

Klaus instances come pre-configured with Claude Sonnet as the primary model. Customers can change this to any model available through OpenRouter. We recommend Sonnet as the starting point because it handles the widest range of tasks at a reasonable cost.

Key Takeaways

  • Claude Sonnet 4.6 ($3/$15 per MTok) is a strong all-around choice for OpenClaw, handling most daily tasks at roughly 60% of Opus’s cost.
  • Model routing (different models for different tasks) cuts AI costs by 50-80% with no quality loss on the tasks that matter.
  • Heartbeats are a hidden cost driver. Switching heartbeats from a premium model to a budget model saves $4+ per month with zero quality impact.
  • The cheapest capable models (DeepSeek V3.1, Gemini 3.1 Flash Lite) cost under $1/month for heartbeats and simple tasks.
  • Most OpenClaw users spend $5-30/month on AI tokens. The biggest variable is model selection, not usage volume.
  • AI model pricing changes fast. Check the OpenRouter model catalog for current prices before making decisions.
  • Configure fallback models so your agent keeps working when a provider has an outage.

Want to try different models without managing API keys and routing yourself? Sign up at klausai.com.

For a full breakdown of OpenClaw costs including hosting and infrastructure, see our pricing breakdown. If you’re deciding between managed and self-hosted, start with our managed vs self-hosted comparison.

Sources

  • Anthropic. “Pricing.” Official Claude model pricing including prompt caching and batch API discounts. Accessed April 2026.
  • OpenRouter. “Pricing.” Platform pricing structure and model pass-through rates. Accessed April 2026.
  • OpenRouter. “Models.” Full catalog of 300+ available AI models with per-token pricing. Accessed April 2026.
  • OpenRouter. “Integration with OpenClaw.” Setup guide for connecting OpenRouter to OpenClaw. Accessed April 2026.
  • OpenClaw Documentation. “Token Use and Costs.” Official documentation on token consumption patterns, system prompt assembly, and cost optimization. Accessed April 2026.
  • OpenClaw Documentation. “Configuration Examples.” Model routing, fallback configuration, and task-specific model assignment. Accessed April 2026.
  • Price Per Token. “Best Model for OpenClaw.” Community-voted model rankings with pricing data for 521 tracked models. Accessed April 2026.
  • Klaus. Klaus pricing page. Self-reported tier pricing. Accessed April 2026.