Best Open Source Models for OpenClaw

GLM-5 and MiniMax M2.5 are the best open source models for running OpenClaw. This guide covers setup, pricing, risks of using Claude Code or Gemini CLI subscriptions, and why these two models beat the rest.

I’ve been running OpenClaw for a while now and tried a bunch of different models with it. After swapping between providers and watching my API bills, two models stood out: GLM-5 and MiniMax M2.5.

Below I’ll go through why I landed on these two, how to set them up, and why routing your Claude Code or Gemini CLI subscription through OpenClaw is a bad idea.

Subscription Risk Warning

Using your Claude Code, Gemini CLI, or ChatGPT/Codex subscription OAuth tokens with OpenClaw can get your account banned. Anthropic, Google, and OpenAI monitor for automated usage patterns that fall outside normal CLI use. Stick with API keys instead.

Why Open Source Models Make Sense for OpenClaw

OpenClaw runs 24/7 on your server. It handles messages, runs scheduled jobs, and executes skills nonstop. That kind of continuous usage gets expensive with proprietary models.

Open source models through API providers give you:

  • Predictable costs: Pay per token, no surprise subscription overages
  • No ban risk: API access is designed for automated use
  • Model flexibility: Swap models anytime through config changes
  • Better rate limits: API tiers usually offer higher throughput than subscription OAuth

If you’re new to OpenClaw, our setup guide walks through the full installation process. For a broader look at alternatives, check out our OpenClaw alternatives roundup.

The Risks of Using Claude Code or Gemini CLI Subscriptions

Let me get this out of the way first because I see people asking about it constantly.

OpenClaw supports OAuth tokens from Claude Code, Gemini CLI, and OpenAI Codex. This means you can technically use your $20/month Claude Pro subscription or your Google AI subscription to power OpenClaw instead of paying for API credits. It works. But you’re playing with fire.

Why you can get banned

Anthropic, Google, and OpenAI all have terms of service that restrict how their subscription tokens can be used. When you route a Claude Code OAuth token through OpenClaw, here’s what happens:

  • Usage patterns change: Normal Claude Code use looks like a developer typing in a terminal. OpenClaw sends automated requests around the clock, often in bursts when scheduled jobs fire
  • Token volume spikes: An always-on assistant burns through more tokens than a human coding session
  • Rate limit abuse: OpenClaw may retry failed requests or send parallel calls that look like scraping
  • IP and fingerprinting: Requests from a VPS data center look different from a laptop on residential internet

What happens when you get banned

| Platform | Consequence | Recovery |
| --- | --- | --- |
| Claude Code | Account suspended, subscription cancelled | Appeal process exists but no guarantee |
| Gemini CLI | Google account flagged, API access revoked | Could affect other Google services |
| OpenAI Codex | Account banned, subscription terminated | Limited appeal options |

Real Risk

People have had their accounts suspended within days of routing subscription tokens through automated tools. The detection is getting better, not worse. Your Claude Max subscription is not worth losing over $15 in saved API costs.

What to do instead

Use API keys. That’s it. Every provider offers pay-as-you-go API access that’s designed for automated usage.

Or better yet, use open source models where this isn’t even a concern. Which brings us to the actual recommendations.

1. GLM-5: The one I use for serious work

GLM-5 from Z.AI is the strongest open source model I’ve tested. It reasons well, barely hallucinates, and can actually handle multi-step agent tasks without falling apart halfway through.

Why GLM-5 works for OpenClaw

OpenClaw isn’t just a chatbot. It plans tasks, calls tools, runs scripts, and keeps context across long conversations. GLM-5 handles that kind of work reliably.

  • 95.8% SWE-bench Verified: The highest coding score among open source models, which matters when OpenClaw runs code on your server
  • Near-zero hallucinations: Scores -1 on AA-Omniscience Index. When your assistant runs commands on a live server, you want it to be right
  • Agentic task handling: ELO 1,412 on GDPval-AA — only Claude Opus 4.6 and GPT-5.2 score higher
  • 200K context window: Enough for OpenClaw’s conversation history plus tool outputs
  • MIT license: No usage restrictions, fully compatible with automated workflows

Technical specs

| Feature | GLM-5 |
| --- | --- |
| Total Parameters | 744B (MoE) |
| Active Parameters | 40B |
| Context Length | 200K tokens |
| Architecture | MoE with Sparse Attention |
| Input Cost | $0.80/M tokens |
| Output Cost | $2.56/M tokens |
| License | MIT |

Setting up GLM-5 with OpenClaw

Edit your OpenClaw config to use GLM-5:

```
openclaw configure --section models
```

Or edit the config file directly:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "z-ai/glm-5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    }
  }
}
```

Then restart the gateway:

```
openclaw gateway restart
```

GLM-5 is available through OpenRouter and directly through the Z.AI API. For OpenClaw, I’d recommend using the Z.AI coding plans since they’re priced for developer workloads.
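If you want to sanity-check your API key outside OpenClaw first, here’s a minimal sketch of a direct OpenRouter request. It uses OpenRouter’s standard OpenAI-style chat-completions endpoint; the model slug matches the config above, and the script only sends if `OPENROUTER_API_KEY` is set:

```python
import json
import os
import urllib.request

# Build a standard chat-completions payload for GLM-5 via OpenRouter.
payload = {
    "model": "z-ai/glm-5",
    "messages": [{"role": "user", "content": "Reply with the word: ready"}],
}

key = os.environ.get("OPENROUTER_API_KEY")
if key:
    # Send the request only when a key is available.
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
else:
    print("Set OPENROUTER_API_KEY to send; payload:", json.dumps(payload))
```

If this round-trips, OpenClaw’s gateway should be able to reach the model with the same credentials.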

GLM Coding Plans

Z.AI offers GLM Coding Plans with pricing designed for developers running continuous workloads like OpenClaw.

Where GLM-5 shines in OpenClaw

  • Scheduled tasks: Morning briefings, server monitoring, cron jobs that need to actually work
  • Coding skills: When you tell OpenClaw to run scripts or manage Docker containers, GLM-5’s coding accuracy matters
  • Research tasks: Pairs well with DuckDuckGo search integration for pulling and summarizing web results
  • Multi-step workflows: Complex skills that chain multiple tool calls together
  • Long conversations: Keeps context across extended sessions without losing the thread

2. MiniMax M2.5: The cheap one that’s surprisingly good

MiniMax M2.5 costs a fraction of GLM-5 and still scores 80.2% on SWE-bench Verified. For an always-on assistant, that price difference matters a lot over a month.

Why MiniMax M2.5 works for OpenClaw

Most OpenClaw interactions don’t need the absolute best model. Quick questions, reminders, file management, simple research. A cheaper model handles these fine. MiniMax M2.5 is cheap enough that you forget it’s costing anything.

  • 80.2% SWE-bench Verified: Beats Claude Opus 4.6 on Droid scaffold, more than enough for OpenClaw tasks
  • $0.15/M input tokens: The cheapest frontier model available. Running OpenClaw for a month might cost a few dollars
  • Two speed tiers: M2.5 at 50 tokens/second and Lightning at 100 tokens/second
  • 10+ programming languages: Go, C, C++, TypeScript, Rust, Python, Java, and more
  • Agent framework support: Works with Claude Code, Droid, Cline, and similar agent tools

Technical specs

| Feature | MiniMax M2.5 |
| --- | --- |
| Architecture | Mixture-of-Experts (MoE) |
| Context Length | 200K tokens |
| M2.5 Input Cost | $0.15/M tokens (50 TPS) |
| M2.5 Output Cost | $1.20/M tokens (50 TPS) |
| Lightning Input Cost | $0.30/M tokens (100 TPS) |
| Lightning Output Cost | $2.40/M tokens (100 TPS) |

Setting up MiniMax M2.5 with OpenClaw

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/m2.5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    }
  }
}
```

For faster responses (good for interactive chat), use the Lightning variant:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/m2.5-lightning"
      }
    }
  }
}
```

Cost Breakdown

Running MiniMax M2.5 at 50 TPS continuously for an hour costs about $0.30. That means running OpenClaw 24/7 for a full month on MiniMax M2.5 would cost roughly $7-15 depending on actual usage. Compare that to $50-150/month with Claude API.
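The $0.30/hour figure is easy to verify from the pricing table above. A quick back-of-envelope calculation on output tokens (which dominate cost during sustained generation):

```python
# Sanity-check the ~$0.30/hour figure for MiniMax M2.5 at 50 TPS.
TPS = 50                      # sustained output tokens per second
OUTPUT_PRICE_PER_M = 1.20     # USD per million output tokens

tokens_per_hour = TPS * 3600  # 180,000 output tokens in one hour
output_cost = tokens_per_hour / 1e6 * OUTPUT_PRICE_PER_M

print(round(output_cost, 3))  # 0.216
```

Output tokens alone come to about $0.22/hour; input tokens for the prompt and context push the practical total toward $0.30.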

Where MiniMax M2.5 shines in OpenClaw

  • Always-on chat: Cheap enough that you don’t think twice about sending messages
  • Quick tasks: Reminders, file management, simple lookups
  • Coding assistance: 80.2% SWE-bench means it handles most coding skills reliably
  • Scheduled jobs: Morning briefings, server checks, cron tasks
  • Fallback model: Works well as a secondary model when your primary hits rate limits

GLM-5 vs MiniMax M2.5: head to head

Here’s how they compare for OpenClaw use:

| Feature | GLM-5 | MiniMax M2.5 |
| --- | --- | --- |
| SWE-bench Verified | 95.8% | 80.2% |
| Input Cost | $0.80/M tokens | $0.15/M tokens |
| Output Cost | $2.56/M tokens | $1.20/M tokens |
| Context Length | 200K | 200K |
| Hallucination Rate | Near-zero | Low |
| Speed | Standard | 50-100 TPS |
| Best For | Complex tasks, coding, research | Always-on chat, budget usage |
| Monthly Cost (estimated) | $30-60 | $7-15 |

What I actually run

Use both. Set GLM-5 as primary and MiniMax M2.5-Lightning as fallback:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "z-ai/glm-5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    }
  }
}
```

GLM-5 does the hard stuff, MiniMax catches anything that fails or gets rate-limited. For simpler skills and scheduled tasks, you can also point specific agents at MiniMax directly to save money.
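The config examples above only show the `agents.defaults` block, so treat the per-agent override below as a hypothetical illustration of that idea: the `briefing` agent name and the `agents.briefing` key are assumptions, not documented schema — check your OpenClaw version’s config reference for the exact shape.

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "z-ai/glm-5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    },
    "briefing": {
      "model": {
        "primary": "minimax/m2.5"
      }
    }
  }
}
```

The intent: expensive reasoning stays on GLM-5 by default, while a cheap scheduled task like a morning briefing runs on MiniMax.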

Setting up DuckDuckGo search with your models

Both GLM-5 and MiniMax M2.5 work with OpenClaw’s web search. If you haven’t set that up yet, our DuckDuckGo OpenClaw search guide walks through it.

Web search matters more than you’d think for an always-on assistant. Without it, OpenClaw is limited to whatever the model learned during training. With it, OpenClaw can grab current information when answering questions or running research tasks.

Cost comparison: open source vs subscriptions

Here’s what you’d actually spend per month running OpenClaw with different approaches:

| Approach | Monthly Cost | Ban Risk | Notes |
| --- | --- | --- | --- |
| Claude Code OAuth | $20-200 (subscription) | High | Terms violation, account suspension risk |
| Gemini CLI OAuth | $0-20 (subscription) | High | Could affect your Google account |
| Claude API (Sonnet 4.5) | $50-150 | None | Expensive for 24/7 use |
| GLM-5 API | $30-60 | None | Best performance, reasonable cost |
| MiniMax M2.5 API | $7-15 | None | Best value, strong performance |
| GLM-5 + M2.5 combo | $20-40 | None | What I run |

The API route costs less than a Claude subscription and you don’t risk losing your account. For most OpenClaw setups, the GLM-5 and MiniMax M2.5 combo is the obvious pick.

Tips from running this setup

A few things I learned the hard way:

Prompt tuning

GLM-5 and MiniMax M2.5 respond differently to the same prompt. GLM-5 does better with clear, structured instructions. MiniMax M2.5 handles casual requests fine but needs explicit formatting instructions when you want structured output.

Edit your ~/.openclaw/workspace/SOUL.md to include model-specific guidance:

```markdown
When executing tasks:
- Break complex requests into clear steps
- Confirm before running destructive commands
- Use structured output for research results
```

Context management

Both models support 200K tokens, but shorter conversations give better results. Use /compact when things get long, or /new to start fresh.

Monitoring costs

Watch your API usage for the first week. OpenClaw’s scheduled jobs and background tasks burn more tokens than you’d guess, so check your provider dashboards regularly.

Fallback configuration

Always set a fallback model. API providers go down. If GLM-5 is out at 3 AM and you have a morning briefing scheduled, MiniMax picks up the slack.
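Conceptually, the fallback chain you configure behaves like this. A simplified sketch, not OpenClaw’s actual routing code; the simulated outage is just for illustration:

```python
def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real API call; simulate the primary provider being down.
    if model == "z-ai/glm-5":
        raise ConnectionError("provider outage")
    return f"[{model}] response to: {prompt}"

def route(prompt: str, primary: str, fallbacks: list[str]) -> str:
    # Try the primary first, then each fallback in order.
    for model in [primary, *fallbacks]:
        try:
            return call_model(model, prompt)
        except ConnectionError:
            continue  # move on to the next model in the chain
    raise RuntimeError("all models failed")

print(route("morning briefing", "z-ai/glm-5", ["minimax/m2.5-lightning"]))
```

With GLM-5 unreachable, the request silently lands on MiniMax and your 3 AM briefing still goes out.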

Frequently Asked Questions

Can I use both GLM-5 and MiniMax M2.5 at the same time?

Yes. Set one as primary and the other as fallback in your OpenClaw config. You can also configure different models for different agents or skills.

Will using my Claude subscription with OpenClaw definitely get me banned?

Not guaranteed, but the risk is real and growing. Anthropic actively monitors for automated usage through OAuth tokens. Several users have reported account suspensions. It’s not worth the risk when API access is available.

How much does it cost to run OpenClaw with MiniMax M2.5 for a month?

Depends on usage, but most people spend $7-15/month. Heavy users with lots of scheduled tasks and long conversations might hit $20-25. Still far cheaper than Claude API.

Can I switch models without restarting OpenClaw?

Yes. Use the /model command in your chat to switch models on the fly. For permanent changes, edit the config and run openclaw gateway restart.

Do these models support DuckDuckGo search in OpenClaw?

Yes. Both GLM-5 and MiniMax M2.5 work with OpenClaw’s DuckDuckGo search integration. The search tool provides web results that the model can use to answer questions with current information.

What about using local models with Ollama instead?

Local models are a solid option if you have the hardware. Our OpenClaw with Ollama guide covers hardware tiers, model recommendations (including Nanbeige4.1-3B for budget machines), and how to set up a hybrid config with local primary + API fallback. For most VPS setups, API-based models like GLM-5 and MiniMax M2.5 still perform better than what you can run locally.

At this point, there’s no good reason to risk your Claude or Gemini subscription on an automated tool. GLM-5 handles the complex stuff, MiniMax M2.5 keeps the bill low for everyday use. Set them up, configure your fallbacks, and stop worrying about ban emails.

Whichever model you use, make sure your OpenClaw instance is hardened first. The OpenClaw security guide covers CVE-2026-25253, 40+ patched vulnerabilities, and a step-by-step lockdown procedure.

For the full model breakdown including Kimi K2.5, Qwen-Max, and Devstral 2, check out our best open source LLMs for coding guide. If you’re still deciding between AI assistant platforms, our OpenClaw alternatives article covers the other options. For container-isolated agents with Claude, see our NanoClaw deploy guide. For the smallest possible Zig binary with 22+ providers, see the NullClaw deploy guide. And if you want visibility into what your agent is actually doing — sessions, costs, cron jobs — check out our best OpenClaw dashboards roundup.