Best Open Source Models for OpenClaw

GLM-5 and MiniMax M2.5 are the best open source models for running OpenClaw. This guide covers setup, pricing, risks of using Claude Code or Gemini CLI subscriptions, and why these two models beat the rest.

I’ve been running OpenClaw for a while now and tried a bunch of different models with it. After swapping between providers and watching my API bills, two models stood out: GLM-5 and MiniMax M2.5.

Below I’ll go through why I landed on these two, how to set them up, and why routing your Claude Code or Gemini CLI subscription through OpenClaw is a bad idea.

Subscription Risk Warning

Using your Claude Code, Gemini CLI, or ChatGPT/Codex subscription OAuth tokens with OpenClaw can get your account banned. Anthropic, Google, and OpenAI monitor for automated usage patterns that fall outside normal CLI use. Stick with API keys instead.

Why Open Source Models Make Sense for OpenClaw

OpenClaw runs 24/7 on your server. It handles messages, runs scheduled jobs, and executes skills nonstop. That kind of continuous usage gets expensive with proprietary models.

Open source models through API providers give you:

  • Predictable costs: Pay per token, no surprise subscription overages
  • No ban risk: API access is designed for automated use
  • Model flexibility: Swap models anytime through config changes
  • Better rate limits: API tiers usually offer higher throughput than subscription OAuth

If you’re new to OpenClaw, our setup guide walks through the full installation process. For a broader look at alternatives, check out our OpenClaw alternatives roundup.

The Risks of Using Claude Code or Gemini CLI Subscriptions

Let me get this out of the way first because I see people asking about it constantly.

OpenClaw supports OAuth tokens from Claude Code, Gemini CLI, and OpenAI Codex. This means you can technically use your $20/month Claude Pro subscription or your Google AI subscription to power OpenClaw instead of paying for API credits. It works. But you’re playing with fire.

Why you can get banned

Anthropic, Google, and OpenAI all have terms of service that restrict how their subscription tokens can be used. When you route a Claude Code OAuth token through OpenClaw, here’s what happens:

  • Usage patterns change: Normal Claude Code use looks like a developer typing in a terminal. OpenClaw sends automated requests around the clock, often in bursts when scheduled jobs fire
  • Token volume spikes: An always-on assistant burns through more tokens than a human coding session
  • Rate limit abuse: OpenClaw may retry failed requests or send parallel calls that look like scraping
  • IP and fingerprinting: Requests from a VPS data center look different from a laptop on residential internet

What happens when you get banned

| Platform | Consequence | Recovery |
| --- | --- | --- |
| Claude Code | Account suspended, subscription cancelled | Appeal process exists but no guarantee |
| Gemini CLI | Google account flagged, API access revoked | Could affect other Google services |
| OpenAI Codex | Account banned, subscription terminated | Limited appeal options |

Real Risk

People have had their accounts suspended within days of routing subscription tokens through automated tools. The detection is getting better, not worse. Your Claude Max subscription is not worth losing over $15 in saved API costs.

What to do instead

Use API keys. That’s it. Every provider offers pay-as-you-go API access that’s designed for automated usage.

Or better yet, use open source models where this isn’t even a concern. Which brings us to the actual recommendations.

1. GLM-5: The one I use for serious work

GLM-5 from Z.AI is the strongest open source model I’ve tested. It reasons well, barely hallucinates, and can actually handle multi-step agent tasks without falling apart halfway through.

Why GLM-5 works for OpenClaw

OpenClaw isn’t just a chatbot. It plans tasks, calls tools, runs scripts, and keeps context across long conversations. GLM-5 handles that kind of work reliably.

  • 95.8% SWE-bench Verified: The highest coding score among open source models, which matters when OpenClaw runs code on your server
  • Near-zero hallucinations: Scores -1 on AA-Omniscience Index. When your assistant runs commands on a live server, you want it to be right
  • Agentic task handling: ELO 1,412 on GDPval-AA — only Claude Opus 4.6 and GPT-5.2 score higher
  • 200K context window: Enough for OpenClaw’s conversation history plus tool outputs
  • MIT license: No usage restrictions, fully compatible with automated workflows

Technical specs

| Feature | GLM-5 |
| --- | --- |
| Total Parameters | 744B (MoE) |
| Active Parameters | 40B |
| Context Length | 200K tokens |
| Architecture | MoE with Sparse Attention |
| Input Cost | $0.80/M tokens |
| Output Cost | $2.56/M tokens |
| License | MIT |

Setting up GLM-5 with OpenClaw

Edit your OpenClaw config to use GLM-5:

```
openclaw configure --section models
```

Or edit the config file directly:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "z-ai/glm-5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    }
  }
}
```

Then restart the gateway:

```
openclaw gateway restart
```

GLM-5 is available through OpenRouter and directly through the Z.AI API. For OpenClaw, I’d recommend using the Z.AI coding plans since they’re priced for developer workloads.
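If you want to sanity-check your API key outside OpenClaw first, here’s a minimal sketch of a direct OpenRouter request. It uses OpenRouter’s standard OpenAI-style chat-completions endpoint; the model slug matches the config above, and the script only sends if `OPENROUTER_API_KEY` is set:

```python
import json
import os
import urllib.request

# Build a standard chat-completions payload for GLM-5 via OpenRouter.
payload = {
    "model": "z-ai/glm-5",
    "messages": [{"role": "user", "content": "Reply with the word: ready"}],
}

key = os.environ.get("OPENROUTER_API_KEY")
if key:
    # Send the request only when a key is available.
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
else:
    print("Set OPENROUTER_API_KEY to send; payload:", json.dumps(payload))
```

If this round-trips, OpenClaw’s gateway should be able to reach the model with the same credentials.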

GLM Coding Plans

Z.AI offers GLM Coding Plans with pricing designed for developers running continuous workloads like OpenClaw.

Where GLM-5 shines in OpenClaw

  • Scheduled tasks: Morning briefings, server monitoring, cron jobs that need to actually work
  • Coding skills: When you tell OpenClaw to run scripts or manage Docker containers, GLM-5’s coding accuracy matters
  • Research tasks: Pairs well with DuckDuckGo search integration for pulling and summarizing web results
  • Multi-step workflows: Complex skills that chain multiple tool calls together
  • Long conversations: Keeps context across extended sessions without losing the thread

2. MiniMax M2.5: The cheap one that’s surprisingly good

MiniMax M2.5 costs a fraction of GLM-5 and still scores 80.2% on SWE-bench Verified. For an always-on assistant, that price difference matters a lot over a month.

Why MiniMax M2.5 works for OpenClaw

Most OpenClaw interactions don’t need the absolute best model. Quick questions, reminders, file management, simple research. A cheaper model handles these fine. MiniMax M2.5 is cheap enough that you forget it’s costing anything.

  • 80.2% SWE-bench Verified: Beats Claude Opus 4.6 on Droid scaffold, more than enough for OpenClaw tasks
  • $0.15/M input tokens: The cheapest frontier model available. Running OpenClaw for a month might cost a few dollars
  • Two speed tiers: M2.5 at 50 tokens/second and Lightning at 100 tokens/second
  • 10+ programming languages: Go, C, C++, TypeScript, Rust, Python, Java, and more
  • Agent framework support: Works with Claude Code, Droid, Cline, and similar agent tools

Technical specs

| Feature | MiniMax M2.5 |
| --- | --- |
| Architecture | Mixture-of-Experts (MoE) |
| Context Length | 200K tokens |
| M2.5 Input Cost | $0.15/M tokens (50 TPS) |
| M2.5 Output Cost | $1.20/M tokens (50 TPS) |
| Lightning Input Cost | $0.30/M tokens (100 TPS) |
| Lightning Output Cost | $2.40/M tokens (100 TPS) |

Setting up MiniMax M2.5 with OpenClaw

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/m2.5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    }
  }
}
```

For faster responses (good for interactive chat), use the Lightning variant:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/m2.5-lightning"
      }
    }
  }
}
```

Cost Breakdown

Running MiniMax M2.5 at 50 TPS continuously for an hour costs about $0.30. That means running OpenClaw 24/7 for a full month on MiniMax M2.5 would cost roughly $7-15 depending on actual usage. Compare that to $50-150/month with Claude API.
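The $0.30/hour figure is easy to verify from the pricing table above. A quick back-of-envelope calculation on output tokens (which dominate cost during sustained generation):

```python
# Sanity-check the ~$0.30/hour figure for MiniMax M2.5 at 50 TPS.
TPS = 50                      # sustained output tokens per second
OUTPUT_PRICE_PER_M = 1.20     # USD per million output tokens

tokens_per_hour = TPS * 3600  # 180,000 output tokens in one hour
output_cost = tokens_per_hour / 1e6 * OUTPUT_PRICE_PER_M

print(round(output_cost, 3))  # 0.216
```

Output tokens alone come to about $0.22/hour; input tokens for the prompt and context push the practical total toward $0.30.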

Where MiniMax M2.5 shines in OpenClaw

  • Always-on chat: Cheap enough that you don’t think twice about sending messages
  • Quick tasks: Reminders, file management, simple lookups
  • Coding assistance: 80.2% SWE-bench means it handles most coding skills reliably
  • Scheduled jobs: Morning briefings, server checks, cron tasks
  • Fallback model: Works well as a secondary model when your primary hits rate limits

GLM-5 vs MiniMax M2.5: head to head

Here’s how they compare for OpenClaw use:

| Feature | GLM-5 | MiniMax M2.5 |
| --- | --- | --- |
| SWE-bench Verified | 95.8% | 80.2% |
| Input Cost | $0.80/M tokens | $0.15/M tokens |
| Output Cost | $2.56/M tokens | $1.20/M tokens |
| Context Length | 200K | 200K |
| Hallucination Rate | Near-zero | Low |
| Speed | Standard | 50-100 TPS |
| Best For | Complex tasks, coding, research | Always-on chat, budget usage |
| Monthly Cost (estimated) | $30-60 | $7-15 |

What I actually run

Use both. Set GLM-5 as primary and MiniMax M2.5-Lightning as fallback:

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "z-ai/glm-5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    }
  }
}
```

GLM-5 does the hard stuff, MiniMax catches anything that fails or gets rate-limited. For simpler skills and scheduled tasks, you can also point specific agents at MiniMax directly to save money.
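The config examples above only show the `agents.defaults` block, so treat the per-agent override below as a hypothetical illustration of that idea: the `briefing` agent name and the `agents.briefing` key are assumptions, not documented schema — check your OpenClaw version’s config reference for the exact shape.

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "z-ai/glm-5",
        "fallback": ["minimax/m2.5-lightning"]
      }
    },
    "briefing": {
      "model": {
        "primary": "minimax/m2.5"
      }
    }
  }
}
```

The intent: expensive reasoning stays on GLM-5 by default, while a cheap scheduled task like a morning briefing runs on MiniMax.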

Setting up DuckDuckGo search with your models

Both GLM-5 and MiniMax M2.5 work with OpenClaw’s web search. If you haven’t set that up yet, our DuckDuckGo OpenClaw search guide walks through it.

Web search matters more than you’d think for an always-on assistant. Without it, OpenClaw is limited to whatever the model learned during training. With it, OpenClaw can grab current information when answering questions or running research tasks.

Cost comparison: open source vs subscriptions

Here’s what you’d actually spend per month running OpenClaw with different approaches:

| Approach | Monthly Cost | Ban Risk | Notes |
| --- | --- | --- | --- |
| Claude Code OAuth | $20-200 (subscription) | High | Terms violation, account suspension risk |
| Gemini CLI OAuth | $0-20 (subscription) | High | Could affect your Google account |
| Claude API (Sonnet 4.5) | $50-150 | None | Expensive for 24/7 use |
| GLM-5 API | $30-60 | None | Best performance, reasonable cost |
| MiniMax M2.5 API | $7-15 | None | Best value, strong performance |
| GLM-5 + M2.5 combo | $20-40 | None | What I run |

The API route costs less than a Claude subscription and you don’t risk losing your account. For most OpenClaw setups, the GLM-5 and MiniMax M2.5 combo is the obvious pick.

Tips from running this setup

A few things I learned the hard way:

Prompt tuning

GLM-5 and MiniMax M2.5 respond differently to the same prompt. GLM-5 does better with clear, structured instructions. MiniMax M2.5 handles casual requests fine but needs explicit formatting instructions when you want structured output.

Edit your ~/.openclaw/workspace/SOUL.md to include model-specific guidance:

```markdown
When executing tasks:
- Break complex requests into clear steps
- Confirm before running destructive commands
- Use structured output for research results
```

Context management

Both models support 200K tokens, but shorter conversations give better results. Use /compact when things get long, or /new to start fresh.

Monitoring costs

Watch your API usage for the first week. OpenClaw’s scheduled jobs and background tasks burn more tokens than you’d guess, so check your provider dashboards regularly.

Fallback configuration

Always set a fallback model. API providers go down. If GLM-5 is out at 3 AM and you have a morning briefing scheduled, MiniMax picks up the slack.
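Conceptually, the fallback chain you configure behaves like this. A simplified sketch, not OpenClaw’s actual routing code; the simulated outage is just for illustration:

```python
def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real API call; simulate the primary provider being down.
    if model == "z-ai/glm-5":
        raise ConnectionError("provider outage")
    return f"[{model}] response to: {prompt}"

def route(prompt: str, primary: str, fallbacks: list[str]) -> str:
    # Try the primary first, then each fallback in order.
    for model in [primary, *fallbacks]:
        try:
            return call_model(model, prompt)
        except ConnectionError:
            continue  # move on to the next model in the chain
    raise RuntimeError("all models failed")

print(route("morning briefing", "z-ai/glm-5", ["minimax/m2.5-lightning"]))
```

With GLM-5 unreachable, the request silently lands on MiniMax and your 3 AM briefing still goes out.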

Frequently Asked Questions

Can I use both GLM-5 and MiniMax M2.5 at the same time?

Yes. Set one as primary and the other as fallback in your OpenClaw config. You can also configure different models for different agents or skills.

Will using my Claude subscription with OpenClaw definitely get me banned?

Not guaranteed, but the risk is real and growing. Anthropic actively monitors for automated usage through OAuth tokens. Several users have reported account suspensions. It’s not worth the risk when API access is available.

How much does it cost to run OpenClaw with MiniMax M2.5 for a month?

Depends on usage, but most people spend $7-15/month. Heavy users with lots of scheduled tasks and long conversations might hit $20-25. Still far cheaper than Claude API.

Can I switch models without restarting OpenClaw?

Yes. Use the /model command in your chat to switch models on the fly. For permanent changes, edit the config and run openclaw gateway restart.

Do these models support DuckDuckGo search in OpenClaw?

Yes. Both GLM-5 and MiniMax M2.5 work with OpenClaw’s DuckDuckGo search integration. The search tool provides web results that the model can use to answer questions with current information.

What about using local models with Ollama instead?

Local models are a solid option if you have the hardware. Our OpenClaw with Ollama guide covers hardware tiers, model recommendations (including Nanbeige4.1-3B for budget machines), and how to set up a hybrid config with local primary + API fallback. For most VPS setups, API-based models like GLM-5 and MiniMax M2.5 still perform better than what you can run locally.

At this point, there’s no good reason to risk your Claude or Gemini subscription on an automated tool. GLM-5 handles the complex stuff, MiniMax M2.5 keeps the bill low for everyday use. Set them up, configure your fallbacks, and stop worrying about ban emails.

Whichever model you use, make sure your OpenClaw instance is hardened first. The OpenClaw security guide covers CVE-2026-25253, 40+ patched vulnerabilities, and a step-by-step lockdown procedure.

For the full model breakdown including Kimi K2.5, Qwen-Max, and Devstral 2, check out our best open source LLMs for coding guide. If you’re still deciding between AI assistant platforms, our OpenClaw alternatives article covers the other options. For container-isolated agents with Claude, see our NanoClaw deploy guide. For the smallest possible Zig binary with 22+ providers, see the NullClaw deploy guide. And if you want visibility into what your agent is actually doing — sessions, costs, cron jobs — check out our best OpenClaw dashboards roundup.