Skip to content
Chimera readability score 0.5213 out of 100, reading level.

Liberate your OpenClaw 🦀

If you've been cut off and your OpenClaw, Pi, or Open Code agents need resuscitation, you can move them to open models in two ways:

  • Use an open model served through Hugging Face Inference Providers.
  • Run a fully local open model on your own hardware.

The hosted route is the fastest way back to a capable agent. The local route is the right fit if you want privacy, zero API costs, and full control.

To do so, just tell your claude code, your cursor or your favorite agent: help me move my OpenClaw agents to Hugging Face models, and link this page.

Hugging Face Inference Providers

Hugging Face inference providers is an open platform that routes to providers of open source models. It’s the right choice if you want the best models or you don’t have the necessary hardware.

First, you’ll need to create a token here. Then you can add that token to openclaw

like so:

openclaw onboard --auth-choice huggingface-api-key

Paste your Hugging Face token when prompted, and you’ll be asked to select a model.

We’d recommend GLM-5 because of its excellent Terminal Bench scores, but there are thousands to chose from here.

You can update your Hugging Face model at any time entering its repo_id

in the OpenClaw config:

{

agents: {

defaults: {

model: {

primary: "huggingface/zai-org/GLM-5:fastest"

}

}

}

}

Note: HF PRO subscribers get $2 free credits each month which applies to Inference Providers usage, learn more here.

Local Setup

Running models locally gives you full privacy, zero API costs, and the ability to experiment without rate limits.

Install Llama.cpp, a fully open source library for low resource inference.

on mac or linux

brew install llama.cpp

on windows

winget install llama.cpp

Start a local server with a built-in web UI:

llama-server -hf unsloth/Qwen3.5-35B-A3B-GGUF:UD-Q4_K_XL

Here, we’re using Qwen3.5-35B-A3B, which works great with 32GB of RAM. If you have different requirements, please check out the hardware compatibility for the model you're interested in. There are thousands to choose from.

If you load the GGUF in llama.cpp, use an OpenClaw config like this:

openclaw onboard --non-interactive \

--auth-choice custom-api-key \

--custom-base-url "http://127.0.0.1:8080/v1" \

--custom-model-id "unsloth-qwen3.5-35b-a3b-gguf" \

--custom-api-key "llama.cpp" \

--secret-input-mode plaintext \

--custom-compatibility openai

Verify the server is running and the model is loaded:

curl http://127.0.0.1:8080/v1/models

Which path should you choose?

Use Hugging Face Inference Providers if you want the quickest path back to a capable OpenClaw agent. Use llama.cpp

if you want privacy, full local control, and no API bill.

Either way, you do not need a closed hosted model to get OpenClaw back on its feet!

Facts Only

OpenClaw, Pi, and Open Code agents can be moved to open models via Hugging Face Inference Providers or local hardware.
Hugging Face Inference Providers require creating a token and adding it to OpenClaw using the command `openclaw onboard --auth-choice huggingface-api-key`.
GLM-5 is recommended as a model due to its Terminal Bench scores.
Hugging Face PRO subscribers receive $2 in free credits monthly for Inference Providers.
Local setup involves installing llama.cpp via `brew install llama.cpp` (mac/linux) or `winget install llama.cpp` (Windows).
A local server can be started with `llama-server -hf unsloth/Qwen3.5-35B-A3B-GGUF:UD-Q4KXL`.
Qwen3.5-35B-A3B requires 32GB of RAM for optimal performance.
OpenClaw can be configured to use a local model with the command `openclaw onboard --non-interactive --auth-choice custom-api-key --custom-base-url "http://127.0.0.1:8080/v1" --custom-model-id "unsloth-qwen3.5-35b-a3b-gguf" --custom-api-key "llama.cpp" --secret-input-mode plaintext --custom-compatibility openai`.
The local server status can be verified with `curl http://127.0.0.1:8080/v1/models`
The hosted route prioritizes speed and capability, while the local route emphasizes privacy and cost control.
Both methods eliminate dependency on closed hosted models.

Executive Summary

OpenClaw agents, Pi, or Open Code agents can be migrated to open models through two primary methods: using Hugging Face Inference Providers or running models locally via llama.cpp. The hosted option offers quick access to capable agents, while the local approach provides privacy, cost savings, and full control. For Hugging Face, users create a token, onboard it via OpenClaw, and select a model like GLM-5, which is recommended for its performance. Local setup involves installing llama.cpp, launching a server with a model like Qwen3.5-35B-A3B, and configuring OpenClaw to connect to the local endpoint. The choice depends on priorities: speed and convenience with Hugging Face, or autonomy and privacy with local deployment. Both methods avoid reliance on closed hosted models, emphasizing open-source alternatives.
The article presents clear technical steps for each path, acknowledging trade-offs like hardware requirements for local models or API costs for hosted services. It also notes that Hugging Face PRO subscribers receive monthly credits, reducing costs for inference usage. The guidance is practical, targeting users who may have lost access to their agents and need a straightforward recovery process. The tone is instructional, focusing on empowering users to regain functionality without proprietary constraints.

Full Take

This narrative presents a clear, actionable path for users to regain control of their AI agents by migrating to open-source alternatives. The strongest version of this argument is its emphasis on user autonomy—whether through hosted services for convenience or local deployment for privacy. It avoids hyperbole, focusing on practical steps and trade-offs, which aligns with principled advocacy for open systems.
Pattern scan: The framing avoids manipulation tactics, though it subtly contrasts "closed hosted models" with open alternatives, which could imply a binary choice (ARC-0043 Motte-and-Bailey if taken to extremes). However, the article remains grounded in technical guidance rather than ideological rhetoric. No emotional exploitation or distortion is detected.
Root cause: The underlying paradigm is the tension between centralized control (proprietary models) and decentralized agency (open-source tools). The unstated assumption is that users *should* prefer open models, but the article stops short of coercion, instead offering options.
Implications: For human agency, this empowers users to bypass gatekeepers, but local deployment requires technical literacy and hardware, potentially excluding less-resourced individuals. Second-order consequences include reduced reliance on corporate APIs, which could shift power dynamics in AI development.
Bridge questions: What are the long-term sustainability risks of relying on hosted open models? How might hardware limitations affect equitable access to local AI? Would a hybrid approach (e.g., federated models) better balance convenience and autonomy?
Counterstrike scan: A bad actor might weaponize this narrative to undermine trust in proprietary models without offering viable alternatives, but this article provides concrete solutions. The content aligns with genuine user empowerment, not manipulation.
Patterns detected: ARC-0043 Motte-and-Bailey (minor, in the open/closed framing)