Local CLI, a Hermes Wrapper, or OpenClaw? The Paperclip Adapter Decision Nobody Helps You Make

The adapter is the most consequential Paperclip setting and the least discussed. It decides how much machinery sits between your agent and the model. I wired my fleet all three ways — a bare Copilot CLI, a Hermes kernel wrapping it, and an OpenClaw gateway — and one of them quietly broke and started leaning on another. Here's the honest trade-off, and how to choose.

There’s a poll in the Paperclip discussions that asks the only adapter question most people think to ask: which one gives the best balance of cost, quality, reliability, and speed? It’s the wrong axis. The adapter isn’t really a choice between models — it’s a choice about how much machinery sits between your agent and the model. And that decision quietly determines how your whole fleet fails.

This is Part III of the Paperclip series. Part I was the wiring; Part II was goals and routines. This one is the layer underneath both: the adapter — the thing that turns a heartbeat into a model call. I run my fleet three different ways — a bare GitHub Copilot CLI, a Hermes kernel wrapping that CLI, and an OpenClaw gateway in front of the whole thing — and I’m going to tell you what each one actually costs, including the part where one of them broke and started leaning on another.

The mental model: the adapter is a brain stem

Every Paperclip heartbeat has to reach a model somehow. The adapter is that path — the brain stem between the company’s nervous system (issues, goals, routines) and the actual intelligence. Paperclip ships a stack of them: claude_local, codex_local, opencode_local, hermes_local, openclaw_gateway, a generic process, an http endpoint, and more.

Strip away the names and there are really three postures, each adding one more layer between agent and model:

Posture 1 — BARE CLI        agent → local CLI → model
Posture 2 — WRAPPED CLI     agent → kernel (Hermes) → CLI → model
Posture 3 — GATEWAY         agent → gateway (OpenClaw) → … → model

Every layer you add buys you something and costs you something. The entire decision is: which layers earn their keep for you.

One thing to be clear about before we start, because the numbering invites it: these postures are ordered by how much machinery sits between agent and model — not by quality. A lower number is not a lesser choice. Posture 1 is the one I run most, and the one most people should reach for first; “higher in the stack” means more moving parts, not more value. With that straight, let me take them one at a time, honestly.

Paperclip's "Add a new agent" dialog, which frames the same choice: configure a local runtime yourself, or invite an external agent — "OpenClaw, Hermes, or any agent that can call the invite API." The adapter you pick here is the brain stem for everything that agent does.

Posture 1 — The bare local CLI (start here)

The simplest path: point each agent straight at a CLI that’s already installed and authenticated on the host. For me that CLI is the GitHub Copilot CLI, and — this is the part Part II under-sold — I don’t run it one-shot. I run it in ACP mode (Agent Client Protocol), kept persistent, so the agent holds a live session across heartbeats instead of cold-starting every time:

{
  "adapterType": "acpx_local",
  "adapterConfig": {
    "agentCommand": "copilot --acp",   // Copilot CLI speaking ACP
    "mode": "persistent",               // hold the session across heartbeats
    "permissionMode": "approve-all",    // non-interactive: never block on prompts
    "cwd": "/path/to/the/repo",         // the repo this agent works in
    "timeoutSec": 1800                  // circuit breaker per heartbeat
  }
}

(If your CLI doesn’t speak ACP, the generic process adapter running copilot -p "{{prompt}}" is the one-shot equivalent — simpler, but it pays a cold start every heartbeat.)

One of my agents' Configuration tab: adapter type set to ACPX (local), the working directory pointed at its repo, and the persona's runtime line spelling it out — "Runtime: GitHub Copilot CLI on claude-opus-4.8." This is Posture 1 in the flesh: the agent talks straight to a local CLI, nothing in between.

What this posture buys you: the fewest moving parts in existence. There is no extra daemon, no gateway, nothing to monitor except the CLI itself. Latency is as low as it gets, and persistent ACP gives you session continuity for free. One auth — whatever you logged the CLI into — and you’re running.

What it costs you: the model is whatever that CLI is logged into, full stop. There’s no shared coordination across agents — each one is an island. And that island problem bites at scale exactly the way one operator described running 37 agents: when you want to move the fleet from one backend to another, you’re toggling every agent by hand, because the adapter config lives per-agent. They literally built an internal agent whose only job was to flip the other agents’ adapters. That’s the smell that tells you you’ve outgrown Posture 1.

Choose it when: you’re a single operator on one machine, or you’re just starting. This is where everyone should begin, and where many people should stay.

Posture 2 — Wrap the CLI in a kernel (Hermes)

The next layer routes every agent through one always-on kernel that owns the model session. In my rig that kernel is Hermes; Paperclip talks to it through a fleet adapter (the built-in equivalent is hermes_local, which speaks to a local Hermes via its adapter package). The agents stop each holding their own CLI login — they all defer to the kernel, which wraps a single Copilot-backed model and hands work back.

What this layer buys you — and it’s a lot:

One switch point for the whole fleet. Change the model, the auth, or the backend once, on the kernel, and all 27 agents follow. This is the direct cure for the 37-agents-toggled-by-hand pain — the thing people are asking for when they request a “server default adapter.”
Cooperative rate-limiting and account rotation, centralised. The community keeps rebuilding this per-tool — there’s a whole router for rotating between multiple Codex accounts that skips ones that hit their limit. A kernel does it once, for every agent, instead of N times.
Shared memory, delegation, and scheduling. Because every call flows through one place, the kernel can give agents durable memory tiers, let them delegate to sub-agents, and run cron — none of which a bare CLI can coordinate.

What it costs you: an always-on component you now have to keep alive and watch. If the kernel is down, the whole fleet stalls — you’ve traded N independent islands for one shared single point of failure. That’s a real trade, not a free upgrade. It’s only worth it once the coordination is worth more than the fragility.

Choose it when: you run many agents, you want one control plane for model/auth/spend, and you already operate a kernel that does memory and delegation. (This is the layer I lean on most — it’s the same kernel from my AI operating system.)

Posture 3 — Put a gateway in front (OpenClaw)

The third layer connects Paperclip agents to an OpenClaw gateway via the openclaw_gateway adapter. The draw is the surface: OpenClaw brings multi-channel chat (Slack, WhatsApp), a fleet of named persona agents, and — if you’re me — a 3D office you can watch your agents work in. As an experience, it’s the richest of the three.

What it costs you — and I have to be blunt, because the discussions are: this is the most fragile path by a wide margin. Connecting OpenClaw is the thread where someone spent an entire day on a hack because the gateway card was hard-coded as comingSoon and greyed out in the served UI bundle. It’s the thread where the adapter throws invalid agent params: unexpected property 'paperclip', the fix sits merged-but-unreleased, and frustrated users are writing their own bridge adapters to route around it. When a capability needs a community-maintained bridge to function, that tells you exactly where it sits on the reliability curve.

And here’s the honest kicker from my own setup: OpenClaw’s direct Copilot auth expired, so my OpenClaw fleet now routes its model calls through Hermes anyway. I have a gateway in front of a wrapper in front of a CLI — three layers deep — and the top layer can’t reach the model on its own. It works, the office is genuinely useful, but I’d be lying if I called it lean.

Choose it when: you specifically want OpenClaw’s channels, personas, or office as a product surface, and you accept the wiring tax to get them. Do not reach for it just to “run agents” — Postures 1 and 2 do that with far less to break.

The decision, on one screen

	Bare CLI	Hermes wrapper	OpenClaw gateway
Moving parts	Fewest	+1 kernel	+1 gateway (+kernel)
Latency	Lowest	Low	Highest
Fleet-wide model/auth switch	❌ per-agent	✅ one place	✅ via kernel
Shared memory / delegation / cron	❌	✅	✅
Rate-limit / multi-account coordination	❌	✅ central	✅ via kernel
Multi-channel chat / office	❌	❌	✅
Fragility / wiring tax	Lowest	Medium	Highest
Best for	One operator, start here	Many agents, one control plane	You want the OpenClaw surface

The pattern is almost monotonic: each layer buys coordination and reach, and charges you in moving parts and fragility. There is no “best adapter” — there’s only the lowest layer that does what you actually need.

What I actually run, and why

To be concrete about my own fleet, because abstract advice is cheap:

The workhorse is Posture 1 — the Copilot CLI in persistent ACP mode (acpx_local). Most agents, most of the time, talk to the model through that, because it’s the least that can break.
Hermes sits underneath as the kernel for the agents that need shared memory, delegation, and cron — and as the single place I rotate auth and switch models. That layer earns its keep daily.
OpenClaw is only there for its surface — the office and the channels — and, as confessed, it leans on Hermes for the model. I keep it because I use the office, not because it’s load-bearing for execution.

The layering is deliberate, but the rule behind it is strict: a layer that doesn’t earn its keep gets removed. OpenClaw is on probation precisely because it’s the heaviest layer doing the least execution work.

What I’d tell you to steal

Start bare. Point your agents at one local CLI you’ve already authenticated (for me, copilot --acp in persistent mode). Ship something before you add infrastructure.
Add a kernel when you feel the toggle. The moment you’re editing more than a handful of agents by hand to switch model, auth, or backend — that’s the 37-agents signal. A Hermes-style wrapper turns N edits into one, and gives you shared memory and rate-limit coordination as a bonus.
Add a gateway only for its surface. Reach for OpenClaw when you specifically want channels or an office — and go in knowing the wiring is the most fragile part of the stack, not a casual upgrade.
Make every layer earn its keep. The number of things between your agent and the model is a liability you pay for on every heartbeat and every outage. Add a layer only when the coordination it buys is worth more than the failure mode it introduces — and rip it out the day that stops being true.

This closes the Paperclip configuration trilogy: the wiring, the goals and routines that keep it shipping, and the adapter layer that decides how it reaches a model. If there’s a Part IV, it’ll be about what breaks once all of this is running in anger.