
OpenCastor Is an Agent Harness

19 min read By Craig Merry
OpenCastor Robotics AI Agents Architecture Technical AI Safety Autonomous Weapons Politics

Thariq Shihipar, an engineer on the Claude Code team at Anthropic, just published a thread called Lessons from Building Claude Code: Seeing Like an Agent.

It’s a detailed, honest breakdown of the design decisions behind one of the most interesting developer tools Anthropic has built. Worth reading in full.

The central thesis: the hardest part of building an agent harness is constructing its action space. Not the LLM. Not the prompt. The tools — the shape of the interface between what the model can reason about and what the world lets it do.

As I read through it, I kept recognizing things. Not because OpenCastor is like Claude Code — it isn’t, and the distinction matters — but because we hit the same class of design problems. Action space construction. Progressive disclosure. Elicitation across surfaces. Evolving tooling as model capability grows.

The solutions we arrived at are structurally similar. But they live in different consequence spaces, and that changes everything about how they’re built.


Both Are Agent Harnesses. That’s Where the Similarity Starts.

An agent harness is the layer between a language model and the world it acts in. It defines what the agent can do, what context it receives, how it communicates, and — critically — how it’s constrained.

Claude Code is a harness for software engineering tasks. Its primary execution environment is digital: files, code, shell processes, APIs, version control. The consequence of a wrong action is almost always recoverable. You revert the file. You re-run the command. You fix the bug. Even a bad rm can be undone if you catch it quickly enough. The execution environment is fundamentally sandboxed.

OpenCastor is a harness for embodied AI — robots that operate in the physical world. Its execution environment spans both digital (REST APIs, messaging channels, webhooks) and physical (motors, servo controllers, cameras, ultrasonic sensors, LiDAR). And here’s the part that changes everything: physical actions are frequently irreversible. You can’t un-tip a wheelchair. You can’t un-knock something off a shelf. A servo without torque limits will keep trying until it strips a gear.

This distinction — not “filesystem vs. robots,” but recoverable vs. irreversible consequences — cascades through every single design decision in OpenCastor. It’s why OpenCastor has a safety kernel. It’s why hardware commands route through a verified protocol layer before reaching actuators. It’s why the reactive layer exists at all.

With that framing in place, here’s how the architectures map.


1. The Action Space Problem → RCAN + ToolRegistry

Thariq’s math problem analogy is exactly right: the tools you give an agent should be shaped to its actual reasoning ability, not to what’s technically possible. Paper vs. calculator vs. computer — each is more powerful, but each requires more knowledge to use correctly.

In Claude Code, the tools are things like Read, Write, Bash, Grep — surgical, bounded, named to match how the model thinks about software tasks.

In OpenCastor, the action space is defined by two explicit layers that serve different purposes.

RCAN (Robot Communication and Addressing Network) is a messaging protocol — not just a YAML config. Every hardware command the model can issue goes through an RCAN message that carries a target RURI (the robot’s addressable identifier), a declared scope, a TTL, and a role check. The router validates each message against a 5-tier role hierarchy before anything reaches hardware:

GUEST   → Read-only status, no actuation
USER    → Basic teleoperation, chat
LEASEE  → Full control, config reads
OWNER   → Config writes, training, provider switching
CREATOR → Safety overrides, firmware, full access

A nav command requires at minimum LEASEE scope; basic teleop is available from USER up; STATUS reads can be done by GUEST. Safety messages skip the queue entirely — they’re treated as priority interrupts. This means the model literally cannot issue certain commands regardless of what it reasons — the protocol layer rejects them before they reach hardware.

ToolRegistry sits above RCAN. It’s the model-facing vocabulary layer — what functions the LLM can actually call by name:

reg.register("get_distance", lambda: ultrasonic.read_cm())
reg.register("take_snapshot", camera.capture_jpeg)
reg.register("announce_text", lambda text: tts.speak(text))

schema = reg.to_openai_tools()  # → OpenAI-format tool list for the LLM

The separation matters. RCAN is the hardware contract — a versioned, protocol-level specification of what the robot supports. ToolRegistry is the model vocabulary — the interface shaped to how the LLM actually reasons. They’re not the same thing. You can update the model-facing vocabulary without touching the hardware protocol. You can swap providers and the RCAN layer doesn’t care.
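A minimal sketch of what such a registry might look like. The internals here are assumed, not the project's implementation; only `register` and `to_openai_tools` appear in the snippet above.

```python
from typing import Any, Callable

class ToolRegistry:
    """Minimal sketch: name -> callable, exported as an OpenAI-format tool list."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}
        self._descriptions: dict[str, str] = {}

    def register(self, name: str, fn: Callable[..., Any], description: str = "") -> None:
        self._tools[name] = fn
        self._descriptions[name] = description or (fn.__doc__ or "")

    def call(self, name: str, **kwargs: Any) -> Any:
        # The harness, not the model, resolves a tool name to a function
        return self._tools[name](**kwargs)

    def to_openai_tools(self) -> list[dict]:
        # Enough schema for the model to call tools by name
        return [
            {"type": "function",
             "function": {"name": name, "description": self._descriptions[name]}}
            for name in self._tools
        ]
```

Because the model only ever sees the exported schema, renaming or reshaping the vocabulary is a registry change, never a hardware-protocol change.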

This is the same lesson Anthropic learned with Claude Code’s tool design: the right tool isn’t the most capable one. It’s the one the model calls correctly and reliably.


2. Progressive Disclosure → Tiered Brain

Thariq describes progressive disclosure as a way to add context incrementally — don’t dump everything in the system prompt, let the agent explore and find what it needs. Claude Code formalizes this with skill files that can reference other files recursively.

OpenCastor applies the same principle to decision routing, not just context.

Layer 0 / Reactive (<1ms): Pure rule engine. No LLM. Obstacle within 30cm? Stop. Camera frame is all zeros (null-padded sensor)? Skip to next tick. Emergency signal? Halt all actuators. This layer runs on every single control tick and cannot be bypassed by the model. It handles roughly 80% of all decisions at zero API cost.

Layer 1 / Fast Brain (~500ms): A vision-capable model (Gemini Flash, HuggingFace hosted, or local Ollama) processes the camera frame and produces a JSON action. It handles routine navigation and standard Q&A. This layer produces a Thought object with an attached action and a confidence score.

Layer 2 / Planner (~12s): Claude Opus or equivalent. Reserved for complex multi-step reasoning, novel situations, long-range planning, conversation. It fires in two cases: periodically (every N ticks, configurable via planner_interval, default 10), or when the fast brain signals uncertainty.

The escalation signal is explicit. If the fast brain’s confidence score falls below uncertainty_threshold (default 0.3), the harness escalates automatically:

should_plan = False

if self.planner and not thought.action:
    # Fast brain produced nothing — escalate
    logger.info("Planner: escalation (fast brain produced no action)")
    should_plan = True

if thought.confidence < self.uncertainty_threshold:
    # Fast brain uncertain — escalate
    should_plan = True

The model at each layer only receives the context appropriate for its reasoning scope. The reactive layer never sees a prompt. The fast brain gets the current frame and a brief instruction. The planner gets accumulated context, prior actions, and a full strategic instruction. Each layer is scoped to what it needs — nothing more.

What’s interesting here is that this isn’t just about cost optimization (though that’s real: the planner costs orders of magnitude more per call than the fast brain). It’s about semantic appropriateness. You wouldn’t ask Claude Opus to decide whether to stop for an obstacle — the question doesn’t warrant it, and the latency would make the robot unsafe. You also wouldn’t trust a fast classification model to plan a three-step navigation sequence through an unfamiliar environment.

Different decisions belong at different cognitive layers. The harness enforces that.


3. Capability Evolution → Sisyphus Loop

This is Thariq’s best observation, and the one most engineers will underweight:

“As model capabilities increase, the tools that your models once needed might now be constraining them.”

The example: Claude Code’s TodoWrite tool kept early Claude on track by injecting reminders every five turns. As models got better, those reminders became a constraint — Claude would rigidly follow the list instead of adapting to new information. They replaced TodoWrite with the Task Tool, designed for multi-agent coordination rather than model babysitting.

This is a genuine design problem. The scaffolding you build for a weaker model becomes technical debt when the model improves. The harness has to evolve.

OpenCastor’s answer is to automate that evolution. After each episode, the Sisyphus Loop runs:

Episode → PM Agent (Analyze) → Dev Agent (Patch) → QA Agent (Verify) → Apply

The PM agent reviews what happened: what failed, what succeeded, what took longer than expected, what safety rules fired. The Dev agent generates config patches — not code rewrites, but targeted changes to prompts, thresholds, and behavior rules. The QA agent runs the patch against a simulation harness. If it passes, the patch applies — either automatically, or queued for human review depending on your approval settings.
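The loop might be wired like this, with the three agents injected as plain callables. Everything here except the stage names (PM analyze, Dev patch, QA verify, apply) is an assumption for illustration; in OpenCastor the stages are LLM-backed agents.

```python
from dataclasses import dataclass

@dataclass
class Patch:
    # Targeted config changes (prompts, thresholds, behavior rules), not code rewrites
    changes: dict[str, float]
    rationale: str

def sisyphus_cycle(episode: dict, analyze, generate_patch, verify, apply) -> bool:
    """One post-episode pass: PM -> Dev -> QA -> apply.
    Returns True if a patch was applied."""
    findings = analyze(episode)        # PM: what failed, what safety rules fired
    patch = generate_patch(findings)   # Dev: propose threshold/prompt tweaks
    if patch is None:
        return False                   # nothing worth changing
    if not verify(patch):              # QA: run against the simulation harness
        return False                   # patch rejected, config untouched
    apply(patch)                       # passed: apply, or queue for human review
    return True
```

The gate ordering is the point: a patch that fails simulation never touches the live config, no matter how plausible its rationale reads.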

The ALMA consolidation layer does this at the behavioral level: scoring action sequences across episodes, identifying which patterns actually work in which environments, promoting high-confidence patterns into the knowledge base, and — eventually — pruning stale ones.

This doesn’t fully solve the problem Thariq describes. You still need human judgment about when a capability has outgrown its scaffolding. But it closes the feedback loop significantly. Instead of manually reviewing outputs to notice that the planner fires too often on simple tasks, the system tells you — and suggests a new planner_interval or a lower uncertainty_threshold.


4. Elicitation → Channel Layer + Surface-Aware Prompting

The AskUserQuestion story in the thread is one of the most instructive parts.

Three attempts:

  1. Add a questions parameter to the ExitPlanTool — confused Claude, ambiguous state
  2. Ask for structured markdown output and parse it — Claude would freestyle, add sentences, break the format
  3. A dedicated tool that blocked the agent loop until the user answered — reliable, structured, Claude liked calling it

The core insight: surface matters. A free-text question and a modal with selectable options carry the same information but are completely different interaction patterns. The tool has to match the surface it’s operating on.

OpenCastor has a harder version of this problem. The same brain — same model, same config, same robot — can receive commands from WhatsApp, Telegram, Discord, a voice interface, and a terminal REPL simultaneously. Each surface has completely different conventions, different user expectations, and different formatting rules.

A user on WhatsApp sending “go to the kitchen” shouldn’t see:

{"type": "move", "linear_x": 0.5, "angular_z": 0.0, "reason": "navigating to kitchen"}

That’s the internal action format the brain produces. It’s correct. It’s also terrible UX for a messaging surface.

The solution is build_messaging_prompt() — a surface-aware pre-prompt injected into every inference call. The function takes surface, hardware, capabilities, and sensor_snapshot as parameters and builds a different communication contract for each:

surface_note = {
    "whatsapp":  "You are messaging via WhatsApp. Natural language only. No JSON.",
    "voice":     "You are speaking aloud. Short sentences. No lists.",
    "terminal":  "You are in a terminal. Verbose output is fine.",
    "dashboard": "You are in a monitoring UI. Structured output preferred.",
}.get(surface, "You are communicating with a human operator.")

The same brain, different surface contract. It also injects live hardware status — if the motor driver is offline, the model is told explicitly (“never pretend hardware works if it’s in mock mode”). If the OAK-D depth camera is available, the model’s available command vocabulary expands to include spatial navigation commands.

Elicitation here isn’t just about asking good questions. It’s about building the right communication model for each surface — and gating capabilities to what the hardware can actually deliver at that moment.
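Assembled into a full function, the contract might look like this. The parameter names (`surface`, `hardware`, `capabilities`, `sensor_snapshot`) and the surface notes match the article; the body is a reconstruction, not the actual `build_messaging_prompt`.

```python
def build_messaging_prompt(surface: str, hardware: dict[str, bool],
                           capabilities: list[str],
                           sensor_snapshot: dict[str, object]) -> str:
    """Sketch of a surface-aware pre-prompt injected into every inference call."""
    surface_note = {
        "whatsapp":  "You are messaging via WhatsApp. Natural language only. No JSON.",
        "voice":     "You are speaking aloud. Short sentences. No lists.",
        "terminal":  "You are in a terminal. Verbose output is fine.",
        "dashboard": "You are in a monitoring UI. Structured output preferred.",
    }.get(surface, "You are communicating with a human operator.")

    # Live hardware status: never let the model pretend offline hardware works
    hw_lines = [
        f"- {name}: {'online' if ok else 'OFFLINE (mock mode, do not pretend it works)'}"
        for name, ok in hardware.items()
    ]
    cap_line = "Available commands: " + ", ".join(capabilities)
    sensor_line = "Sensors: " + "; ".join(f"{k}={v}" for k, v in sensor_snapshot.items())
    return "\n".join([surface_note, *hw_lines, cap_line, sensor_line])
```

The same brain receives a different contract per call, so the WhatsApp user gets prose while the dashboard gets structure, from identical underlying reasoning.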


5. Swarm Coordination → OrchestratorAgent + SharedState Intent Queue

Thariq describes the transition from TodoWrite (keeping one model on track) to the Task Tool (coordinating multiple subagents). Tasks have cross-agent visibility, declared dependencies, and shared state. They enable real multi-agent workflows.

OpenCastor has this natively, because a physical robot has genuinely parallel concerns that can’t be serialized into a single reasoning thread.

The architecture:

TieredBrain.think()
    └── OrchestratorAgent.sync_think(sensor_data)
            ├── GuardianAgent        (safety veto)
            ├── NavigatorAgent       (where to go)
            ├── ObserverAgent        (what to see)
            ├── ManipulatorAgent     (what to touch)
            └── CommunicatorAgent    (what to say)

Each specialist agent publishes an Intent to SharedState — a thread-safe coordination bus with a priority queue and preemption logic:

@dataclass
class Intent:
    owner: str        # "navigator", "guardian", "communicator"...
    goal: str         # human-readable description
    priority: int     # 1 (low) → 5 (high)
    state: str        # "queued" | "active" | "paused" | "done"
    intent_id: str

No agent commands hardware directly. They declare intent. The OrchestratorAgent reads all published intents, resolves conflicts, and issues the single RCAN action that actually reaches the driver layer.

Guardian holds hard veto power. If GuardianAgent has set swarm.estop_active = True, the orchestrator will not execute a movement intent regardless of NavigatorAgent’s priority score. The veto isn’t a model judgment call — it’s a logic check in the orchestration layer before the RCAN message is even built.

This is more than the Task Tool pattern. It’s closer to a priority arbitration system with safety overrides. The distinction matters because in a physical environment, you can have genuinely conflicting valid goals: the user wants to move forward, but there’s an obstacle. The model shouldn’t reason its way through that conflict — the architecture should resolve it deterministically.


6. Context Building → Episode Search + Memory

Thariq traces Claude Code’s evolution from RAG (context was handed to Claude) to Grep (Claude finds its own context). The insight: given good search tools, Claude builds more relevant context than any static injection strategy.

OpenCastor has episode_search.py — a semantic search layer over the robot’s episodic memory. Each completed episode is stored with its action sequence, sensor readings, outcomes, and safety events. When the planner fires on a new task, it can query similar past episodes:

  • “What did I do the last time I was in a cluttered room?”
  • “What happened the last time this sensor pattern appeared?”
  • “Which navigation approaches succeeded on this type of floor surface?”

Rather than a fixed context window curated by the harness, the planner builds its own context from relevant past behavior. This compounds: a robot that’s navigated a specific home environment hundreds of times has richer episode memory to draw on than one making its first run.
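A toy version of that retrieval step: plain cosine similarity over stored episode embeddings. All names here are assumed, and the real `episode_search.py` is certainly richer, but the shape of the query is the same.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search_episodes(query_vec: list[float],
                    episodes: list[tuple[list[float], dict]],
                    top_k: int = 3) -> list[dict]:
    """episodes: (embedding, episode_record) pairs. Returns the k most similar
    past episodes for the planner to fold into its own context."""
    ranked = sorted(episodes, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    return [record for _, record in ranked[:top_k]]
```

The planner issues the query; the harness only provides the index. That is the Grep-over-RAG lesson restated for episodic memory.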

The ALMA consolidation layer — which I’ve written about separately — sits above this and promotes recurring successful patterns from episodic memory into the permanent knowledge base. Individual episodes are raw data. Consolidated patterns are generalizations. The distinction matters for how the planner uses each.


The Design Constraint Claude Code Doesn’t Have

Every parallel above is structurally real. But there’s one design constraint OpenCastor carries that Claude Code doesn’t need to worry about at all.

Irreversibility.

When Claude Code writes a bad file, you revert it. When it runs a bad shell command, you assess the damage and fix it. Even the worst case — deleting something important — is recoverable with backups. The execution environment has natural checkpoints and undo semantics built into every layer.

Physical actions don’t have undo. A robot that drives into someone doesn’t get to revert that commit. A servo that over-torques a joint can damage hardware that costs real money to replace. A wheelchair that accelerates unexpectedly in a medical setting is not a UX problem.

This is why OpenCastor has a safety architecture that operates independently of the model, the harness logic, and whatever the user asked for:

  • Hardcoded reactive rules that fire before any LLM call — obstacle too close, stop, regardless of instruction
  • RCAN scope enforcement that rejects commands from principals that lack the required role — the model literally cannot issue certain commands
  • Physical safety bounds declared in the RCAN YAML — the driver layer clips values at hardware limits; the LLM output is advisory, the bounds are enforced
  • Audit chain on every RCAN message — full forensic reconstruction of what was commanded, by what principal, at what time, and what the hardware response was
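The bounds-clipping bullet is simple enough to show concretely. The names here are illustrative; in practice the bounds would come from the RCAN YAML rather than an inline dict.

```python
def clip_action(action: dict[str, float],
                bounds: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Clamp every field the model emitted to its declared physical bound.
    The LLM output is advisory; the bounds are enforced."""
    clipped = {}
    for key, value in action.items():
        lo, hi = bounds.get(key, (float("-inf"), float("inf")))
        clipped[key] = max(lo, min(hi, value))
    return clipped
```

A hallucinated `linear_x` of 2.0 m/s leaves the driver layer as whatever the hardware spec declares safe, with no model cooperation required.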

The article’s framing — design tools shaped to the model’s abilities — is right. But for embodied AI, you have to add a second constraint: design tools shaped to the consequences of the model being wrong.

A software agent that hallucinates produces bad output.

An embodied agent that hallucinates moves.

That’s the difference, and it’s why the safety kernel isn’t a feature in OpenCastor. It’s load-bearing architecture.


Why This Matters Now: The Department of War

I want to be direct about something, because it’s the reason I’ve been building OpenCastor with the architecture I have.

This week, Anthropic’s CEO Dario Amodei published a public statement refusing a Pentagon ultimatum. The Department of War — that’s the official name now — demanded Anthropic remove safety guardrails from Claude and allow unrestricted military use of the model, covering all “lawful purposes.” Amodei refused. His two red lines: mass domestic surveillance and fully autonomous weapons — physical systems that select and engage targets without a human in the loop.

Hegseth’s response was to threaten to cancel Anthropic’s contracts and blacklist the company from military work. The White House followed by banning Anthropic from government systems entirely.

Amodei’s argument is the same argument OpenCastor’s architecture makes at the code level: today’s frontier AI systems are simply not reliable enough to power fully autonomous weapons without human oversight. His exact words: “cannot be relied upon.” The hallucination problem, the confidence calibration problem, the distribution shift problem — these aren’t theoretical. They’re why you have a reactive layer that fires before the LLM. They’re why the planner escalates when confidence drops below 0.3. They’re why Guardian holds a veto the orchestrator cannot override.

But here’s the part that keeps me up at night.

The Pentagon’s position is that legality is their responsibility as the end user, not Anthropic’s. “It’s not up to a contractor to make decisions about how its technology is used.” Their premise: the model is a tool, and tools don’t get to have red lines.

If that argument wins — and it may, given the political pressure being applied — then the safety constraints enforced at the AI company level disappear. Model providers get compelled to offer unrestricted APIs. The only remaining safety layer is whatever the runtime puts between the model output and the physical actuators.

That’s the layer OpenCastor is building.

When a government can compel an AI company to remove its safety constraints, the question of who controls the physical consequence layer of an embodied AI system stops being a theoretical architecture question. It becomes a political one. And the answer, right now, is: whoever controls the runtime.

OpenCastor’s safety kernel — hardcoded e-stop rules, RCAN scope enforcement, physical bounds that clip actuator commands regardless of model output, anti-subversion logic that prevents the AI from disabling its own constraints — isn’t designed for the world where AI companies are free to maintain their own red lines. It’s designed for the world we may actually be entering: one where those red lines are under pressure from the top.

The reactive layer fires before the LLM call. The RCAN router rejects commands that lack the right role scope. The driver layer clips values at physical safety bounds regardless of what the planner reasoned. None of these constraints depend on what the model provider allows. None of them can be removed by changing an API key.

I’m not naive about what this means. An open-source robotics runtime isn’t going to stop a state actor. But the architecture matters as a statement of values and as a template. If we build embodied AI systems where the safety kernel is transparent, auditable, and independent of any single provider’s terms of service, we at least create a baseline that’s harder to quietly remove than a checkbox in a contract.

Amodei put it plainly: AI that takes humans entirely out of the loop on lethal decisions isn’t something today’s technology can safely do. The failure modes are real and documented. Putting a hard stop between model output and physical consequence isn’t conservatism — it’s engineering honesty.

That stop is what the reactive layer is for.


The Framing That Actually Fits

Both Claude Code and OpenCastor are agent harnesses that solve the same class of problems: action space construction, progressive disclosure, elicitation across surfaces, multi-agent coordination, and evolving tooling as model capability grows.

The difference isn’t the component list. Both have tool registries, agent layers, memory systems, and communication channels.

The difference is the consequence structure of the environment they operate in. Claude Code’s consequence structure is digital and mostly reversible. OpenCastor’s consequence structure includes the physical world — and physical mistakes don’t have a revert button.

That’s what makes embodied AI harness design hard: not different in kind, but harder in degree. You apply all the same principles. Then you add a safety kernel, because the model will eventually be wrong, and being wrong in a physical environment is a different category of problem than being wrong in a text editor.

Thariq is right: designing your agent’s tools is an art, not a science. You experiment, you read outputs, you see like the agent.

For robots, you do all of that — and then you also make sure it can’t drive off the table.



How side projects like this stay alive

A friend once asked me how I keep building things outside of work without burning out. The honest answer: I treat side projects like modular services, not monoliths. Small experiments. Clear stopping points. No obligation to ship. No guilt if abandoned. The goal isn’t completion — it’s exploration.

The more important factor is emotional risk diversification. If your entire sense of growth lives inside your job title, you’re emotionally leveraged. When your work identity and creative identity are fused into one pipeline, any slowdown feels existential. Side projects create independent progress, independent learning, independent momentum. If work is stable but slow, I’m still growing. If work is intense, I can dial projects down. That diversification is psychological stability.

The filter I use for choosing what to build: tension. Either “I wish this existed,” or “what would it look like if…?” OpenCastor started as the second kind — what would it look like if robotics networking started from safety requirements rather than retrofitting them later? That question still has more surface area than I’ve explored. Which is how a project stays alive past the initial shipping rush.


OpenCastor is open source. Install in 10 seconds: curl -sL opencastor.com/install | bash. Source on GitHub.