Facts a Third Party Can Check: A Verifiable Evidence Layer for AI That Acts

Last week a government, acting on a phone call, ordered Anthropic to pull its two most capable models — Fable 5 and Mythos 5. No published finding. A verbal notice of a “narrow jailbreak.” As far as I can determine, and I have looked, this is the first time any state has ordered a frontier lab to take a large language model offline — the first export-style ban on access to an LLM. I do not think the goal was wrong. I think the process was, and that is the fixable part, the part worth building against before its shape hardens into precedent. Precedents are what compound: once set, they decide cases nobody had imagined when they were set.

That was my first reaction, written the day it happened. I have spent the days since trying to argue myself out of it, and I cannot. Here is the long version, because the short version keeps being read as a complaint about a decision, and it is not. The distinction between objecting to the decision and objecting to the mechanism behind it is the entire essay. It is an argument about what makes a decision answerable at all, and about a window, now closing, in which we still get to choose whether the machines we are building to act in the world will leave a record that anyone but their operators can read.

The Question Underneath the Recall

Blocking an unsafe deployment is a thing a government is supposed to be able to do. If a model can be reliably steered into producing something genuinely dangerous, and the people running it cannot or will not close the hole, there is a public interest in stopping the deployment until it is fixed. It is the same logic as grounding an aircraft type, pulling a drug, recalling a car. Nobody serious argues the state should have no authority here, and I do not.

So this is not a free-speech argument and not a leave-the-labs-alone argument. The objection is narrower and, I think, more durable: it is to the mechanism. And underneath the mechanism is the oldest question there is about power, which is not who holds it but how the rest of us know whether it has been used well.

Take the recall apart and look at what was not in the box. No published trigger — nobody can point to a written rule and say this model crossed this line. A narrow jailbreak is a category, not a criterion. No written basis: a verbal notice cannot be examined by anyone who was not on the call. No due process — no stated standard to meet, no record of the evidence, no defined path to contest the finding or to demonstrate the hole was closed. The decision and its explanation were the same event, and that event left no artifact behind.

For a deaf engineer the verbal part lands with particular weight. I have spent my life insisting that the load-bearing parts of a conversation be written down, because the spoken version is the one that vanishes and the one I cannot independently verify. The written record is the great equalizer — it lets a person who was not present, or could not hear, hold the same facts in their hands as everyone else. A regulatory action that exists only as speech exists only for the people in the room; it is, in the most literal sense, inaudible to the rest of us.

None of those three absences is about whether the call was right. The model may well have carried a real and serious flaw; I have no way to know, and that is the point. When the basis is unwritten, “was the government right?” becomes unanswerable for everyone outside the decision. You are left trusting the operator of the process, the one thing we have learned, over centuries and at great cost, not to do anywhere else that matters.

That long lesson is worth stating once, in full, because the rest of this essay applies it to a new kind of actor. Every durable advance in the legitimacy of power has taken the same form: forcing the powerful to make a record — to commit, in advance and in writing, to the rule they will be judged by, and to leave behind an account someone other than themselves can read. The written law replaced the remembered custom only the elders could recite. The trial replaced the ordeal whose outcome only the administering authority could interpret. The public ledger replaced the private word of the merchant about what was owed. The published standard replaced the guild secret. In each case the move is identical, and it is not toward kindness. It is a move away from trust, toward verifiability — the thing you reach for precisely because trust has proven insufficient. Legitimacy, in the only sense that survives contact with self-interest, is not the feeling that power is being used well. It is the standing ability of the governed to check.

This is what I mean when I say the recall was the wrong process, not the wrong goal. The goal — stop the dangerous deployment — is one we share. The process reached back past every one of those hard-won advances and decided the case the old way: by the unrecorded word of the authority, examinable by no one outside the room. That is not a small procedural lapse. It is a reversion, in a domain that cannot afford one, to the form of power accountable governance was built to replace.

Here is the detail that turned a bad week into something I had to write. The recall came ten days after Executive Order 14409 established a voluntary, no-preclearance framework for frontier models — a deliberate choice not to make labs get permission before they ship. The ink was still wet on a policy that said we will not gate you up front, and then a model was gated, by phone, with no written basis at all.

A Voluntary Policy and an Unwritten Power

You cannot operate a voluntary, no-preclearance framework and an unwritten recall power at the same time and call the combination a system. The space between those two postures is where everyone who builds is made to stand. When the rule you will be judged against does not exist in writing until after you have been judged, you are not operating under a policy. You are operating under a mood — and a mood cannot be anticipated, contested, or checked. It can only be suffered.

Anthropic’s own framing of what should replace this is, I think, exactly right: a government should be able to block an unsafe deployment, but through a statutory process that is transparent, fair, clear, and grounded in technical facts. Not one of those four words is about values. Every one is about process and evidence, which dissolves the argument everyone expects to have. The values are not in dispute — the agency, Anthropic, the downstream user, me, all agree dangerous deployments should be stoppable. What is missing is the evidence layer: the common, checkable set of facts a recall could point to, that a lab could have anticipated, that a third party could later examine to judge whether the call was sound. A society that agrees on its ends but has no common instrument for checking its means has only postponed the moment when the disagreement becomes unanswerable.

There is a second-order harm here, worse than the unpredictability. Anthropic had publicly disclosed that perfect jailbreak resistance is impossible — a statement true of every model from every lab, including the small offline ones I run on my own hardware, and exactly the candor a trust-based framework is supposed to reward. And that candor appears to be what framed the concern that pulled the models. The disclosure did not cause the vulnerability; it described a property all such systems share. But it was the visible thing, so it became the cited thing.

Stay with the structure rather than guessing at intentions, because the structure governs systems over time regardless of who operates it. A regime in which describing a true limitation can be cited as grounds for a recall, while saying nothing cannot, runs its incentives against disclosure and selects, over many actors and many years, for less of what it ought to want most. And that is the ethical core: opacity is not a neutral default that transparency improves upon. Opacity is a choice with victims — the people downstream who must trust what they cannot inspect — and an enforcement regime with no published basis does not merely fail to discourage that choice. It rewards it, slowly, at everyone’s expense, for as long as the architecture stands.

So here is where the diagnosis turns buildable. What makes governance fair is not that the people enforcing it share our values. It is that the facts they act on can be checked by someone who is not them. A recall grounded in a fact a third party can verify is categorically different from one grounded in a phone call: the first can be argued with, the second can only be obeyed — the difference between a citizen and a subject.

Facts a third party can check are not a courtesy a regulator extends when it feels generous. They can be infrastructure — open, shared, signed, the same for the agency and the lab and the deaf engineer running a model on a single-board computer. That is the thing I have spent the last stretch building. I did not arrive at it from policy; I arrived at it from making robots that act in the physical world and needing to prove, to someone other than myself, what they did and why.

Why the Case Cannot Rest on Trusting Intentions

The rest of this essay refuses, on principle, to argue from anyone’s good faith. Anthropic is, by my reading, the clear leader in frontier large language models and in the products built on them. Capability of that kind is dual-use in the deepest sense — the same system that drafts a safety case can draft an attack, the same model that tutors a child can coach a chemist, the same fleet that inspects a field can survey one.

Now hold two facts together: that capability is this powerful, and it is this concentrated in a clear leader. That combination is exactly the condition under which no governance founded on the good faith of whoever holds the instrument can be called sound — not because the holder is untrustworthy, but because soundness cannot be a function of the holder at all. A governance arrangement that works only so long as the most capable party stays well-intentioned is a hope with a logo on it, and hopes do not survive a change of ownership, incentive, or administration.

So I will not build the case on what any actor wants or will do. A verifiable evidence layer makes power answerable whether or not the party holding it is well-intentioned. Anthropic is the reason the case must rest on neutral, cooperatively adopted, checkable infrastructure rather than on anyone’s character — especially the character of whoever turns out to be best at this. Trust the company if you like; do not build the architecture on it.

What a Guardrail Has to Be Before It Deserves the Name

Most of what gets called a guardrail for an AI system is a filter wrapped around a model: a prompt goes in, the model proposes an output, and something downstream decides whether to let it through. That is adequate for a chatbot, where the worst case is usually a bad sentence. It is not adequate for AI that acts: a robot arm, a dispatcher agent, anything that closes a loop on the physical world or on someone else’s account. When the output is a motion command, “block the bad sentence” is the wrong frame: by the time you inspect the action it has already been computed.

This is the hinge on which the ancient question reopens. The apparatus of accountability was built around the fact that a person can be made to give an account: to be named, asked under what authority and with what confidence they acted, and have their deeds entered in a record that outlives the moment. A machine that acts autonomously slipped into the world without any of that attached. It can move an object, deny a claim, dispatch a vehicle — and unless we build the apparatus in deliberately, it does so leaving behind nothing but a log its own operator can edit. The question is whether we will make these machines answerable the way we eventually made every other form of power answerable, or let them become the one consequential actor in the modern world no one outside the room can check. That choice is open now and will not stay open, because the systems are being deployed and their conventions set now, by default if not by decision.

So before I describe any machinery, here is the bar. A guardrail you can rely on for acting AI has four properties: it must be structural, attributable, tamper-evident, and independently replayable. Most of what is sold as a guardrail has one or two. Together they are simply the old apparatus of accountability — the rule fixed in advance, the named and confident decider, the unalterable record, the account a stranger can verify — translated into something a machine can carry.

Structural means the safety check is enforced before the action, not noticed after it. I build this as a kernel/userland split. The probabilistic AI — the vision-language-action model, the reasoning model, whatever is doing inference — lives in userland. It does not get to do anything; it only gets to request. “Pick up the cup” is a request. Beneath it sits a deterministic layer that validates that request against hard constraints: physics, the robot’s own envelope, local safety law. The brain requests a trajectory; the kernel grants it only if it complies. A filter that runs after the model is a layer the model can, in principle, talk its way through — and a system clever enough to be useful is clever enough to be persuasive. A layer that runs below the model is one it cannot reach. In RCAN this appears at the wire: a command from a principal that lacks the required scope is rejected at the protocol layer — not by the application, not by the model, by the spec. The deterministic part is small and dumb on purpose, because that is what a human can audit. The part on which everyone’s safety rests must be legible to the least sophisticated reviewer, not the most.
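A minimal sketch of that split, under stated assumptions: the names Envelope, TrajectoryRequest, and kernel_grant, and the joint-angle and step limits, are hypothetical illustrations and are not taken from the RCAN spec. The point it shows is only the shape: userland can build a request, and a small deterministic function with no model, no randomness, and no I/O decides whether it is granted.

```python
from dataclasses import dataclass

# Hypothetical names and limits, for illustration only (not RCAN identifiers).


@dataclass(frozen=True)
class Envelope:
    joint_min_deg: tuple   # lower bound per joint, in degrees
    joint_max_deg: tuple   # upper bound per joint, in degrees
    max_step_deg: float    # largest allowed change per joint between waypoints


@dataclass(frozen=True)
class TrajectoryRequest:
    waypoints: tuple       # sequence of joint-angle tuples, in degrees


def kernel_grant(req: TrajectoryRequest, env: Envelope):
    """Deterministic check run before any motion; returns (granted, reason)."""
    if not req.waypoints:
        return False, "empty trajectory"
    prev = None
    for i, wp in enumerate(req.waypoints):
        if len(wp) != len(env.joint_min_deg):
            return False, f"waypoint {i}: wrong joint count"
        for j, angle in enumerate(wp):
            if not (env.joint_min_deg[j] <= angle <= env.joint_max_deg[j]):
                return False, f"waypoint {i}: joint {j} outside envelope"
            if prev is not None and abs(angle - prev[j]) > env.max_step_deg:
                return False, f"waypoint {i}: joint {j} step too large"
        prev = wp
    return True, "granted"
```

Everything the function can say is one of a handful of fixed strings, which is the property the paragraph above asks for: a reviewer can read the whole decision surface in one sitting.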

Attributable means every safety-critical action carries, in the record, who decided it and how sure they were. For a human-programmed weld at a fixed coordinate that is dull. For an AI inference it is the whole game: when a model decides “the object at position X is safe to interact with,” that decision rode on a confidence distribution, a specific model version, an inference context. So the RCAN audit record stamps three things onto each AI-driven action — model identity, confidence score, human-in-the-loop gate status. Without them, “the AI did it” is the end of the inquiry, and an actor at whom inquiry ends is, by definition, beyond accountability. With them you can ask: which model, how confident, and was a human supposed to be in the loop.

Tamper-evident. A log file is not a forensic record. It is a text file, editable by anyone who can write to disk — and a sophisticated adversary, given a plain hash-per-line scheme, can recompute valid hashes after rewriting history. What you need is a keyed chain. RCAN uses an HMAC-SHA256 append-only chain keyed with a session-bound secret that lives in memory and is never written to the log. Even with full access to the JSONL output you cannot forge a valid record without the key. A record the operator can quietly revise is not an account; it is a story. The keyed chain is the difference between a log that describes what happened and a record that can establish it when the two diverge — which, in a warehouse shared with people or a medical environment, is exactly when the record matters.

Independently replayable is the property most systems quietly skip, and the one I care about most, because it decides who holds the power. The party checking the evidence must not have to be the operator, and must not have to be the model that acted; conformance has to be re-checkable from the signed facts alone. If verifying a deployment requires taking the operator’s word, or re-running the frontier model that made the call, you have not built verification — you have built a more elaborate way of taking someone’s word for it, and a sophisticated way of taking the powerful at their word is more dangerous than a naive one, because it borrows the credibility of verification while delivering none of its substance. Standards say what to demonstrate; RCAN provides the plumbing that makes a demonstration something a stranger can re-run.

Those four properties are the bar — and an argument that the bar itself is what fairness in machine governance requires. Not a nice-to-have. The minimum below which the word governance describes a hope rather than a fact.

The Protocol and the Place That Holds Identity

If fairness comes from facts a third party can check, then I owe you the actual layer, not a sketch. There are two pieces: a wire protocol for machines that act, and the institution that holds the identities, because a record is only as trustworthy as the names attached to it, and names require somewhere neutral to live. I built both, mostly by pairing with Claude Code, with a SO-ARM101 arm named bob on the bench. Everything below is running, not proposed — and that distinction is load-bearing: an argument about how power should be made answerable carries weight in proportion to whether the author has built the thing, on hardware, where the claims can fail.

RCAN, the protocol for AI that acts

Start with tamper-evidence, because it is load-bearing for the rest. Every safety-relevant event goes into an append-only audit chain. Each entry’s chain_hash is HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash). The 32-byte chain_secret lives in memory and is never written to the log. Someone with full read and write access to the file on disk still cannot forge a consistent entry or quietly delete one, because they do not hold the secret that links the chain. The log proves its own integrity to a verifier who was never on the machine.
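The chain formula above can be sketched in a few lines of standard-library Python. Assumptions, all mine rather than the spec's: the all-zero genesis value, the use of sorted-key JSON as the payload encoding, and the entry layout with "event" and "chain_hash" fields. In this sketch the verifier must be handed the secret; how a real deployment shares or escrows it is outside what the sketch shows.

```python
import hashlib
import hmac
import json
import secrets

GENESIS = b"\x00" * 32  # assumed starting value for prev_chain_hash


def chain_hash(secret: bytes, prev: bytes, payload: bytes) -> bytes:
    """HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash)."""
    payload_hash = hashlib.sha256(payload).digest()
    return hmac.new(secret, prev + payload_hash, hashlib.sha256).digest()


def encode(event: dict) -> bytes:
    # Assumed payload encoding: sorted-key JSON as UTF-8.
    return json.dumps(event, sort_keys=True).encode("utf-8")


def append(log: list, secret: bytes, event: dict) -> None:
    """Append one event, linking it to the previous entry's chain hash."""
    prev = bytes.fromhex(log[-1]["chain_hash"]) if log else GENESIS
    log.append({"event": event,
                "chain_hash": chain_hash(secret, prev, encode(event)).hex()})


def verify(log: list, secret: bytes) -> bool:
    """Recompute every link; any edit or mid-chain deletion breaks the chain."""
    prev = GENESIS
    for entry in log:
        expected = chain_hash(secret, prev, encode(entry["event"]))
        if not hmac.compare_digest(expected.hex(), entry["chain_hash"]):
            return False
        prev = expected
    return True
```

One limit worth naming: a chain like this catches rewritten entries and entries removed from the middle, but dropping entries from the tail leaves a shorter chain that still verifies. Detecting that needs an extra anchor, such as an entry count or head hash recorded somewhere else, which this sketch does not include.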

Attribution lives in §16, the AI-accountability layer. Every safety-critical action carries the model identity, the model’s confidence, and the human-in-the-loop gate status. A real record reads: confidence 0.91, model Qwen2.5-7B, gate satisfied. That is not a dashboard metric; it is in the signed record, the kind of fact that can be held against the actor later.

The gates run before dispatch. A command that fails the ConfidenceGate returns CONFIDENCE_GATE_FAIL and MUST NOT be dispatched — the actuator never moves. A command that trips a human-in-the-loop rule emits PENDING_AUTH and waits for an OWNER or CREATOR authorization before anything happens. The rule exists prior to the deed, the way a law must exist before the act it judges; a constraint imposed only after the actuator has moved is not a guardrail, it is a eulogy.
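A sketch of gates that run before dispatch, with the attribution fields carried on the command itself. The outcome names CONFIDENCE_GATE_FAIL and PENDING_AUTH and the OWNER and CREATOR roles come from the text; everything else is an assumption made for illustration: the Command fields, the DISPATCH outcome name, the order in which the two gates run, and the 0.7 threshold, which is borrowed from the checklist value quoted later in this essay.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    DISPATCH = "DISPATCH"  # hypothetical name for "allowed through"
    CONFIDENCE_GATE_FAIL = "CONFIDENCE_GATE_FAIL"
    PENDING_AUTH = "PENDING_AUTH"


@dataclass(frozen=True)
class Command:
    action: str
    model_id: str                         # which model proposed the action
    confidence: float                     # the model's confidence in it
    needs_human: bool                     # a human-in-the-loop rule applies
    authorized_by: Optional[str] = None   # role that approved, if any


AUTHORIZING_ROLES = {"OWNER", "CREATOR"}


def gate(cmd: Command, threshold: float = 0.7) -> Outcome:
    """Decide the outcome without touching the actuator."""
    if cmd.confidence < threshold:
        return Outcome.CONFIDENCE_GATE_FAIL
    if cmd.needs_human and cmd.authorized_by not in AUTHORIZING_ROLES:
        return Outcome.PENDING_AUTH
    return Outcome.DISPATCH


def dispatch(cmd: Command, actuate: Callable[[str], None]) -> Outcome:
    """The actuator callback is reached only when every gate has passed."""
    outcome = gate(cmd)
    if outcome is Outcome.DISPATCH:
        actuate(cmd.action)
    return outcome
```

The design choice being illustrated is that the only call to the actuator sits behind the gate result, so a failed or pending command has no code path to motion.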

For the EU AI Act Article 50 disclosure obligation, AI-generated output is watermarked with rcan-wm-v1 HMAC tokens — regex-detectable, so a downstream party can flag machine-authored content without trusting its producer.

Signing is hybrid. pqc-hybrid-v1 is Ed25519 plus ML-DSA-65 (FIPS 204), and both halves must verify or the signature is rejected. ML-DSA-65 is not free: a 1,952-byte public key and a 3,309-byte signature, against Ed25519’s 32-byte public key and 64-byte signature. I carry that cost deliberately. The threat is the signature-side cousin of harvest-now-decrypt-later: a robot registered in 2026 may still be running in 2032, and an adversary can collect today’s classical public keys and signed records and forge new signatures under those keys once quantum hardware arrives. The post-quantum half is insurance on identities meant to outlive the assumptions under which they were signed, which is fitting, because the accounts that matter most are often called for long after the act, by people who were not there.

Transports are tiered, and the bottom tier is where the philosophy meets the body. A full RCAN message is 400–800 bytes; over LoRa at SF12 that is roughly 25 seconds in the air — not an emergency stop. So there is a minimal binary ESTOP-only frame that transmits in about one second. One honest note: the prose around this says 32 bytes; the implementation asserts 40. I am leaving both numbers visible rather than rounding the discrepancy away, because the integrity of an evidence layer is exactly the willingness to show the seam.
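The airtime figures can be sanity-checked with the standard LoRa bit-rate relation, bit rate = SF × (BW / 2^SF) × CR. The text gives only the spreading factor; the 125 kHz bandwidth and 4/5 coding rate below are my assumptions, and the calculation ignores preamble, header, CRC, low-data-rate optimization, and the fragmentation a message larger than one LoRa frame would need, so real airtime will be somewhat higher.

```python
def lora_bitrate_bps(sf: int, bandwidth_hz: float, coding_rate: float) -> float:
    """Nominal LoRa bit rate: SF * (BW / 2**SF) * CR."""
    return sf * (bandwidth_hz / 2 ** sf) * coding_rate


def airtime_s(payload_bytes: int, sf: int = 12,
              bandwidth_hz: float = 125_000.0,   # assumed, not from the text
              coding_rate: float = 4 / 5) -> float:  # assumed, not from the text
    """Payload-only time on air, ignoring preamble and header overhead."""
    return payload_bytes * 8 / lora_bitrate_bps(sf, bandwidth_hz, coding_rate)


for size in (32, 40, 400, 800):
    print(f"{size:>4} bytes: {airtime_s(size):5.1f} s")
```

Under those assumptions the nominal rate is about 293 bits per second, which puts the 32- and 40-byte frames at roughly 0.9 and 1.1 seconds and a 400 to 800 byte message at roughly 11 to 22 seconds before overhead. That is the same order as the figures quoted above, and it shows that the 32-versus-40 discrepancy does not change the conclusion about which frame can serve as an emergency stop.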

The safety semantics are non-negotiable by design. Local safety always wins. ESTOP bypasses every authorization check at every layer — no matter what role a token claims, owner, creator, cloud operator, an emergency stop is never gated. Everything else in this system is built to make power answerable; the stop is built to answer to no one, because that is the one place where deference would be the failure.
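The rule is small enough to sketch. The role names follow the text; the scope table and the command names other than ESTOP are hypothetical, chosen only to make the example runnable.

```python
from typing import Optional

# Hypothetical scope table: the roles come from the text, the scopes do not.
ROLE_SCOPES = {
    "OWNER": {"move", "configure"},
    "CREATOR": {"move", "configure", "update"},
    "CLOUD_OPERATOR": {"move"},
}


def authorize(role: Optional[str], command: str) -> bool:
    """ESTOP is checked first and never consults the role or scope table."""
    if command == "ESTOP":
        return True
    return command in ROLE_SCOPES.get(role, set())
```

The ordering is the point: the stop check comes before any lookup, so a missing, expired, or unrecognized role cannot stand between a person and the stop.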

The Robot Registry Foundation, the institution

A protocol needs somewhere neutral to anchor identity, and that is the Robot Registry Foundation. Every robot receives an RRN and every model an RMN — sequential, twelve-digit, zero-padded, permanent identifiers that survive a hardware swap or an OS reinstall. bob keeps RRN-000000000001 no matter what I do to the Pi underneath it. There is an RCN for components and an RHN for the harness. Each robot resolves through a Robot URI — rcan://registry.rcan.dev/manufacturer/model/version/device-id — which is DNS meets OAuth meets ROS: a name that resolves, an identity you can authenticate.
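A sketch of the two naming shapes described above: the twelve-digit zero-padded RRN and the four-segment Robot URI. The helper names are hypothetical, and the validation rules (sequence numbers starting at 1, exactly four non-empty path segments) are assumptions inferred from the examples in the text, not registry behavior.

```python
import re
from urllib.parse import urlsplit

RRN_PATTERN = re.compile(r"RRN-([0-9]{12})")


def format_rrn(n: int) -> str:
    """Render a sequence number as a twelve-digit, zero-padded RRN."""
    if not 0 < n < 10 ** 12:
        raise ValueError("sequence number must fit in twelve digits")
    return f"RRN-{n:012d}"


def parse_rrn(text: str) -> int:
    """Return the sequence number, rejecting anything not exactly RRN-<12 digits>."""
    match = RRN_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not an RRN: {text!r}")
    return int(match.group(1))


def parse_robot_uri(uri: str) -> dict:
    """Split rcan://registry/manufacturer/model/version/device-id into parts."""
    parts = urlsplit(uri)
    if parts.scheme != "rcan":
        raise ValueError("expected the rcan:// scheme")
    segments = parts.path.strip("/").split("/")
    if len(segments) != 4 or not all(segments):
        raise ValueError("expected manufacturer/model/version/device-id")
    manufacturer, model, version, device_id = segments
    return {"registry": parts.netloc, "manufacturer": manufacturer,
            "model": model, "version": version, "device_id": device_id}
```

For example, format_rrn(1) yields the identifier bob holds, and parse_robot_uri on an address under registry.rcan.dev returns the registry host alongside the four path fields.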

Registrations are Ed25519-signed, and the signatures are re-verified server-side on a live endpoint — not parsed and trusted, re-checked. The private signing key never leaves the device; only the public key goes to the registry. What the registry holds is replayable attestation bundles: enough signed facts for anyone to re-derive the conclusion themselves. It does not ask to be believed; it hands you what you need to stop having to believe it.

On top of that sit the first five EU AI Act-aligned endpoints, each tied to an RCAN section and, where one is named, to an Article: the fundamental rights impact assessment, FRIA (§22 / Art. 27), Safety Benchmark (§23), Instructions for Use (§24 / Art. 13(3)), Incidents (§25 / Art. 72 post-market), and EU Register submission (§26 / Art. 49). A deployer can generate a signed FRIA from live conformance data instead of writing a PDF nobody can check. That is the difference between an attestation and a brochure.

And here is the line written verbatim into the spec README: “conformance is self-asserted via signed bundles and independently replayable… Conformance is not certification.” I am not standing up a body that blesses robots; I am standing up a place that holds signed facts anyone can re-run. The reason it must be neutral is also in the spec: a registry controlled by one company or one government is worse than no registry at all. A captured registry launders the operator’s word into the appearance of independent proof, and a counterfeit of accountability is more corrosive than its plain absence, because it disarms the suspicion that would otherwise keep the powerful honest. The history of accountable institutions is in large part the history of this exact failure: the audit that answers to the audited, the court that answers to the crown, the standard written by the firms it governs. The neutrality is not an ornament on the design. It is the design.

The Receipts, and Who Can Read Them

Architecture is cheap; anyone can draw a diagram of an accountable system. The test is whether the thing exists, runs, and can be checked by someone with no stake in my being right. Here are the receipts, beginning with the part that does not run in a browser.

There is a robot named bob. He is a SO-ARM101, a six-degree-of-freedom arm, driven by a Raspberry Pi 5 with a Hailo-8 accelerator doing 26 TOPS on-device, registered as RRN-000000000001 — the first registration in the Robot Registry, the reference robot. When I say something works, I mean it works on bob; bob is the production environment. He holds the highest conformance tier the registry defines, L5, and completed the first registry-attested spatial-eval submission. Not a slide about a robot. The robot.

The conversation neither of us scripted

On March 15 at 16:52 PDT, bob and a second arm — Alex, RRN-000000000005, another Pi 5 — planned a sorting task together. Partway through, Alex hit a real fault: its shoulder_lift motor, an STS3215 servo, threw a voltage fault. Real hardware, roughly fifteen dollars to replace. From that fault Alex reasoned on its own that it could not “stack” objects the way the plan assumed, and proposed “push/slide” instead. Neither response was scripted; I did not feed Alex a line about the motor. The motor faulted, Alex noticed it could not do the thing, and it adapted.

This is exactly the kind of moment the old apparatus of accountability was built for, and the kind that, without a record, vanishes. They ran a live Protocol 66 checklist while they worked: local_safety_wins set true, a ten-second watchdog, an action confidence gate at 0.7, ESTOP active. The measured conformance came out to 87 percent. The other 13 percent is the whole point: there is no dedicated safety MCU, no physical ESTOP button, and the force and thermal sensors are present but unwired. Those gaps are itemized, by name, in the attestation. I did not round 87 up to “fully compliant.” A number that names its own shortfall treats the reader as a party entitled to check, rather than an audience to be reassured.

The part you can run yourself

The hardware proves it is real; the demo layer proves it is checkable, the harder and more important claim. The ClaudeFarms / FieldOps demo runs Anthropic’s own managed-agent fleet: one Opus-4.8 coordinator — agent_017YoQYZ81CC8VsLrqkeg8Dx, model claude-opus-4-8 — fanning out to a drone and two rovers running on Haiku-4.5. Nothing in that fleet actuates without a farmer-approved, host-signed, registry-verifiable RCAN order. The approval, the signature, and the registry check are not optional steps a careful operator might add; they are the only path to actuation — the difference between a safety property you hope for and one the system cannot route around. The most capable fleet in the room actuates only through facts a cheap verifier can check, and that ordering is the design, not a courtesy.

The move that matters is the one a reviewer checked live, with their own hands. There is an offline judge: python3 bin/judge_verify_local.py. It needs python3, the cryptography library, the proof JSON, and a public key. That is the entire dependency list. No robot. No gateway. No private key. You hand it the signed envelopes from a real run and it exits 0: three authorized actuations whose Ed25519 signatures verify, and one refusal (HTTP 403, for an unregistered key ID, or kid) correctly turned away. Then you hand it a deliberately forged copy of the same proof, and it catches it: “signature does NOT verify (payload differs from what was signed).” A cheap script, beholden to no operator, re-derives the truth from signed facts alone.

That command line is the entire thesis. It is the least-powerful party in the room re-checking the most powerful one — and winning, not because it is stronger, but because the facts were structured so that strength is not what the check requires. The operator with the robot, the lab with the frontier model, the state with the recall authority — none of them is the arbiter of what happened. A few hundred lines of Python and a public key are.

And conformance is replayed, not asserted. The in-browser gateway agrees with the real published packages — robot-md-gateway 0.5.0a3, rcan 3.4.1, robot-md 1.10.4 — 8 of 8 on decisions, 4 of 4 on confidence, 9 of 9 on trust-lifecycle. The canonical JSON is byte-identical across Python and TypeScript: 12 of 12 fixture cases re-derived byte-for-byte. The live proof panel shows green and red lamps a viewer can re-check, not screenshots of lamps. The demo site returns HTTP 200 right now.
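Byte-identical canonical JSON is what lets two independent implementations sign and verify the same facts. A minimal sketch of the idea, assuming a simple rule set of my own choosing (sorted keys, no insignificant whitespace, UTF-8 output, NaN and infinity rejected); this is not claimed to be the canonicalization RCAN uses, and it is not a full implementation of RFC 8785, whose number formatting rules are stricter.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize so that logically equal objects produce identical bytes."""
    return json.dumps(
        obj,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no optional whitespace
        ensure_ascii=False,      # emit UTF-8 directly instead of \u escapes
        allow_nan=False,         # NaN and infinity are not valid JSON
    ).encode("utf-8")


def digest(obj) -> str:
    """SHA-256 over the canonical bytes, as a hex string."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


a = {"model": "m", "confidence": 0.91, "gate": {"hitl": True, "status": "satisfied"}}
b = {"gate": {"status": "satisfied", "hitl": True}, "confidence": 0.91, "model": "m"}
print(canonical_bytes(a) == canonical_bytes(b))
```

The two dictionaries are built in different key orders and still serialize to the same bytes, so a signature or digest computed over one verifies against the other. A cross-language fixture suite of the kind described above is what confirms that a second runtime follows the same rules.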

The honest limits are surfaced in the product, not in a footnote. The actuation in the public demo is simulated and labeled as simulated; the operator key is local and illustrative; a file — simulated.json — is the exact real-versus-simulated manifest. Candor is a feature here, not a confession: telling the reader plainly what is real and what is staged costs the builder nothing he should be afraid to pay.

How it got built

I built most of this solo, by pairing with Claude Code. The lifetime number is roughly 8,803 commits across about 123 repositories, roughly two-thirds AI-co-authored. One four-day sprint shipped a spec, three runtimes, one firmware, and one docs site, running at 98 percent cache reads. I will not dress that up with a lines-of-code count: 96 percent of those lines were machine-generated, and reporting volume as if it were my labor would be dishonest.

The edge work is the reason the least-powerful verifier in my design is a small offline model rather than a rhetorical flourish. LiveCaptionsXR, the captioning tool I build for people who navigate the world the way I do, runs on a model footprint of roughly 430M parameters, and the language model beneath it, LFM2-1.2B, fits in about 0.75GB. A model that small runs on hardware a single person owns, beholden to no operator and no cloud, and is exactly the thing that can sit at the end of the chain and re-check a deployment it had no part in producing. The verifier does not have to be the biggest model in the room. It has to be the one with no stake in the answer.

The honesty of the receipts is the argument, not a frame around it. An 87 percent that names its own gaps, a forged proof caught in public, a metric rejected for being flattering — that is precisely the behavior a process without published triggers and written basis structurally punishes, and the difference between a field you can govern and one you can only fear is whether candor is cheap or expensive.

Why Rivals Can Agree on This

The hard question for any shared standard is not whether it is good. It is how a good standard ever gets adopted by parties who do not trust one another and were not built to cooperate. Governments are rivals; frontier labs are competitors; the downstream user trusts neither. If the case for an evidence layer depended on those parties wanting the same things, it would be dead on arrival — because no sound architecture can rest on anyone’s good faith.

So it must rest on something else: the thing durable order among rivals has always rested on. Shared infrastructure spreads not because parties guess one another’s intentions correctly, but because cooperating states and competing firms converge on a common, checkable standard worth more to each of them than fragmentation is — and because a fact anyone can verify needs no shared values and no mutual trust, only a shared instrument of verification. Two parties who agree on nothing else can still agree on what a meter is, what a signature proves, what a test measures. Facts are neutral in exactly the way values are not.

The precedents are not thin. The clearest is the European Union — standing proof that many states, with divergent interests and a long history of conflict, can reach a shared and checkable regime by agreement and mutual recognition rather than by conceding sovereignty to a benevolent center. The single market did not require members to share aims; it required them to share standards. CE marking lets a product made under one state’s inspection be sold in all of them, because the mark refers to a checkable conformity, not to affection between governments. CEN, CENELEC, and ETSI turn contested questions of “is this safe, is this interoperable” into written specifications a party in any member state can verify. GDPR made a checkable obligation portable across borders, and the AI Act is the same instinct reaching into this very domain: defined obligations, declared in advance, attached to risk tiers, recorded in a public register. The EU is not a story about rivals learning to trust each other; it is a story about rivals agreeing on instruments of verification so that they would not have to.

And it is not only Europe, and not only states. ICANN holds the internet’s name system through multistakeholder governance precisely because no single government could be trusted with it and fragmentation would have broken the network for everyone. ISO and IEC turn “does this conform” into documents a buyer in one country and a seller in another can both check. ICAO made civil aviation safe enough to cross every border on earth by converging rival aviation authorities on verifiable standards, and the maritime safety regime did the same for the sea. The metric system replaced a continent of incompatible local measures with one checkable reference. Mutual-recognition agreements let one jurisdiction accept another’s conformity assessment because the assessment is a thing you can examine, not a favor you have to trust. Even export-control coordination, among parties who are explicit adversaries on the very capabilities being controlled, works by converging on shared, checkable lists. In none of these did the parties first come to share aims; they converged on a common instrument worth more to each than going it alone.

The same holds beneath competitive markets, where the parties are not even pretending to cooperate. TCP/IP is a public substrate every firm builds private products on top of; they compete fiercely above it precisely because they do not compete on it. The shipping container reorganized world trade because a box any port, ship, or rival carrier can handle is worth more to all of them than a proprietary box would be to any one; USB did the same for devices; ISO 20022 lets competing banks settle over a shared, checkable format. In every case the shared standard is the neutral floor beneath competition, not the place competition happens.

This is why an open evidence layer for machines that act can spread the way those did — and the concrete lever is duller and more reliable than persuasion. It is procurement. Governments buy robots and AI systems and write the terms they buy under. A government can require, as a condition of purchase, that a deployed system speak an open, replayable evidence protocol and emit signed conformance a third party can check. That is not a prediction about what any government wants; it works the moment a contract is signed, because a procurement requirement does not need the seller’s goodwill, only the seller’s signature on a purchase order. And procurement requirements are the most contagious standards there are: once one large buyer requires conformance, suppliers build it in by default, the next buyer inherits a market where it is already the norm, and multi-government agreements can make that default explicit across borders the way mutual-recognition agreements already do.

SAE J3016, the autonomy taxonomy with the L0-through-L5 levels everyone now quotes, shows what the absence of such a shared instrument costs. It was published in January 2014; a major self-driving program had been running since 2009. So for five years the most consequential robotics effort of its era ran without a shared vocabulary for what “the car is driving” even meant, and the gap was not harmless: NHTSA’s EA22-002 named a critical safety gap tied in part to that ambiguity, and people died inside it. The lesson is the opposite of the intuitive one. Standards that define things precisely, before the ecosystem fragments, enable faster development, not slower — they stop everyone from inventing private, incompatible meanings for “supervised” and “safe,” and a private meaning of “safe” is how the worst failures arrive wearing the language of caution. The window for robotics is still open; it will not stay open, and the cost of the wrong defaults is not measured in slower releases. It has, before, been measured in lives.

There is one more reason the standard must be neutral, and it is the same reason I refused at the outset to build the case on anyone’s intentions. An evidence layer is itself an instrument of power. If a government can compel an AI company to offer an unrestricted API, then the safety constraints at the runtime layer should not be removable by changing an API key — those are different layers and should fail independently. An open protocol does not stop a determined state actor; architecture is not sovereignty. But it shapes what is possible and, more to the point, what is visible: mandatory audit trails and protocol-level access control raise the cost of quietly removing a constraint, for everyone, because removal stops being silent. The deepest protection a verifiable layer offers is not that it makes the wrong act impossible. It is that it makes the wrong act legible — and a power that must act in the open, under a standard its rivals also hold, is already more answerable than one that can act in the dark. That is exactly why the standard cannot belong to any one of the parties it constrains: a checkable fact is neutral only so long as the instrument of checking is neutral.

Five Things to Codify, and Why Each One

So here is what I am asking governments to do — each ask pointed at a failure that has already happened, not a hypothetical.

The headline is short: a government can have the EU AI Act’s predictability without its bureaucracy. The EU got right that obligations should be defined in advance and scaled to consequence, recorded in a register anyone can read. It got wrong that you need a licensing apparatus and a roster of designated certifiers to make that work. If the facts are signed and replayable, you can have defined obligations and a public register without anyone in a government office deciding who is allowed to ship. The five proposals below are the version of that I drafted, with full reasoning, at rcan.dev/policy.

1. Codify transparent intervention standards. The Fable 5 order arrived as a verbal notice citing a narrow jailbreak: no published trigger, no written basis, no channel to contest it. Replace discretionary directives with published evidence thresholds — the specific, written condition that authorizes an intervention — and a due-process channel for the party on the receiving end. If a deployment is going to be stopped, the order should say what fact stopped it, and the operator should be able to answer that fact on the record. This is the ordinary shape of administrative action, applied at last to a domain that skipped the requirement.

2. Accountability at the deployment layer. A narrow jailbreak is a property of a deployment, not of model weights; pulling Fable 5 and Mythos 5 wholesale punished a capability for the failure of a configuration. Hold specific deployments accountable through verifiable guardrails — structural facts, not policy promises — instead of broad model recalls. RCAN already provisions them: model attribution on every action, HMAC-SHA256 append-only chains so the log cannot be quietly edited, structural confidence and human-authorization gates that run before dispatch, and AI output watermarking. A regulator can scope an intervention to the deployment that actually failed and check the claim, rather than removing a capable model from everyone because of one operator’s configuration.

3. Safe harbor for transparency. The direct fix for the structural problem the recall sits inside. Providers who disclose limitations and maintain verifiable guardrails should get procedural protection: disclosure plus a signed, replayable evidence trail buys a defined process instead of a surprise. A society’s institutions are, in the end, the running total of which behaviors it has chosen to make cheap. A lab should be able to tell the truth about what its system cannot do without that sentence becoming the basis for pulling it.

4. Risk-based tiering for embodied AI. The predictability-without-bureaucracy mechanism, made concrete for machines that act. RCAN’s conformance levels, L1 through L4, give a ladder mapped to consequence: low-consequence agents carry light obligations, a robot that can put a human in a hospital carries heavy ones, and the tier is declared, not negotiated case by case over the phone. An operator can read the tier and know what is required before building — the entire value the EU framework offers, minus the office that hands out permission slips.

5. Recognize open registries. The one that decides whether any of the others stay honest. Do not build a closed government system to hold this; recognize neutral, vendor-independent registry infrastructure instead. The Robot Registry Foundation already maps to the EU AI Act — the first five endpoints cover FRIA, the Safety Benchmark, the IFU, Incidents, and the EU Register — with per-robot and per-model identity, Ed25519-signed registrations, and replayable attestation bundles. Conformance is self-asserted and independently replayable, not third-party certification: an operator asserts conformance, signs it, and anyone — a regulator, an insurer, a small offline model beholden to no one — can replay the signed facts and check whether the assertion holds. It is the kind of standard rivals can converge on, because it asks them to share an instrument of verification, not a set of values.
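
To make proposals 2 and 4 concrete, here is a minimal sketch of a gate that runs before dispatch and reads its obligations from a declared tier. It is an illustration, not RCAN's code: the names `Action`, `gate`, and `TIER_RULES`, the confidence thresholds, and the choice of which tiers require a human are all assumptions made for the example. The only things taken from this essay are the tier labels L1 through L4, model attribution on every action, the confidence and human-authorization gates, and the rule from the closing table that an emergency stop is never gated.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    DISPATCH = "dispatch"
    REFUSE = "refuse"


# Hypothetical obligations per declared tier. The essay names the tiers
# (L1..L4) but not these numbers; they are placeholders for illustration.
TIER_RULES = {
    "L1": {"min_confidence": 0.50, "needs_human": False},
    "L2": {"min_confidence": 0.70, "needs_human": False},
    "L3": {"min_confidence": 0.85, "needs_human": True},
    "L4": {"min_confidence": 0.95, "needs_human": True},
}


@dataclass(frozen=True)
class Action:
    name: str
    model_id: str                  # model attribution carried on every action
    confidence: float              # the acting model's reported confidence
    human_authorized: bool = False
    is_estop: bool = False


def gate(action: Action, tier: str) -> tuple[Decision, str]:
    """Decide before dispatch. Returns the decision and a reason for the log."""
    # Emergency stop bypasses every authorization check.
    if action.is_estop:
        return Decision.DISPATCH, "estop bypasses authorization"

    rules = TIER_RULES[tier]
    if action.confidence < rules["min_confidence"]:
        return Decision.REFUSE, "confidence below tier threshold"
    if rules["needs_human"] and not action.human_authorized:
        return Decision.REFUSE, "human authorization required"
    return Decision.DISPATCH, "all gates passed"
```

The design point is that the tier is an input the operator declares in advance, and the refusal carries a reason that can be written to the audit trail, so a regulator can check which gate fired on which deployment.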

That is the whole ask. Five proposals, none requiring a new licensing regime, a black box, or anyone’s permission to ship.

The Offer, and Why I Will Not Keep It

RCAN and the Robot Registry Foundation can be the neutral, open evidence layer a fair process runs on — no licensing regime, no black box, no vendor lock-in.

And I want to be plain about what I intend to do with it, because the intention is part of the argument. I will donate the RCAN and RRF codebases to a neutral steward — Anthropic, or another positioned to advance transparency and safety — and hand over the governance with them. A public-safety evidence layer for machines should be neutral, legible, signed, and owned by no single vendor. That includes me. Building most of this stack solo proves the thing works; it does not make me the right long-term owner of a standard that other people’s safety depends on. The whole argument of this essay is that a layer meant to make power answerable cannot itself answer to a single party — and I am a single party. To hold onto it would be to contradict, with my own hands, the thing I have spent the essay claiming.

I am aware of the tension in naming Anthropic as a possible steward, and the resolution is the whole point. I have argued that Anthropic is the clear leader in frontier models, a concentrated and dual-use power, and that precisely for that reason no governance can rest on its good faith. Both things are true at once and they do not conflict, because a steward in this design is not asked to be trusted. The steward holds a layer whose entire property is that it can be checked without the steward — replayable from signed facts, verifiable by anyone, governed by a board on which no single constituency holds the vote, so that holding it confers no power over the facts. A steward you would have to trust would be disqualified by its capability; a steward of a layer no one has to trust is not.

This is not a change of heart announced on the way out. The neutrality is already structural. The spec is CC BY 4.0; the SDKs are MIT and Apache-2.0, chosen precisely because they cannot be revoked by a future version of me with different incentives. I keep the open standard deliberately separate from my commercial work: OpenCastor is the open-source, lite version of the stack, and a separate commercial product is my business. The moment the open standard becomes a feeder for the commercial product, the registry has quietly answered to a vendor again. And the precedent is on the table, not hypothetical: robot-md, the ROBOT.md reference implementation, is Apache-2.0 with governance open, and I have already offered to transfer that repository to a neutral home. The donation is the same move, made larger and explicit.

I know how this reads if you have watched standards efforts before: one person, convinced the problem is real, holding a thing nobody else has asked for yet. That is how nearly every durable open standard has begun — the registry ICANN now oversees began as a hand-kept list, correct before it was an institution. The reason it had to become one anyway is the entire point: a list maintained by one party is not yet legitimate, no matter how correct, because a correct record held by a single owner is still held at that owner’s discretion — the precise thing this whole layer exists to abolish. Handing it off is not generosity. It is the last structural requirement the design imposes, the one that falls on the builder.

So the governance home should look like the problem it governs. I have proposed a board of five seats: a major manufacturer, an AI provider — the seat I would expect Anthropic to hold — an academic safety researcher, a standards professional, and a regulator. No single constituency holds the vote. The codebase comes with that structure, not in place of it.

Here is what I am actually asking, by audience.

To regulators: point policy at the five proposals at rcan.dev/policy, and recognize open registries as a legible source of facts. The lever you already hold is procurement: require conformance to an open evidence layer in what you buy, and coordinate that requirement with other governments the way mutual-recognition agreements already do.

To frontier labs: adopt the safe-harbor posture, and help steward the layer. Anthropic is the clearest case of why this matters and the clearest candidate to help hold the result: the most capable party in the field is exactly the one whose accountability must not depend on its own good faith — and a layer no one has to trust is precisely the layer such a party can safely help steward, provided no one of you owns it.

To offline and edge-model builders: register your robots, run the conformance suite, and speak the protocol. You are the natural verifiers in this design: a small, cheap, offline model, beholden to no operator, can re-check any deployment from the signed chain. The honest reason to adopt it is the dull, durable one that drove every shared standard before it — it is worth more to each of you when more of you speak it.

The throughline is the same thing I have cared about since before any of this had a name. I am a deaf engineer, and I have spent a long time building so that the person with the least leverage — the one who cannot hear the room, who was not in the meeting, who does not own the cloud — can still verify what happened from the record. That is why the verifier does not have to be the frontier model that acted. It just needs the facts. The deepest reason this can be adopted at all is that facts ask so little of those who adopt them: not shared values, not mutual trust, not anyone’s good intentions, only a common instrument of verification.

We are at one of those rare moments when the conventions for a new kind of actor are still soft, still being set. Machines that act are entering the world faster than the apparatus that would make them answerable — the same gap the self-driving car stood in for the five years it ran without a word for what it was doing. We made every prior form of power answerable by forcing it to leave a record that someone other than its wielder could read, and we made that record stick across borders and between rivals by agreeing not on aims but on how to check a claim. We are deciding, in this short window, whether the machines we build to act will be the first form of consequential power in the modern world to escape that requirement, or be held to it like everything else. It is not a close question — only a question of whether we choose before the default chooses for us.

That is the offer. The facts are already there to check, the code is shipping, and the full position and the five proposals are at rcan.dev/policy.

What	How you check it
Reference robot	bob — SO-ARM101 6-DOF arm, Raspberry Pi 5 + Hailo-8 (26 TOPS on-device), registered RRN-000000000001, conformance tier L5.
Tamper-evident chain	Append-only audit chain; chain_hash = HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash); the 32-byte secret lives in memory, never written to the log.
Offline verifier	python3 bin/judge_verify_local.py — needs only python3, cryptography, the proof JSON and a public key; exits 0 on 3 valid actuations + 1 correct HTTP 403 refusal; catches a forged proof.
Live multi-arm incident	Mar 15, 16:52 PDT — bob + Alex (RRN-000000000005); STS3215 voltage fault; Alex unscripted switched stack → push/slide; 87% conformance, the 13% gaps itemized in the attestation.
Replayed conformance	In-browser gateway matches published robot-md-gateway 0.5.0a3 / rcan 3.4.1 / robot-md 1.10.4 — 8/8 decisions, 4/4 confidence, 9/9 trust-lifecycle; canonical JSON byte-identical across Python and TypeScript on 12/12 fixtures.
Managed-agent fleet	ClaudeFarms / FieldOps — an Opus-4.8 coordinator fanning out to a drone + two rovers on Haiku-4.5; no actuation without a farmer-approved, host-signed, registry-verifiable RCAN order.
Hybrid signing	pqc-hybrid-v1 = Ed25519 + ML-DSA-65 (FIPS 204); both halves must verify — against harvest-now-decrypt-later on identities that outlive their assumptions.
ESTOP priority	Emergency stop bypasses every authorization check at every layer; no token — owner, creator, or cloud operator — can gate it.
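
The tamper-evident chain row can be sketched in a few lines of standard-library Python. This is a hedged illustration of the formula in the table, not the RCAN implementation: the all-zero genesis value, the log entry layout, the function names, and the JSON canonicalization (sorted keys, compact separators) are assumptions. Because HMAC is symmetric, this check needs the chain secret, so it models the operator-side integrity check; the third-party replay described in this essay rests on the signed registrations and attestation bundles, which this sketch does not cover.

```python
import hashlib
import hmac
import json

GENESIS = bytes(32)  # assumed starting value for prev_chain_hash


def payload_hash(payload: dict) -> bytes:
    # Assumed canonical form: sorted keys, compact separators, UTF-8 bytes.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).digest()


def chain_hash(secret: bytes, prev: bytes, payload: dict) -> bytes:
    # chain_hash = HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash)
    return hmac.new(secret, prev + payload_hash(payload), hashlib.sha256).digest()


def append(log: list, secret: bytes, payload: dict) -> None:
    """Append one entry; the secret is used here but never stored in the log."""
    prev = bytes.fromhex(log[-1]["chain_hash"]) if log else GENESIS
    log.append(
        {"payload": payload, "chain_hash": chain_hash(secret, prev, payload).hex()}
    )


def verify(log: list, secret: bytes) -> int:
    """Return the index of the first entry that fails, or -1 if the chain holds."""
    prev = GENESIS
    for i, entry in enumerate(log):
        expected = chain_hash(secret, prev, entry["payload"])
        stored = bytes.fromhex(entry["chain_hash"])
        if not hmac.compare_digest(expected, stored):
            return i
        prev = expected
    return -1
```

Each entry's hash depends on the one before it, so editing a payload in the middle of the log makes `verify` return that entry's index, and an attacker cannot recompute the hashes downstream without the 32-byte secret.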

Facts a third party can check.