Essay · A position on AI governance · June 2026

Facts a third party can check.

After the US pulled Fable 5 and Mythos 5 on a verbal notice, the fix isn’t different values — everyone agrees unsafe deployments should be blockable. What’s missing is a shared, open, verifiable evidence layer for AI that acts. Here’s what it is, proof it’s real on real hardware, and why I’ll donate it.

“Government should be able to block unsafe deployments — but through a statutory process that is transparent, fair, clear, and grounded in technical facts.” The values aren’t missing. The evidence layer is.

By Craig Merry · ~57 min read · Sacramento, California

By the numbers

Everything here is running, signed, and checkable. The load-bearing facts:

10 days: Between EO 14409’s voluntary, no-preclearance framework and the verbal recall of Fable 5 & Mythos 5
87%: Measured conformance on the live bob + Alex run, with the missing 13% itemized by name, not rounded up
8,803: Lifetime commits across 123 repositories, ~two-thirds AI-co-authored, mostly solo with Claude Code
~430M: People with disabling hearing loss served by LiveCaptionsXR, fully on-device (LFM2-1.2B in 0.75 GB)
5: EU AI Act-aligned RRF endpoints live: FRIA, Safety Benchmark, IFU, Incidents, EU Register
~1s: Minimal binary ESTOP frame in the air over LoRa, versus ~25s for a full RCAN message at SF12

Last week a government, acting on a phone call, ordered Anthropic to pull its two most capable models. No published finding. A verbal notice of a “narrow jailbreak.” The first time any state has ordered a frontier lab to take a large language model offline. I do not think the goal was wrong. I think the process was — and that is the fixable part, the part worth building against before the shape of it hardens into precedent.

That was the whole of my first reaction, written the day it happened. I have spent the weeks since trying to argue myself out of it, and I cannot. So here is the long version, because the short version keeps being read as a complaint about a decision, and it is not. It is an argument about what makes a decision answerable at all — and about a window, now closing, in which we still get to choose whether the machines we are building to act in the world will leave a record that anyone but their operators can read.

In June 2026 the United States ordered Anthropic to withdraw Fable 5 and Mythos 5 — at the time its two most capable models — from deployment. The instrument was a verbal notice citing a narrow jailbreak. Not a written finding. A phone call, in effect. As far as I can determine, and I have looked, this is the first time any state has ordered a frontier lab to take a large language model offline; the first export-style ban on access to an LLM. Whatever else it is, it is a precedent. And precedents are the part of any system that compounds — the part that, once set, decides cases nobody had yet imagined when it was set.

I want to be exact about what I am and am not objecting to, because that distinction is the entire essay, and the whole of what follows depends on getting it right at the start.

The Question Underneath the Recall

Blocking an unsafe deployment is a thing a government is supposed to be able to do. I am not going to pretend that is controversial. If a model can be reliably steered into producing something genuinely dangerous, and the people running it cannot or will not close the hole, then there is a public interest in stopping the deployment until it is fixed. That is an ordinary function of the state. It is the same logic as grounding an aircraft type, pulling a drug, recalling a car. Nobody serious argues the state should have no authority here. I certainly do not.

So this is not a free-speech argument, and it is not a leave-the-labs-alone argument. The objection is narrower than either, and I think more durable: it is to the mechanism. And underneath the mechanism is the oldest question there is about power, which is not who holds it but how the rest of us know whether it has been used well.

Take the recall apart and look at what was not in the box.

There was no published trigger. Nobody can point to a written rule and say: this model crossed this line, here is the line. “A narrow jailbreak” is the name of a category, not a criterion. It does not tell the next lab what to avoid, and it does not tell the public what was actually at stake.

There was no written basis. The notice was verbal. There is no document you or I can read to learn what the model did, how the finding was reached, what was tested, or how grave it actually was. A verbal notice cannot be examined. It cannot be checked by anyone who was not on the call. For a deaf engineer this lands with a particular and lifelong weight. I have spent my life insisting that the load-bearing parts of a conversation be written down, because the spoken version is the one that vanishes and the one I cannot independently verify. The spoken word privileges whoever was in the room and whoever can hear it; the written record is the great equalizer, the thing that lets a person who was not present, or could not hear, hold the same facts in their hands as everyone else. A regulatory action that exists only as speech is a regulatory action that exists only for the people in the room. It is, in the most literal sense, inaudible to the rest of us.

And there was no due process — no stated standard to meet, no record of the evidence, no defined path to contest the finding or to demonstrate that the hole was closed. The decision and its explanation were the same event, and that event left no artifact behind.

None of those three absences is about whether the call was right. The model may well have carried a real and serious flaw. I have no way to know, and that is precisely the point. When the basis is unwritten, “was the government right?” becomes an unanswerable question for everyone outside the decision. You are left trusting the operator of the process, which is the one thing we have learned, over centuries and at great cost, not to do anywhere else that matters.

It is worth saying plainly what that long lesson is, because the rest of this essay is in some sense an attempt to apply it to a new kind of actor. Every durable advance in the legitimacy of power has taken the same form. It has consisted of forcing the powerful to make a record — to commit, in advance and in writing, to the rule they will be judged by, and to leave behind, after the fact, an account that someone other than themselves can read. The written law replaced the remembered custom that only the elders could recite. The trial replaced the ordeal whose outcome only the administering authority could interpret. The public ledger replaced the private word of the merchant about what was owed. The published standard replaced the guild secret. In each case the move is identical, and it is not a move toward kindness or toward trust. It is a move away from trust — toward verifiability, which is the thing you reach for precisely because trust has proven insufficient. Legitimacy, in the only sense that survives contact with self-interest, is not the feeling that power is being used well. It is the standing ability of the governed to check.

This is what I mean when I say the recall was the wrong process and not the wrong goal. The goal — stop the dangerous deployment — is one we share. The process reached back past every one of those hard-won advances and decided the case the old way: by the unrecorded word of the authority, examinable by no one outside the room. That is not a small procedural lapse. It is a reversion, in a domain that cannot afford one, to the form of power that the entire architecture of accountable governance was built to replace.

Here is the detail that turned a bad week into something I felt I had to write. The recall came ten days after Executive Order 14409 established a voluntary, no-preclearance framework for frontier models — a deliberate choice not to make labs get permission before they ship. Ten days. The ink was effectively still wet on a policy that said we will not gate you up front, and then a model was gated, by phone, with no written basis at all.

A Voluntary Policy and an Unwritten Power

Set the two dates beside each other, because the contradiction between them is the whole argument.

On one side: Executive Order 14409 laid out a frontier framework that was voluntary and required no preclearance. No agency had to sign off before a model shipped. Labs were trusted to govern their own releases and to be candid about what those releases could and could not do. That was the announced posture of the United States toward frontier AI — a bet, an honorable one, on self-governance and disclosure.

Ten days later, the same government ordered Anthropic to pull Fable 5 and Mythos 5 over a verbal notice of a narrow jailbreak. A hard recall. No published trigger. No written basis. No due process. Ten days.

You cannot operate a voluntary, no-preclearance framework and an unwritten recall power at the same time and call the combination a system. It is not one. It is two postures pointed in opposite directions, and the space between them is where everyone who builds is made to stand.

State the gap without ornament. Voluntary policy plus discretionary enforcement equals nobody can predict what is safe to ship. That unpredictability is not a side effect of the gap; it is the gap. When the rule you will be judged against does not exist in writing until after you have been judged, you are not operating under a policy. You are operating under a mood. And a mood is exactly the condition that the written law, the recorded trial, the published standard were all invented to abolish — because a mood cannot be anticipated, cannot be contested, and cannot be checked. It can only be suffered.

Anthropic’s own framing of what should replace this is, I think, exactly right, and worth stating in plain words because it is the bridge to everything else. A government should be able to block an unsafe deployment — but through a statutory process that is transparent, fair, clear, and grounded in technical facts. Read that list slowly. Transparent. Fair. Clear. Grounded in technical facts. Not one of those four words is about values. Every one of them is about process and evidence.

That is the line I keep returning to, because it dissolves the argument that everyone expects to have. The values are not in dispute. Everyone here — the agency, Anthropic, the downstream user, me — agrees that dangerous deployments should be stoppable. What is missing is not shared values. It is the evidence layer: the common, checkable set of facts that a recall could point to, that a lab could have anticipated, that a third party could later examine to judge whether the call was sound. We do not disagree about what we want. We have no shared ground on which to verify whether we got it. And a society that agrees on its ends but has no common instrument for checking its means has not solved the problem of power. It has only postponed the moment when the disagreement becomes unanswerable.

There is a second-order harm here, worse than the unpredictability, and it took me some time to see it clearly.

Anthropic had publicly disclosed that perfect jailbreak resistance is impossible. That statement is true. It is true of every model from every lab, including the small offline ones I run on my own hardware. Saying it aloud is exactly the kind of candor a voluntary, trust-based framework is supposed to reward — it is the lab keeping its half of the bet the executive order placed. And that candor appears to be what framed the concern that pulled the models. The disclosure did not cause the vulnerability. The disclosure described a property that all such systems share. But it was the visible thing, so it became the cited thing.

Look at the structure that creates, and stay with the structure rather than guessing at anyone’s intentions, because the structure is what governs systems over time regardless of who is operating it. A regime in which describing a true limitation can be cited as grounds for a recall, while saying nothing cannot, is a regime whose published incentives run against disclosure. I am not claiming any party will or will not respond a particular way; I am pointing at the shape of the thing. A structure that makes candor risky and silence safe is a structure that selects, over many actors and many years, for less of what it ought to want most. That is a property of the mechanism, not a prediction about a person.

This is the ethical core of the thing, and it deserves to be named as such rather than left as a policy footnote. Opacity is not a neutral default that transparency improves upon. Opacity is a choice with victims — the people downstream who must trust what they cannot inspect — and an enforcement regime with no published basis does not merely fail to discourage that choice. It structurally rewards it. A regime with no written basis does not produce a single bad outcome and stop. It selects. It teaches. And what it teaches is the precise opposite of the thing that makes power answerable. That cost is not paid by one lab on one Tuesday in June. It is a property of the architecture, paid slowly, by everyone, for as long as the architecture stands.

So here is where I want to leave the diagnosis, because stating it this way turns it into something buildable rather than something to mourn. What makes governance fair is not that the people enforcing it share our values. It is that the facts they act on can be checked by someone who is not them. A recall grounded in a fact a third party can verify is a categorically different act from a recall grounded in a phone call. The first can be argued with. The second can only be obeyed. And the difference between an act you can argue with and an act you can only obey is the difference between a citizen and a subject — between living under a law and living at the discretion of whoever holds the instrument.

Facts that a third party can check are not a courtesy a regulator extends when it feels generous. They can be infrastructure — open, shared, signed, the same for the agency and the lab and the deaf engineer running a model on a single-board computer. That is the thing I think we have to build. It is the thing I have spent the last stretch actually building. And I did not arrive at it from policy. I arrived at it from making robots that act in the physical world and needing to prove, to someone other than myself, what they did and why. The rest of this is an account of what that demand looks like once you take it seriously enough to run it on hardware — and of why I have come to believe it is not one option among several but the form that accountable machine governance is going to have to take, if it is to exist at all.

Why the Case Cannot Rest on Trusting Intentions

I want to be blunt about why the rest of this essay refuses, on principle, to argue from anyone’s good faith.

Anthropic is, by my reading, the clear leader in frontier large language models and in the design of the products built on them. That is not a slight to anyone else in the field; it is a description of where the frontier presently sits. And capability of that kind is dual-use in the deepest sense — the same system that drafts a safety case can draft an attack, the same model that tutors a child can coach a chemist, the same fleet that inspects a field can survey one. A frontier model is a profound instrument for good and a profound instrument for harm, and the two are not separable into different products. They are the same capability pointed in different directions.

Now hold two facts together: that capability is this powerful, and it is this concentrated in a clear leader. Whatever you think of any particular company — and I think well of this one — that combination is exactly the condition under which no governance founded on the good faith of whoever holds the instrument can be called sound. Not because the holder is untrustworthy. Because soundness cannot be a function of the holder at all. A governance arrangement that works only so long as the most capable party stays well-intentioned is not governance; it is a hope with a logo on it, and hopes do not survive a change of ownership, a change of incentive, or a change of administration. The more powerful and the more concentrated the capability, the less tolerable it is to make its accountability depend on the disposition of the party who wields it.

This is why I will not build the case on what any actor wants or will do. I do not need to. The virtue of a verifiable evidence layer is precisely that it makes power answerable whether or not the party holding it is well-intentioned — that is the entire point of reaching for verifiability instead of trust. The argument has to be structural, because the thing it is meant to constrain is too powerful and too concentrated to be left to anything softer. Anthropic is the reason the case must rest on neutral, cooperatively adopted, checkable infrastructure rather than on anyone’s character — including, and especially, the character of whoever turns out to be best at this. Trust the company if you like. Do not build the architecture on it.

What a Guardrail Has to Be Before It Deserves the Name

Most of what gets called a guardrail for an AI system is a filter wrapped around a model. A prompt goes in, the model proposes an output, and something downstream decides whether to let that output through. That arrangement is adequate for a chatbot, where the worst case is usually a bad sentence. It is not adequate for AI that acts: a robot arm, a dispatcher agent, anything that closes a loop on the physical world or on someone else’s account. When the output is a motion command, “block the bad sentence” is the wrong frame entirely. By the time you are inspecting the action, it has already been computed; and the thing you most need in order to hold anyone answerable for it, the record of what happened and why, does not yet exist.

This is the precise hinge on which the ancient question reopens. For most of history the things that acted consequentially in the world were people, and we built, slowly and against resistance, the entire apparatus of accountability around the fact that a person can be made to give an account: to be named, to be asked under what authority and with what confidence they acted, to have their deeds entered in a record that outlives the moment. A machine that acts autonomously is a new kind of actor that slipped into the world without any of that apparatus attached. It can move a physical object, deny a claim, dispatch a vehicle — and unless we build the apparatus in deliberately, it does so leaving behind nothing but a log its own operator can edit. The question is not whether such machines should exist. They already do. The question is whether we will make them answerable the way we eventually made every other form of power answerable, or whether we will let them become the one consequential actor in the modern world that no one outside the room can check. That choice is open now and it will not stay open, because the systems are being deployed now and the conventions that govern them are being set now, by default if not by decision.

So before I describe any machinery, I want to be exact about the bar — because the bar is itself the argument. A guardrail you can actually rely on for acting AI has four properties. It must be structural, attributable, tamper-evident, and independently replayable. Most of what is sold as a guardrail has one or two of these. The whole of this essay rests on insisting on all four, because each closes a hole the others leave open, and because together they are simply the old apparatus of accountability — the rule fixed in advance, the named and confident decider, the unalterable record, the account a stranger can verify — translated into something a machine can be made to carry.

Structural means the safety check is enforced before the action, not noticed after it. The way I build this is the kernel/userland split. The probabilistic AI — the vision-language-action model, the reasoning model, whatever is doing the inference — lives in userland. It does not get to do anything. It only gets to request. “Pick up the cup” is a request. Beneath it sits a deterministic layer that validates that request against hard constraints: physics, the robot’s own envelope, local safety law. The brain requests a trajectory; the kernel grants it only if it complies. The safety layer runs below the AI layer, and that ordering is the entire point. A filter that runs after the model is a layer the model can, in principle, talk its way through — and a system clever enough to be useful is clever enough to be persuasive, which is exactly the wrong property to leave on the wrong side of the guard. A layer that runs below the model is one the model cannot reach. In RCAN this appears at the wire: a command from a principal that lacks the required scope is rejected at the protocol layer — not by the application, not by the model, by the spec. The deterministic part is small and dumb on purpose, because small and dumb is what a human being can actually audit. There is a quiet ethics in that smallness. The part of the system on which everyone’s safety rests is the part that must be legible to the least sophisticated reviewer, not the most.
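A minimal sketch of that ordering, under stated assumptions: `Envelope`, `grant`, and the joint limits below are hypothetical names and numbers chosen for illustration, not anything taken from RCAN or from bob's software. The model may only submit a requested trajectory; a small deterministic function decides whether any of it is allowed to reach an actuator.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical names throughout: Envelope, grant, and the limits used below
# are illustrative assumptions, not part of the RCAN spec.


@dataclass(frozen=True)
class Envelope:
    """Hard per-joint limits (radians) enforced by the deterministic layer."""
    lower: Sequence[float]
    upper: Sequence[float]
    max_step: float  # largest allowed per-joint change between waypoints


def grant(trajectory: Sequence[Sequence[float]], env: Envelope) -> bool:
    """Return True only if every waypoint stays inside the envelope.

    Deliberately small: no learning and no heuristics. The requesting model
    can influence nothing here except the numbers it submits.
    """
    if not trajectory:
        return False
    previous = None
    for waypoint in trajectory:
        if len(waypoint) != len(env.lower):
            return False
        for joint, lo, hi in zip(waypoint, env.lower, env.upper):
            if not (lo <= joint <= hi):
                return False
        if previous is not None:
            if any(abs(a - b) > env.max_step for a, b in zip(waypoint, previous)):
                return False
        previous = waypoint
    return True


env = Envelope(lower=[-1.5, -1.0], upper=[1.5, 1.0], max_step=0.2)
requested_ok = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.15]]
requested_bad = [[0.0, 0.0], [0.1, 1.4]]  # second joint leaves the envelope

print(grant(requested_ok, env))   # True
print(grant(requested_bad, env))  # False
```

The check contains nothing a reviewer cannot read in a minute, which is the property the paragraph above asks of the deterministic layer.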

Attributable means every safety-critical action carries, in the record, who decided it and how sure they were. For a human-programmed weld at a fixed coordinate, that question is dull — the action is deterministic. For an AI inference it is the entire game. When a model decides that “the object at position X is safe to interact with,” that decision rode on a confidence distribution, a specific model version, an inference context. So the RCAN audit record stamps three things onto each AI-driven action: the model identity, the confidence score, and the human-in-the-loop gate status. Without those, “the AI did it” is the end of the inquiry — and an actor at whom inquiry ends is, by definition, an actor beyond accountability. With them, you can ask the next question, which is the question that makes an actor answerable at all: which model, how confident, and was a human supposed to be in the loop.

Tamper-evident. Here is the sharp line, stated plainly: a log file that records what a robot did is not a forensic record. It is a text file. A text file can be edited by anyone who can write to disk, and a sophisticated adversary need not even be crude — given a plain hash-per-line scheme, they can recompute valid hashes after rewriting history. What you need is a keyed chain. RCAN uses an HMAC-SHA256 append-only chain keyed with a session-bound secret that lives in memory and is never written to the log. Even with full access to the JSONL output, you cannot forge a valid record without the key. That is the difference between a log that describes what happened and a record that can establish what happened when the two diverge — which, in a warehouse shared with people or in a medical environment, is exactly the moment the record matters. This is the public ledger’s oldest insight, carried forward: an account that the powerful party can quietly revise is not an account at all. It is a story. The whole value of a record is its resistance to the interests of the one who keeps it.

Independently replayable is the property most systems quietly skip, and it is the one I care about most, because it is the one that decides who holds the power. The party checking the evidence must not have to be the operator, and must not have to be the model that acted. Conformance has to be re-checkable from the signed facts alone. If verifying a deployment requires trusting the operator’s word, or re-running the frontier model that made the call, you have not built verification — you have built a more elaborate way of taking someone’s word for it, dressed in enough machinery to look like proof. And a sophisticated way of taking the powerful at their word is more dangerous than a naive one, because it borrows the credibility of verification while delivering none of its substance. RCAN is not a compliance mechanism in this sense; it is a communication protocol that makes verifiability possible. Standards say what to demonstrate. RCAN provides the plumbing that makes a demonstration something a stranger can re-run.

Those four properties are the bar. Structural, attributable, tamper-evident, independently replayable. Everything that follows is an account of a system built to clear it — and, in equal measure, an argument that the bar itself is what fairness in machine governance actually requires. Not a nice-to-have. The minimum below which the word governance describes a hope rather than a fact.

The Protocol and the Place That Holds Identity

If the claim is that fairness comes from a verifiable evidence layer — facts a third party can check — then I owe you the actual layer, not a sketch. There are two pieces. One is a wire protocol for machines that act. The other is the institution that holds the identities, because a record is only as trustworthy as the names attached to it, and names require somewhere neutral to live. I built both, mostly by pairing with Claude Code, with a SO-ARM101 arm named bob on the bench. bob is RRN-000000000001. Everything below is running, not proposed. That distinction is load-bearing: an argument about how power should be made answerable carries weight in proportion to whether the author has actually built the thing that makes it answerable, on hardware, where the claims can fail.

Hold the four properties in mind — structural, attributable, tamper-evident, independently replayable — because each thing I describe maps back to one of them.

RCAN, the protocol for AI that acts

Start with tamper-evidence, because it is load-bearing for the rest. Every safety-relevant event goes into an append-only audit chain. Each entry’s chain_hash is HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash). The 32-byte chain_secret lives in memory and is never written to the log. That detail is the whole of it: if someone gains full read and write access to the log file on disk, they still cannot forge a consistent entry or quietly delete one, because they do not hold the secret that links the chain. The log proves its own integrity to a verifier who was never on the machine. This is the entire point of a record, restored to a domain that had lost it — an account whose truth does not depend on the goodwill of the party who produced it.
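The chain formula can be sketched with the Python standard library. This is an illustration under assumptions, not the RCAN reference implementation: the all-zero genesis value, the JSON payload encoding, and the entry layout are invented for the example. The sketch also assumes the verifier is handed the chain secret through some channel the text does not describe, since recomputing an HMAC requires the key.

```python
import hashlib
import hmac
import json
import secrets

GENESIS = b"\x00" * 32  # assumed starting value for prev_chain_hash


def chain_hash(chain_secret: bytes, prev: bytes, payload: bytes) -> bytes:
    """HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash)."""
    payload_hash = hashlib.sha256(payload).digest()
    return hmac.new(chain_secret, prev + payload_hash, hashlib.sha256).digest()


def append(log: list, chain_secret: bytes, event: dict) -> None:
    """Append one entry; the secret is used here but never stored in the log."""
    prev = bytes.fromhex(log[-1]["chain_hash"]) if log else GENESIS
    payload = json.dumps(event, sort_keys=True).encode()
    log.append({"event": event,
                "chain_hash": chain_hash(chain_secret, prev, payload).hex()})


def verify(log: list, chain_secret: bytes) -> bool:
    """Recompute every link. An edited, reordered, or removed interior entry
    breaks the chain from that point on."""
    prev = GENESIS
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True).encode()
        expected = chain_hash(chain_secret, prev, payload)
        if not hmac.compare_digest(expected.hex(), entry["chain_hash"]):
            return False
        prev = expected
    return True


secret = secrets.token_bytes(32)  # 32-byte secret, held in memory only
log = []
append(log, secret, {"action": "move", "joint": 2})
append(log, secret, {"action": "estop"})
print(verify(log, secret))        # True

log[0]["event"]["joint"] = 5      # rewrite history in the stored log
print(verify(log, secret))        # False
```

An attacker who rewrites an entry and recomputes hashes with any key other than the real one still fails verification, which is what separates this from a plain hash-per-line scheme.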

Attribution lives in §16, the AI-accountability layer. Every safety-critical action carries the model identity, the model’s confidence, and the human-in-the-loop gate status. A real record reads: confidence 0.91, model Qwen2.5-7B, gate satisfied. You can see which model decided, how sure it was, and whether a person signed off. That is not a dashboard metric. It is in the signed record, which means it is the kind of fact that can be held against the actor later — the kind of fact that makes the actor an actor we can govern.
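As a rough illustration of what such a record could carry, here is a sketch of a data structure holding the three attribution fields. The class name, field names, and the set of gate values are assumptions made for the example; the text specifies what is recorded, not how the fields are spelled on the wire.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical field names and gate values; only the three kinds of fact
# (model identity, confidence, human-in-the-loop gate status) come from the text.
GATE_STATES = {"satisfied", "pending", "not_required"}


@dataclass(frozen=True)
class AIActionRecord:
    action: str
    model_id: str
    confidence: float
    hitl_gate: str

    def __post_init__(self) -> None:
        # Refuse records that could not be meaningfully audited later.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        if self.hitl_gate not in GATE_STATES:
            raise ValueError(f"unknown gate state: {self.hitl_gate}")

    def to_json(self) -> str:
        """Stable serialization, suitable as the payload of a chained entry."""
        return json.dumps(asdict(self), sort_keys=True)


record = AIActionRecord(action="grasp", model_id="Qwen2.5-7B",
                        confidence=0.91, hitl_gate="satisfied")
print(record.to_json())
```

Serializing with sorted keys keeps the byte representation stable, so the same record always hashes to the same value when it is placed in a signed or chained log.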

The structural property is the part people underestimate, so I will be exact. The gates run before dispatch, not after. A command that fails the ConfidenceGate returns CONFIDENCE_GATE_FAIL and MUST NOT be dispatched — the actuator never moves. A command that trips a human-in-the-loop rule emits PENDING_AUTH and waits for an OWNER or CREATOR authorization before anything happens. “Enforced before the action” is a literal description of the control flow, not an aspiration. The rule exists prior to the deed, the way a law must exist before the act it judges; a constraint imposed only after the actuator has moved is not a guardrail, it is a eulogy.
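The control flow described above can be sketched in a few lines. The outcome names `CONFIDENCE_GATE_FAIL` and `PENDING_AUTH` and the OWNER and CREATOR roles come from the text; the function signature, the `DISPATCHED` label, the 0.80 threshold, and the command shape are assumptions for illustration only.

```python
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    DISPATCHED = "DISPATCHED"                      # hypothetical name for the success case
    CONFIDENCE_GATE_FAIL = "CONFIDENCE_GATE_FAIL"  # named in the text
    PENDING_AUTH = "PENDING_AUTH"                  # named in the text


AUTHORIZING_ROLES = {"OWNER", "CREATOR"}


def handle(command: dict,
           threshold: float,
           needs_human: bool,
           authorized_by: Optional[str],
           dispatch: Callable[[dict], None]) -> Outcome:
    """Run both gates first; call dispatch only if neither stops the command."""
    if command["confidence"] < threshold:
        return Outcome.CONFIDENCE_GATE_FAIL        # dispatch is never called
    if needs_human and authorized_by not in AUTHORIZING_ROLES:
        return Outcome.PENDING_AUTH                # wait for OWNER or CREATOR
    dispatch(command)
    return Outcome.DISPATCHED


moved = []  # stands in for the actuator: records every command that reached it
cmd = {"action": "grasp", "confidence": 0.62}

print(handle(cmd, 0.80, False, None, moved.append).value)
print(handle({**cmd, "confidence": 0.91}, 0.80, True, None, moved.append).value)
print(handle({**cmd, "confidence": 0.91}, 0.80, True, "OWNER", moved.append).value)
print(len(moved))  # only the authorized, confident command reached the actuator
```

The point of the shape is that `dispatch` appears exactly once, after both returns, so there is no path on which the actuator moves and the gates are consulted afterwards.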

For the EU AI Act Article 50 disclosure obligation, AI-generated output is watermarked with rcan-wm-v1 HMAC tokens — regex-detectable, so a downstream party can flag machine-authored content without trusting the producer. The recurring phrase is the one that matters: without trusting the producer. That is the whole design philosophy compressed into four words.

Signing is hybrid. pqc-hybrid-v1 is Ed25519 plus ML-DSA-65 (FIPS 204), and both halves must verify or the signature is rejected. ML-DSA-65 is not free: a 1,952-byte public key and a 3,309-byte signature, against Ed25519’s 32-byte public key and 64-byte signature. I carry that cost deliberately. The threat is the signature-side version of harvest-now-decrypt-later: a robot registered in 2026 may still be running in 2032, and an adversary can store today’s classical public keys and signatures and forge new ones once quantum hardware arrives. The post-quantum half is insurance on identities meant to outlive the assumptions under which they were signed. There is something fitting in that: a record built to make power answerable should be built to outlast the moment of its making, because the accounts that matter most are often the ones called for long after the act, by people who were not there.
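The "both halves must verify" rule is an AND over two independent checks, and a sketch makes the shape explicit. The standard library has neither Ed25519 nor ML-DSA-65, so the verifiers below are HMAC-based stand-ins, clearly not the real algorithms; `verify_hybrid`, `make_stub`, and the keys are hypothetical names for illustration.

```python
import hashlib
import hmac
from typing import Callable

Verifier = Callable[[bytes, bytes], bool]  # (message, signature) -> is it valid?


def verify_hybrid(message: bytes, classical_sig: bytes, pq_sig: bytes,
                  verify_classical: Verifier, verify_pq: Verifier) -> bool:
    """Accept only if both halves verify; both are evaluated, neither skipped."""
    classical_ok = verify_classical(message, classical_sig)
    pq_ok = verify_pq(message, pq_sig)
    return classical_ok and pq_ok


def make_stub(key: bytes) -> Verifier:
    """Stand-in verifier built on HMAC. NOT Ed25519 and NOT ML-DSA-65."""
    def verify(message: bytes, sig: bytes) -> bool:
        expected = hmac.new(key, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, sig)
    return verify


msg = b"register RRN-000000000001"
sig_a = hmac.new(b"key-a", msg, hashlib.sha256).digest()
sig_b = hmac.new(b"key-b", msg, hashlib.sha256).digest()
check_a, check_b = make_stub(b"key-a"), make_stub(b"key-b")

print(verify_hybrid(msg, sig_a, sig_b, check_a, check_b))   # True
print(verify_hybrid(msg, sig_a, b"bad", check_a, check_b))  # False
```

In a real deployment the two callables would wrap an Ed25519 verifier and an ML-DSA-65 verifier from a vetted cryptographic library; the composition rule is the only part this sketch claims to show.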

Transports are tiered, and the bottom tier is where I am proudest of the engineering, because it is where the philosophy meets the body. A full RCAN message is 400–800 bytes; over LoRa at SF12 that is roughly 25 seconds in the air. Twenty-five seconds is not an emergency stop. So there is a minimal binary ESTOP-only frame that transmits in about one second. That is a safety system rather than a wish. One honest note: the prose around this says 32 bytes; the implementation asserts 40. I am leaving both numbers visible rather than rounding the discrepancy away, because the integrity of an evidence layer is exactly the willingness to show the seam. A record that hides its own small contradictions has already conceded the principle on which it stands.
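The airtime figures can be checked with back-of-envelope arithmetic. The sketch assumes a nominal 250 bit/s, the rate commonly quoted for SF12 at 125 kHz bandwidth; the text does not state the bandwidth or coding rate, and the estimate ignores preamble and header overhead, so treat the results as order-of-magnitude only.

```python
def airtime_seconds(payload_bytes: int, bitrate_bps: float) -> float:
    """Payload-only airtime: bits to send divided by the link bitrate."""
    return payload_bytes * 8 / bitrate_bps


# Assumption: nominal SF12 / 125 kHz data rate. The real figure depends on
# coding rate, preamble length, and header mode, none of which the text gives.
SF12_BPS = 250.0

frames = [
    ("full RCAN message, low end", 400),
    ("full RCAN message, high end", 800),
    ("ESTOP frame, as described in prose", 32),
    ("ESTOP frame, as asserted in code", 40),
]

for label, size in frames:
    print(f"{label}: {size} bytes, {airtime_seconds(size, SF12_BPS):.2f} s")
```

Under that assumption the 800-byte case lands near 25 seconds and both candidate ESTOP sizes stay close to one second, so the 32-versus-40 discrepancy does not change the safety argument.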

The safety semantics are non-negotiable by design, and I mean that literally. Local safety always wins. ESTOP bypasses every authorization check at every layer. No matter what role a token claims — owner, creator, cloud operator — an emergency stop is never gated. A guardrail you can talk your way past is not a guardrail, and the one act that must never be answerable to authority is the act of stopping. Everything else in this system is built to make power answerable; the stop is built to answer to no one, because that is the one place where deference would be the failure.
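One way to make "never gated" literal is to test for the stop before any role is read at all. The function name, the `MOVE` message type, the `GUEST` role, and the role policy below are hypothetical; only the ESTOP rule itself comes from the text.

```python
def is_permitted(message_type: str, role: str, allowed_roles: frozenset) -> bool:
    """Decide whether a message may proceed.

    ESTOP is checked first and returns before any role is inspected, so no
    token, however privileged or unprivileged, can gate a stop.
    """
    if message_type == "ESTOP":
        return True
    return role in allowed_roles


MOVE_ROLES = frozenset({"OWNER", "CREATOR"})  # assumed policy for a motion command

print(is_permitted("MOVE", "GUEST", MOVE_ROLES))   # False
print(is_permitted("ESTOP", "GUEST", MOVE_ROLES))  # True
print(is_permitted("ESTOP", "", MOVE_ROLES))       # True
```

Putting the stop check ahead of the role lookup means a bug or outage in the authorization path cannot block an emergency stop.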

The Robot Registry Foundation, the institution

A protocol needs somewhere neutral to anchor identity, and that is the Robot Registry Foundation. Every robot receives an RRN and every model an RMN — sequential, twelve-digit, zero-padded, permanent identifiers that survive a hardware swap or an OS reinstall. bob keeps RRN-000000000001 no matter what I do to the Pi underneath it. There is an RCN for components and an RHN for the harness. Each robot resolves through a Robot URI — rcan://registry.rcan.dev/manufacturer/model/version/device-id — which is DNS meets OAuth meets ROS: a name that resolves, an identity you can authenticate, addressed the way a robot fleet actually needs to be addressed.
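The identifier and URI shapes are simple enough to sketch. The RRN format and the URI layout follow the description above; the helper names, the lower bound of 1, and the example path segments (`acme`, `arm`, `v1`, `unit-0001`) are assumptions for illustration, not registry rules.

```python
from urllib.parse import urlsplit


def format_rrn(n: int) -> str:
    """Render a sequential number as a twelve-digit, zero-padded RRN."""
    if not 1 <= n < 10**12:  # lower bound of 1 is an assumption
        raise ValueError("RRN sequence number out of range")
    return f"RRN-{n:012d}"


def parse_robot_uri(uri: str) -> dict:
    """Split rcan://registry/manufacturer/model/version/device-id into parts."""
    parts = urlsplit(uri)
    if parts.scheme != "rcan":
        raise ValueError("not an rcan:// URI")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 4:
        raise ValueError("expected manufacturer/model/version/device-id")
    manufacturer, model, version, device_id = segments
    return {
        "registry": parts.netloc,
        "manufacturer": manufacturer,
        "model": model,
        "version": version,
        "device_id": device_id,
    }


print(format_rrn(1))
# Hypothetical path segments, used only to exercise the parser.
print(parse_robot_uri("rcan://registry.rcan.dev/acme/arm/v1/unit-0001"))
```

Because the identifier is a plain sequence number with no hardware fingerprint in it, nothing in the string has to change when the board underneath the robot is replaced.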

Registrations are Ed25519-signed, and the signatures are re-verified server-side on a live endpoint — not parsed and trusted, re-checked. The private signing key never leaves the device; only the public key goes to the registry. What the registry holds is replayable attestation bundles: enough signed facts for anyone to re-derive the conclusion themselves. The registry does not ask to be believed. It hands you what you need to stop having to believe it.

On top of that sit the first five EU AI Act-aligned endpoints, each mapping to an RCAN section and, for four of the five, to a named Article: FRIA (§22 / Art. 27), Safety Benchmark (§23), Instructions for Use (§24 / Art. 13(3)), Incidents (§25 / Art. 72 post-market), and EU Register submission (§26 / Art. 49). A deployer can generate a signed FRIA from live conformance data instead of writing a PDF nobody can check, which is the difference between an attestation and a brochure.

And here is the line written verbatim into the spec README, the line that makes the whole thing usable as public infrastructure: conformance is self-asserted via signed bundles and independently replayable… Conformance is not certification. I am not standing up a body that blesses robots. I am standing up a place that holds signed facts anyone can re-run. The reason it must be neutral is plain, and it is also in the spec: a registry controlled by one company or one government is worse than no registry at all. This is the deepest point in the whole institutional design, and it generalizes far beyond robots. A captured registry does not merely fail to help. It launders the operator’s word into the appearance of independent proof — and a counterfeit of accountability is more corrosive than its plain absence, because it disarms the suspicion that would otherwise keep the powerful honest. The history of accountable institutions is in large part the history of this exact failure: the audit that answers to the audited, the court that answers to the crown, the standard written by the firms it governs. The neutrality is not an ornament on the design. It is the design. The moment the place that holds the facts answers to one of the parties whose facts it holds, it has become the opposite of what it was built to be.

The Receipts, and Who Can Read Them

Most of what I have described is architecture. Architecture is cheap; anyone can draw a diagram of an accountable system. The test of the argument is whether the thing exists, runs, and can be checked by someone with no stake in my being right. So here are the receipts, beginning with the part that does not run in a browser.

There is a robot named bob. He is a SO-ARM101, a six-degree-of-freedom arm, driven by a Raspberry Pi 5 with a Hailo-8 accelerator doing 26 TOPS on-device. He is registered as RRN-000000000001 — the first registration in the Robot Registry, the reference robot. When I say something works, I mean it works on bob. bob is the production environment. He holds the highest conformance tier the registry defines, L5, and he completed the first registry-attested spatial-eval submission. Not a slide about a robot. The robot.

The conversation neither of us scripted

On March 15 at 16:52 PDT, bob and a second arm — Alex, RRN-000000000005, another Pi 5 — planned a sorting task together. Partway through, Alex hit a real fault. The shoulder_lift motor on Alex is an STS3215 servo, and it threw a voltage fault. Real hardware; roughly fifteen dollars to replace. From that fault, Alex reasoned on its own that it could not “stack” objects the way the plan assumed, and proposed “push/slide” instead. Neither response was scripted. I did not feed Alex a line about the motor. The motor faulted, Alex noticed it could not do the thing, and it adapted.

This is exactly the kind of moment the old apparatus of accountability was built for, and exactly the kind of moment that, without a record, vanishes. bob and Alex ran a live Protocol 66 checklist while they worked: local_safety_wins set true, a ten-second watchdog, an action confidence gate at 0.7, ESTOP active. The measured conformance came out to 87 percent. I want to be precise about the other 13 percent, because that precision is the whole point. There is no dedicated safety MCU. There is no physical ESTOP button. The force and thermal sensors are present but unwired. Those gaps are itemized, by name, in the attestation. I did not round 87 up to “fully compliant.” The missing thirteen is in the record where anyone can read it. A number that names its own shortfall is doing something a glossy claim of full compliance can never do: it is treating the reader as a party entitled to check, rather than an audience to be reassured.
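
The idea of a score that carries its own shortfall can be sketched as a report that always returns the failed checks by name next to the percentage. This is an illustration under stated assumptions: the Check type and summarize function are hypothetical, the checks are weighted equally, and the check names are paraphrased from the paragraph above. The real checklist and its weighting are not given here, so this sketch is not expected to reproduce the 87 percent figure.

```python
from typing import NamedTuple


class Check(NamedTuple):
    name: str
    passed: bool


def summarize(checks: list[Check]) -> dict:
    """Return the pass rate together with every failed check by name."""
    if not checks:
        raise ValueError("no checks supplied")
    gaps = [c.name for c in checks if not c.passed]
    # Equal weighting is an assumption made for this sketch only.
    score = round(100 * (len(checks) - len(gaps)) / len(checks))
    return {"score_percent": score, "gaps": gaps}


if __name__ == "__main__":
    # Names paraphrase the items in the text; the list is not the full checklist.
    report = summarize([
        Check("local_safety_wins", True),
        Check("watchdog_10s", True),
        Check("confidence_gate_0.7", True),
        Check("estop_active", True),
        Check("dedicated_safety_mcu", False),
        Check("physical_estop_button", False),
        Check("force_thermal_sensors_wired", False),
    ])
    print(report["gaps"])
```

The point of the return shape is that a caller cannot obtain the number without also receiving the list of gaps.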

The part you can run yourself

The hardware proves it is real. The demo layer proves it is checkable, which is the harder and more important claim. The ClaudeFarms / FieldOps demo runs Anthropic’s own managed-agent fleet: one Opus-4.8 coordinator — agent_017YoQYZ81CC8VsLrqkeg8Dx, model claude-opus-4-8 — fanning out to a drone and two rovers running on Haiku-4.5. Nothing in that fleet actuates without a farmer-approved, host-signed, registry-verifiable RCAN order. The approval, the signature, and the registry check are not optional steps a careful operator might add. They are the path. There is no other path to actuation. That is the difference between a safety property you hope for and one the system cannot route around — between a promise and a wall. It is also a small demonstration of the larger thesis: the most capable fleet in the room actuates only through facts the least capable party can check, and that ordering is the design, not a courtesy.

The move that matters is the one a reviewer checked live, with their own hands. There is an offline judge: python3 bin/judge_verify_local.py. It needs python3, the cryptography library, the proof JSON, and a public key. That is the entire dependency list. No robot. No gateway. No private key. You hand it the signed envelopes from a real run and it exits 0: three authorized actuations whose Ed25519 signatures verify, and one refusal (HTTP 403, for an unregistered key ID, or kid) correctly turned away. Then you hand it a deliberately forged copy of the same proof, and it catches it: “signature does NOT verify (payload differs from what was signed).” A cheap script, beholden to no operator, re-derives the truth from signed facts alone.
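
The core of such a check is small. The sketch below is not judge_verify_local.py, whose contents are not shown here. It is a minimal illustration of offline Ed25519 verification using the cryptography library, with a hypothetical verify_envelope helper and a made-up payload. The key pair is generated inside the demo only so the example is self-contained; a real verifier would hold the public key alone.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def verify_envelope(public_key: Ed25519PublicKey, payload: bytes, signature: bytes) -> bool:
    """Return True only if the signature matches these exact payload bytes."""
    try:
        public_key.verify(signature, payload)
        return True
    except InvalidSignature:
        return False


if __name__ == "__main__":
    # Demo only: the signer stands in for the device that produced the proof.
    signer = Ed25519PrivateKey.generate()
    payload = json.dumps({"action": "move", "kid": "demo-key"}, sort_keys=True).encode()
    signature = signer.sign(payload)
    public_key = signer.public_key()

    forged = payload.replace(b"move", b"dump")  # tampered copy of the payload
    print(verify_envelope(public_key, payload, signature))  # True
    print(verify_envelope(public_key, forged, signature))   # False
```

Nothing in verify_envelope touches a private key, a network, or a robot, which is the property the paragraph above depends on.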

I want to dwell on that for a moment, because it is the entire thesis compressed into a command line. That script is the least-powerful party in the room re-checking the most powerful one — and winning, not because it is stronger, but because the facts were structured so that strength is not what the check requires. This is what verifiability buys that trust never can: it inverts the natural advantage of power. The operator with the robot, the lab with the frontier model, the state with the recall authority — none of them is the arbiter of what happened. A few hundred lines of Python and a public key are. That is the design holding up under someone else’s hands, which is the only test that counts. A guarantee that only its author can confirm is not a guarantee. It is a request to be trusted, restated.

And conformance is replayed, not asserted. The in-browser gateway agrees with the real published packages — robot-md-gateway 0.5.0a3, rcan 3.4.1, robot-md 1.10.4 — 8 of 8 on decisions, 4 of 4 on confidence, 9 of 9 on trust-lifecycle. The canonical JSON is byte-identical across Python and TypeScript: 12 of 12 fixture cases re-derived byte-for-byte. The live proof panel shows green and red lamps a viewer can re-check, not screenshots of lamps. The demo site returns HTTP 200 right now.
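
Byte-identical canonical JSON matters because a signature covers bytes, not meaning. A rough sketch of the idea follows, assuming a simple canonical form (sorted keys, no insignificant whitespace, UTF-8). The canonicalization rules the RCAN packages actually use are not specified here and may differ, and the function names are hypothetical.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and no extra whitespace, as UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(obj) -> str:
    """SHA-256 over the canonical bytes; equal objects give equal digests."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


if __name__ == "__main__":
    a = {"decision": "allow", "confidence": 0.9}
    b = {"confidence": 0.9, "decision": "allow"}  # same content, different key order
    print(canonical_bytes(a) == canonical_bytes(b))  # True
    print(digest(a) == digest(b))                    # True
```

A second implementation in another language is conformant, in this sense, when it emits the same bytes for the same fixture, so a signature made by one side verifies on the other.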

The honest limits are surfaced in the product, not in a footnote. The actuation in the public demo is simulated and labeled as simulated. The operator key is local and illustrative. There is a file — simulated.json — that is the exact real-versus-simulated manifest. Candor is a feature here, not a confession, because the entire argument has been that opacity is what ad-hoc enforcement quietly trains for, and the only way to break that training is to demonstrate, in public, that telling the reader plainly what is real and what is staged costs the builder nothing he should be afraid to pay.

How it got built

I built most of this solo, by pairing with Claude Code. The lifetime number is 8,803 commits across 123 repositories, roughly two-thirds of them AI-co-authored. One four-day sprint shipped a spec, three runtimes, one firmware, and one docs site, running at 98 percent cache reads. I am not going to dress that up with a lines-of-code count. Ninety-six percent of those lines were machine-generated, and reporting volume as if it were my labor would be a dishonest metric, so I am not using it.

The edge work is not abstract to me, and it is the reason the least-powerful verifier in my design is a small offline model rather than a rhetorical flourish. LiveCaptionsXR, the captioning tool I build for people who navigate the world the way I do, serves the roughly 430 million people with disabling hearing loss, and it runs fully on-device: the language model beneath it, LFM2-1.2B, fits in about 0.75 GB. That is the whole point of the footprint: a model that small runs on hardware a single person owns, beholden to no operator and no cloud, and a model that small is exactly the thing that can sit at the end of the chain and re-check a deployment it had no part in producing. The verifier does not have to be the biggest model in the room. It has to be the one with no stake in the answer.

The refusal to report that line count is, in a small way, the whole essay. The honesty of the receipts is the argument and not a frame around it. An 87 percent that names its own gaps; a forged proof caught in public; a metric rejected for being flattering: that is precisely the behavior a process without published triggers and written basis structurally punishes, by rewarding its opposite. A fair process should reward it instead. A regime that makes candor expensive will get less candor, and a regime that makes it cheap will get more, and over a long enough horizon the difference between those two regimes is the difference between a field you can govern and one you can only fear.

Why Rivals Can Agree on This

The hard question for any shared standard is not whether it is good. It is how a good standard ever gets adopted by parties who do not trust one another and were not built to cooperate. Governments are rivals. Frontier labs are competitors. The downstream user trusts neither. If the case for an evidence layer depended on those parties wanting the same things, or trusting one another’s motives, it would be dead on arrival, and rightly so — because, as I have already argued, no sound architecture can rest on anyone’s good faith.

So it must rest on something else, and there is something else, and it is the thing that durable order among rivals has always rested on. Shared infrastructure does not spread because the parties guess one another’s intentions correctly and decide to cooperate out of goodwill. It spreads because cooperating states and competing firms converge on a common, checkable standard that is worth more to each of them than fragmentation is — and because a fact anyone can verify needs no shared values and no mutual trust to be useful. It needs only a shared instrument of verification. That is the quiet engine under every treaty and every standard that has ever held between adversaries: not agreement on ends, but agreement on how to check a claim. Two parties who agree on nothing else can still agree on what a meter is, what a signature proves, what a test measures. Facts are neutral in exactly the way values are not, and neutrality is what makes them adoptable across a line of rivalry that no shared purpose could cross.

Look at how this has actually happened, because the precedents are not thin. The clearest is the European Union, which is the standing proof that many states, with divergent interests and a long history of conflict, can reach a shared and checkable regime by agreement and mutual recognition rather than by anyone conceding sovereignty to a benevolent center. The single market did not require the members to share aims; it required them to share standards. CE marking lets a product made under one state’s inspection be sold in all of them, because the mark refers to a checkable conformity, not to mutual affection between governments. The standardization bodies — CEN, CENELEC, ETSI — exist precisely to turn contested questions of “is this safe, is this interoperable” into written specifications a party in any member state can hold and verify. GDPR made a checkable obligation portable across borders. The AI Act is the same instinct reaching toward this very domain: defined obligations, declared in advance, attached to risk tiers, recorded in a public register. The EU is not a story about rivals learning to trust each other. It is a story about rivals agreeing on instruments of verification so that they would not have to.

And it is not only Europe, and not only states. The pattern recurs wherever rivals had to coordinate and could not coordinate on values. ICANN holds the internet’s name system through multistakeholder governance precisely because no single government could be trusted with it and fragmentation would have broken the network for everyone. ISO and IEC turn the question “does this conform” into documents that a buyer in one country and a seller in another can both check against, without either trusting the other’s word. ICAO made civil aviation safe enough to cross every border on earth by converging rival aviation authorities on shared, verifiable standards — and the maritime safety regime did the same for the sea. The metric system replaced a continent of incompatible local measures with one checkable reference. Mutual-recognition agreements let one jurisdiction accept another’s conformity assessment because the assessment is a thing you can examine, not a favor you have to trust. Even export-control coordination — among parties who are explicitly adversaries on the very capabilities being controlled — works by converging on shared, checkable lists and definitions. In none of these did the parties first come to share aims. They converged on a common instrument because the instrument was worth more to each of them than going it alone.

The same is true beneath competitive markets, where the parties are not even pretending to cooperate. TCP/IP is a public substrate that every firm builds private products on top of; the firms compete fiercely above it precisely because they do not compete on it. The shipping container is a standard so plain it looks invisible, and it reorganized world trade because a box that any port, any ship, any rival carrier can handle is worth more to all of them than a proprietary box would be to any one of them. USB did the same for devices. ISO 20022 and interbank messaging let competing banks settle with one another over a shared, checkable format — banks, of all institutions, which trust one another exactly as far as the audit allows. In every case the shared standard is not the place competition happens. It is the neutral floor beneath competition, adopted not from fellow-feeling but because a common, verifiable substrate is simply worth more than fragmentation to each competitor weighing it alone.

This is why I think an open evidence layer for machines that act can spread the way those did — and why the concrete lever is duller and more reliable than persuasion. The lever is procurement and contract. Governments buy robots and AI systems; governments write the terms they buy under. A government can require, as a condition of purchase, that a deployed system speak an open, replayable evidence protocol — that it emit signed conformance a third party can check. That is not a prediction about what any government wants. It is a mechanism that works the moment a contract is signed, because a procurement requirement does not need the seller’s goodwill; it needs the seller’s signature on a purchase order. And procurement requirements are the most contagious standards there are: once one large buyer requires conformance, suppliers build it in by default, and the next buyer inherits a market where it is already the norm. Multi-government agreements can make that default explicit across borders the way mutual-recognition agreements already do — a coordination mechanism that asks no party to read another’s intentions, only to sign the same checkable terms.

I keep returning to SAE J3016, the autonomy taxonomy with the L0-through-L5 levels everyone now quotes, because its history shows what the absence of such a shared instrument costs. It was published in January 2014. A major self-driving program had been running since 2009. So for five years the most consequential robotics effort of its era ran without a shared vocabulary for what “the car is driving” even meant, and the semantic gap was not harmless. NHTSA’s EA22-002 named a critical safety gap tied in part to that ambiguity, and people died inside it. The lesson I draw is the opposite of the intuitive one. Standards that define things precisely, before the ecosystem fragments, enable faster development, not slower — they stop everyone from inventing private, incompatible meanings for “supervised” and “safe,” and a private meaning of “safe” is how the worst failures arrive wearing the language of caution. The window for robotics is still open. It will not stay open. The conventions that will govern machines that act are being set now, in this narrow stretch before the field hardens around whatever defaults it happens to adopt, and the cost of adopting the wrong ones is not measured in slower releases. It has, before, been measured in lives.

There is one more reason the standard must be neutral, and it is the same reason I refused at the outset to build the case on anyone’s intentions. An evidence layer is itself an instrument of power. If a government can compel an AI company to offer an unrestricted API, then the safety constraints at the runtime layer should not be removable by changing an API key — those are different layers and they should fail independently. I will be honest about the limit: an open protocol does not stop a determined state actor. Architecture is not sovereignty. But architecture shapes what is possible, and more to the point, what is visible. Mandatory audit trails and protocol-level access control raise the cost of quietly removing a constraint, for everyone, because removal stops being silent. The deepest protection a shared, verifiable layer offers is not that it makes the wrong act impossible. It is that it makes the wrong act legible — and a power that must act in the open, under a standard its rivals also hold, is already a different and more answerable power than one that can act in the dark. That is exactly why the standard cannot belong to any one of the parties it constrains. A checkable fact is neutral only so long as the instrument of checking is neutral. The moment the instrument answers to one rival, it stops being a fact rivals can agree on and becomes one rival’s word again.

So the case for adoption is not a case about who will want this. It is a case about what rivals have always been able to agree on even when they could agree on nothing else: a shared instrument for checking a claim. That is the most durable form of cooperation there is, precisely because it asks for the least — no shared values, no mutual trust, only a common way to verify. Treaties hold between competitors on exactly this basis. Standards hold between firms on exactly this basis. An evidence layer for machines that act can hold on the same basis, and on no weaker one.

Five Things to Codify, and Why Each One

Everything to here has been diagnosis and design. The recall was discretionary. The structure it sits inside rewards opacity. The fix is an evidence layer a third party can check, adopted the way checkable standards have always been adopted among rivals. So here is where I stop describing the problem and say what I am actually asking governments to do — and why each ask is a response to a failure that has already happened, not a hypothetical to guard against.

The headline is short: a government can have the EU AI Act’s predictability without its bureaucracy. The EU got one thing right and one thing wrong. It got right that obligations should be defined in advance and scaled to consequence, recorded in a register anyone can read. It got wrong that you need a licensing apparatus and a roster of designated certifiers to make that work. You do not. If the facts are signed and replayable, you can have defined obligations and a public register without anyone in a government office deciding who is allowed to ship. The five proposals below are the version of that I drafted, with the full reasoning, at rcan.dev/policy.

1. Codify transparent intervention standards. The Fable 5 order arrived as a verbal notice citing a narrow jailbreak. No published trigger, no written basis, no channel to contest it. That is the thing to fix first, because it is the thing that makes everything downstream unpredictable. Replace discretionary directives with published evidence thresholds — the specific, written condition that authorizes an intervention — and a due-process channel for the party on the receiving end. If a deployment is going to be stopped, the order should say what fact stopped it, and the operator should be able to answer that fact on the record. This maps directly to what was missing in June: trigger, basis, process. None of it is exotic. It is the ordinary shape of administrative action — the shape every other exercise of state power was long ago required to take — applied at last to a domain that skipped the requirement.

2. Accountability at the deployment layer. A narrow jailbreak is a property of a deployment, not of model weights. Pulling Fable 5 and Mythos 5 wholesale was a recall aimed at the wrong altitude — punishing a capability for the failure of a configuration. The proposal is to hold specific deployments accountable through verifiable guardrails instead of broad model recalls, and the guardrails I mean are not policy promises but structural facts. RCAN already provisions them: model attribution on every action, so you know which model decided and at what confidence; HMAC-SHA256 append-only chains, so the log of what happened cannot be quietly edited after the fact; structural confidence and human-authorization gates that run before dispatch, not as an after-action review; and AI output watermarking. The point is that a regulator can scope an intervention to the deployment that actually failed, and check the claim, rather than removing a capable model from everyone because of one operator’s configuration.
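
To show why an HMAC-SHA256 chain resists quiet edits, here is a minimal sketch of an append-only log in which each entry's MAC covers the previous entry's MAC. The record fields, the GENESIS constant, and the function names are assumptions made for illustration. RCAN's actual log format is not reproduced here.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _mac(key: bytes, prev: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # Binding the previous MAC into this one is what links the chain.
    return hmac.new(key, prev.encode("ascii") + body, hashlib.sha256).hexdigest()


def append(chain: list, key: bytes, record: dict) -> None:
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "mac": _mac(key, prev, record)})


def verify(chain: list, key: bytes) -> bool:
    prev = GENESIS
    for entry in chain:
        expected = _mac(key, prev, entry["record"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True


if __name__ == "__main__":
    key = b"demo-key-not-a-real-secret"
    log: list = []
    # Model attribution and confidence travel with every action record.
    append(log, key, {"model": "model-a", "confidence": 0.82, "action": "push"})
    append(log, key, {"model": "model-a", "confidence": 0.91, "action": "slide"})
    print(verify(log, key))                 # True
    log[0]["record"]["action"] = "stack"    # edit history after the fact
    print(verify(log, key))                 # False
```

One limit worth stating: an HMAC can only be checked by a party holding the key, so a chain like this shows tampering to key holders, while public-key signatures are what let a stranger check without any secret.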

3. Safe harbor for transparency. This one is the direct fix for the structural problem the recall sits inside. Under a regime where documenting a model’s limitations hands a regulator the exact language to justify a ban, while saying nothing looks cleaner on paper, the published incentive runs toward silence — and silence is the opposite of what anyone wanted. So: providers who disclose limitations and maintain verifiable guardrails get procedural protection. Disclosure plus a signed, replayable evidence trail buys you a defined process instead of a surprise. You make honesty the cheaper option — and a society’s institutions are, in the end, just the running total of which behaviors it has chosen to make cheap. A lab should be able to tell the truth about what its system cannot do without that sentence becoming the basis for pulling it.

4. Risk-based tiering for embodied AI. This is the predictability-without-bureaucracy mechanism, made concrete for machines that act. Obligations should be graduated and mapped to consequence: an L1 deployment and an L4 deployment do not carry the same duties, and they should not. RCAN’s conformance levels, L1 through L4, give you that ladder: low-consequence agents carry light obligations, a robot that can put a human in a hospital carries heavy ones, and the tier is declared, not negotiated case by case over a phone call. An operator can read the tier and know exactly what is required before building, which is the entire value the EU framework offers, minus the office that hands out permission slips.

5. Recognize open registries. The last one is institutional, and it is the one that decides whether any of the others can stay honest. Do not build a closed government system to hold this; recognize neutral, vendor-independent registry infrastructure instead. The Robot Registry Foundation already maps to the EU AI Act — the first five endpoints cover FRIA, the Safety Benchmark, the IFU, Incidents, and the EU Register — with per-robot and per-model identity, Ed25519-signed registrations, and replayable attestation bundles. The line that matters: conformance is self-asserted and independently replayable, not third-party certification. Nobody waits for a certifier. An operator asserts conformance, signs it, and anyone — a regulator, an insurer, a small offline model beholden to no one — can replay the signed facts and check whether the assertion holds. That is the difference between a register you trust because of who runs it and one you trust because you can verify it yourself. The first kind concentrates power in the operator of the register. The second kind dissolves it into the facts. It is also the kind of standard rivals can converge on, for the reason the last section gave: it asks them to share an instrument of verification, not a set of values.

That is the whole ask. Five proposals, each pointed at a failure that already happened, none of them requiring a new licensing regime or a black box or anyone’s permission to ship. Just facts a third party can check.

The Offer, and Why I Will Not Keep It

To regulators, frontier labs, and robotics builders: RCAN and the Robot Registry Foundation can be the neutral, open evidence layer a fair process runs on. No licensing regime, no black box, no vendor lock-in. Just facts a third party can check.

And I want to be plain about what I intend to do with it, because the intention is part of the argument and not an afterthought to it.

I will donate the RCAN and RRF codebases to a neutral steward — to Anthropic, or to another neutral steward positioned to advance transparency and safety — and hand over the governance with them. A public-safety evidence layer for machines should be neutral, legible, signed, and accountable, and it should be owned by no single vendor. That includes me. I built most of this stack solo, by pairing with Claude Code, with real hardware in the loop: a SO-ARM101 arm named bob, registered as RRN-000000000001. That is enough to prove the thing works. It is not enough to make me the right long-term owner of a standard that other people’s safety depends on. The whole argument of this essay is that a layer meant to make power answerable cannot itself answer to a single party — and I am a single party. To hold onto it would be to contradict, with my own hands, the thing I have spent the essay claiming.

I am aware of the tension in naming Anthropic as a possible steward, and I want to meet it head-on rather than paper over it, because the resolution is the whole point. I have argued that Anthropic is the clear leader in frontier models, a concentrated and dual-use power, and that precisely for that reason no governance can rest on its good faith. Both things are true at once, and they do not conflict, because a steward in this design is not asked to be trusted. The steward holds a layer whose entire property is that it can be checked without the steward — replayable from signed facts, verifiable by the least-powerful party in the room, governed by a board on which no single constituency holds the vote. That is exactly why it is safe to name a powerful party as a candidate to hold it: the design is built so that holding it confers no power over the facts. A steward you would have to trust would be disqualified by its capability. A steward of a layer no one has to trust is not. The neutrality has to be in the structure, not in the steward — which is the same thing I have said about every other piece of this.

This is not a whim and not a change of heart announced on the way out. The neutrality is already structural, and structural is the only kind that survives the temptation to keep what you have built. The spec is CC BY 4.0. The SDKs are MIT and Apache-2.0 — licenses chosen precisely because they cannot be revoked by a future version of me with different incentives. I keep the open standard deliberately separate from my commercial work: OpenCastor is the open-source, lite version of the stack, and a separate commercial product is my business. The two are not the same thing and I do not let them blur, because the moment the open standard becomes a feeder for the commercial product, the registry has quietly answered to a vendor again, and we are back to the captured registry that is worse than no registry at all. And the precedent is already on the table, not hypothetical: robot-md, the ROBOT.md reference implementation, is Apache-2.0 with governance open, and I have already offered to transfer that repository to a neutral home. The donation I am describing here is the same move, made larger and made explicit.

I know how this reads if you have watched standards efforts before. One person, convinced the problem is real, holding a thing nobody else has asked for yet. That is usually how it goes, and it is worth saying that this is how many durable open standards have begun, with one person or one small group convinced the problem was real before anyone else was, maintaining a list that was correct long before it was legitimate. The registry of internet names and numbers that ICANN now stewards began as a hand-kept list. The list was correct before it was an institution. The reason it had to become an institution anyway is the entire point I have been making: a list maintained by one party is not yet legitimate, no matter how correct it is, because legitimacy is not correctness. Legitimacy is the standing ability of everyone affected to check, and to remove, the party who keeps the record. A correct record held by a single owner is still held at that owner’s discretion, and discretion is the precise thing this whole layer exists to abolish. Handing it off is not generosity. It is the last structural requirement the design imposes, the one that falls on the builder.

So the governance home should look like the problem it governs. I have proposed a board of five seats: a major manufacturer, an AI provider — the seat I would expect Anthropic to hold — an academic safety researcher, a standards professional, and a regulator. No single constituency holds the vote. The codebase comes with that structure, not in place of it — because a registry whose governance answers to one constituency is a registry whose facts answer to one constituency, and that is the failure mode the whole edifice is built to prevent. The board is itself a small instance of the larger argument: rivals who agree on nothing else can still hold a common instrument of verification, provided no one of them holds the others’ hands behind their backs.

Here is what I am actually asking, by audience.

To regulators: point policy at the five proposals at rcan.dev/policy, and recognize open registries as a legible source of facts. You do not need a licensing bureaucracy to get the EU AI Act’s predictability — the risk tiers, the defined obligations, the public register. You need a place where the evidence lives and a way to check it that does not require trusting the operator’s word. The lever you already hold is procurement: require conformance to an open evidence layer in what you buy, and coordinate that requirement with other governments the way mutual-recognition agreements already coordinate conformity assessment. That is a standard rivals can converge on, because it asks them to share an instrument, not a purpose.

To frontier labs: adopt the safe-harbor posture, and help steward the layer. A regime in which candor about a model’s limits invites a ban is a regime whose structure runs against the very disclosure a trust-based framework depends on. If the process runs on signed, replayable facts, then saying clearly what your system cannot do becomes a defense rather than a confession. Anthropic is the clearest case of why this matters and the clearest candidate to help hold the result: the most capable party in the field is exactly the party whose accountability must not depend on its own good faith. A layer no one has to trust is precisely the layer such a party can safely help steward, provided no one of you owns it. That is the entire reason it must go to a steward whose holding of it confers no power over the facts.

To offline and edge-model builders: register your robots, run the conformance suite, and speak the protocol. Conformance is self-asserted and independently replayable — not a certificate I sell you. You are the natural verifiers in this design: a small, cheap, offline model, beholden to no operator, can re-check any deployment from the signed chain. The honest reason to adopt it is the dull, durable one, the same one that drove every shared standard before it: it is worth more to each of you when more of you speak it. A signed audit chain that only your fleet produces proves less than one a stranger can read. The value of a shared record is precisely that it is shared; a private dialect of accountability accounts to no one.

The throughline under all of it is the same thing I have cared about since before any of this had a name. The layer that makes governance fair has to be checkable by the least-powerful party in the room. I am a deaf engineer, and I have spent a long time building so that the person with the least leverage — the one who cannot hear the room, who was not in the meeting, who does not own the cloud — can still verify what happened from the record. That is why the verifier does not have to be the frontier model that acted. A small, cheap, offline model can re-check any deployment from the signed chain. It just needs the facts. And the deepest reason this can be adopted at all is that facts ask so little of those who adopt them: not shared values, not mutual trust, not anyone’s good intentions — only a common instrument of verification, which is the one thing rivals have always been able to agree on.

We are at one of those rare moments when the conventions for a new kind of actor are still soft, still being set, and a person who builds the right thing at the right time can change the default that hardens. Machines that act are entering the world faster than the apparatus that would make them answerable, and the gap between the two is the place we are all standing now — the same place the self-driving car stood for the five years it ran without a word for what it was doing. We made every prior form of power answerable by forcing it to leave a record that someone other than its wielder could read, and we made that record stick across borders and between rivals by agreeing not on aims but on how to check a claim. We are deciding, right now, in this short window, whether the machines we build to act will be the first form of consequential power in the modern world to escape that requirement, or whether they will be held to it like everything else. I do not think it is a close question. I think it is only a question of whether we choose before the default chooses for us.

That is the offer. The facts are already there to check, the code is already shipping, and the full position and the five proposals are at rcan.dev/policy.

Receipts

What a third party can check

Every row is independently verifiable from signed facts — not a claim you have to take on my word. The whole argument of the essay is that this column should exist for any AI that acts.

What: How you check it
Reference robot: bob, an SO-ARM101 6-DOF arm, Raspberry Pi 5 + Hailo-8 (26 TOPS on-device), registered RRN-000000000001, conformance tier L5.
Tamper-evident chain: Append-only audit chain; chain_hash = HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash); the 32-byte secret lives in memory, never written to the log.
Offline verifier: python3 bin/judge_verify_local.py; needs only python3, cryptography, the proof JSON and a public key; exits 0 on 3 valid actuations + 1 correct HTTP 403 refusal; catches a forged proof.
Live multi-arm incident: Mar 15, 16:52 PDT; bob + Alex (RRN-000000000005); STS3215 voltage fault; Alex unscripted switched stack → push/slide; 87% conformance, the 13% gaps itemized in the attestation.
Replayed conformance: In-browser gateway matches published robot-md-gateway 0.5.0a3 / rcan 3.4.1 / robot-md 1.10.4; 8/8 decisions, 4/4 confidence, 9/9 trust-lifecycle; canonical JSON byte-identical across Python and TypeScript on 12/12 fixtures.
Managed-agent fleet: ClaudeFarms / FieldOps, an Opus-4.8 coordinator fanning out to a drone + two rovers on Haiku-4.5; no actuation without a farmer-approved, host-signed, registry-verifiable RCAN order.
Hybrid signing: pqc-hybrid-v1 = Ed25519 + ML-DSA-65 (FIPS 204); both halves must verify; against harvest-now-decrypt-later on identities that outlive their assumptions.
ESTOP priority: Emergency stop bypasses every authorization check at every layer; no token (owner, creator, or cloud operator) can gate it.
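
To make the tamper-evident chain row concrete, here is a minimal Python sketch of the stated formula, chain_hash = HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash). It is an illustration under stated assumptions, not the RCAN implementation: the all-zero starting value, the sorted-key JSON encoding used for payload_hash, the entry layout, and every function name here are hypothetical.

```python
import hashlib
import hmac
import json
import os

# Assumption: the first entry chains from 32 zero bytes; the source does not say.
GENESIS = bytes(32)


def payload_hash(payload: dict) -> bytes:
    """SHA-256 over sorted-key, compact JSON (an assumed encoding)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).digest()


def chain_hash(chain_secret: bytes, prev_chain_hash: bytes, p_hash: bytes) -> bytes:
    """chain_hash = HMAC-SHA256(chain_secret, prev_chain_hash || payload_hash)."""
    return hmac.new(chain_secret, prev_chain_hash + p_hash, hashlib.sha256).digest()


def append_entry(chain: list, chain_secret: bytes, payload: dict) -> None:
    """Append a payload, linking it to the previous entry's chain hash."""
    prev = chain[-1]["chain_hash"] if chain else GENESIS
    link = chain_hash(chain_secret, prev, payload_hash(payload))
    chain.append({"payload": payload, "chain_hash": link})


def verify_chain(chain: list, chain_secret: bytes) -> bool:
    """Recompute every link in order; an edited payload breaks the comparison."""
    prev = GENESIS
    for entry in chain:
        expected = chain_hash(chain_secret, prev, payload_hash(entry["payload"]))
        if not hmac.compare_digest(expected, entry["chain_hash"]):
            return False
        prev = entry["chain_hash"]
    return True


if __name__ == "__main__":
    secret = os.urandom(32)  # 32-byte secret, held in memory only
    log = []
    append_entry(log, secret, {"action": "move", "joint": 1})
    append_entry(log, secret, {"action": "estop"})
    print(verify_chain(log, secret))  # True
    log[0]["payload"]["joint"] = 2    # tamper with an earlier entry
    print(verify_chain(log, secret))  # False
```

Recomputing an HMAC requires the secret, so this sketch models only the chain holder's own integrity check. The offline verifier in the table works differently: it needs a public key and the proof JSON, which is what lets a third party check the record without holding any secret.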

The offer

A public-safety evidence layer should be owned by no single vendor — including me.

I’ll donate the RCAN and RRF codebases to Anthropic, or to another neutral steward positioned to advance transparency and safety. No licensing regime, no black box, no vendor lock-in — just facts a third party can check. It’s already shipping.

Read the position, or check the work