Closing Civqo After 11 Days: The Actor Runtime and Why I'm Returning to ContinuonAI
Update — Ethan Mollick on vibefounding fit
Ethan Mollick noted that in his “vibefounding” MBA class, he required students to anchor AI startups in domains where they have deep experience, world-class skills, and genuine passion:
“I started the ‘vibefounding’ MBA class by giving students my Voight-Kampff Quiz: Which work do you have deep experience in? Which skills are you considered world class in? What do you do outside work that you LOVE & have knowledge about? Their AI startups had to build on these.”
The takeaway: speed and tooling are not enough. Enduring advantage comes from domain depth and personal expertise.
Eleven days ago I started building Civqo—a platform to visualize codebases as cities. 182 commits. Stripe billing. WebSocket infrastructure. Real-time VMs with 300ms wake time.
Today I’m shutting it down.
Vibefounding: The New Reality
Ethan Mollick recently observed that we’re entering an era where spinning up startups takes days, not months. He called it “vibefounding.”
Vibefounding is here. Claude Code shipped two years after function calling, and the barrier to starting a company has never been lower. The flip side nobody talks about: the barrier to *killing* a company has never been lower either.
When spinning up costs you months of runway and a team’s time, you defend your territory. When spinning up costs you 7 evenings and some API bills, you can afford to be ruthless about what deserves your attention.
When you can build a near-production SaaS in 7 evenings with Claude Code, the cost of trying is low. But so is the cost of admitting you’re not the right person for the job.
What 11 Days Actually Looked Like
Here’s the raw commit log from Civqo:
| Day | Commits | What Got Built |
|---|---|---|
| 1 | 23 | Turborepo monorepo, Next.js 14 scaffold, basic routing |
| 2 | 31 | Cloudflare Workers backend, Hono API, D1 database schema |
| 3 | 28 | Authentication with Clerk, user sessions, JWT handling |
| 4 | 19 | City visualization prototype, WebGL renderer, tile system |
| 5 | 22 | Real-time infrastructure, WebSockets, presence system |
| 6 | 17 | Fly.io Machines for code execution, 300ms cold start |
| 7 | 15 | Stripe billing, subscription tiers, usage metering |
| 8 | 12 | “Ralph Loop” autonomous agent, context recovery |
| 9 | 8 | Polish, error handling, edge cases |
| 10 | 4 | Documentation, onboarding flow |
| 11 | 3 | Final fixes… then shutdown decision |
182 commits. 11 days. 7 evenings of actual work.
This wasn’t a prototype. This was:
- Production authentication
- Real-time WebSocket infrastructure
- VM orchestration with sub-second wake times
- Billing that could actually charge customers
- An autonomous AI coding agent
Two years ago, this would have been a 3-month project with a team of 4.
The Ralph Loop: When AI Builds AI Tools
The most surreal part was building the “Ralph Loop”, an autonomous AI coding agent that could do all of the following (sketched in code after the list):
- Receive a task description
- Explore the codebase to gather context
- Make changes across multiple files
- Run tests and fix failures
- Commit and create PRs
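Here’s a rough sketch of the shape of that loop, in Python. Everything in it is illustrative: the helpers (`gather_context`, `propose_edits`, `apply_edits`) are stand-ins for calls into Claude Code and the repo, not Civqo’s actual implementation.

```python
import subprocess
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    max_attempts: int = 3
    notes: list[str] = field(default_factory=list)  # failure notes fed into later attempts

def gather_context(task: Task) -> str:
    """Illustrative stand-in: read/grep the repo for files relevant to the task."""
    return f"relevant files and snippets for: {task.description}"

def propose_edits(task: Task, context: str) -> list[str]:
    """Illustrative stand-in: ask the coding model for a patch given task, context, past failures."""
    return [f"patch derived from context ({context}) and notes ({task.notes})"]

def apply_edits(edits: list[str]) -> None:
    """Illustrative stand-in: write the proposed changes to the working tree."""
    for _edit in edits:
        pass

def run_tests() -> bool:
    """Run the project's test suite; exit code 0 means green."""
    return subprocess.run(["pytest", "-q"]).returncode == 0

def ralph_loop(task: Task) -> bool:
    """One autonomous pass: gather context, edit, test, then commit or retry."""
    for attempt in range(task.max_attempts):
        context = gather_context(task)
        apply_edits(propose_edits(task, context))
        if run_tests():
            subprocess.run(["git", "commit", "-am", f"ralph: {task.description}"])
            return True
        # Record the failure so the next attempt's prompt can include it.
        task.notes.append(f"attempt {attempt + 1}: tests failed")
    return False
```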
I built an AI coding agent… using an AI coding agent.
The recursive absurdity of this didn’t hit me until day 8. Claude Code was writing the code that would let Claude Code work better. The feedback loop was so tight that I couldn’t tell where my intent ended and the AI’s implementation began.
Human intent → Claude Code → Infrastructure code →
→ Ralph agent code → Ralph using Claude Code →
→ More infrastructure code → ...
This is what vibefounding feels like. You’re not writing code anymore. You’re directing a system that writes code, and sometimes that system is writing itself.
The Ralph Loop works too well. When AI can build 80% of your product, the remaining 20%—the taste, the vision, the domain expertise—becomes everything.
Why I Killed It
Three realizations crystallized on day 11:
1. Domain Expertise Still Matters
Claude Code can build anything I can describe. The problem: I couldn’t describe what would make Civqo compelling.
I know backends. I know infrastructure. I know how to wire Stripe and deploy to Cloudflare. But game design? Spatial visualization that feels fun? Level progression that keeps users engaged?
Claude Code amplifies expertise. It doesn’t create it.
2. The Landscape Shifted Under Me
While I was building, others were shipping:
Andrew Milich’s IsoCity appeared, announced as “an open source fully featured city builder with pedestrians, cars, boats, trains, planes, helicopters, emergencies, and much more. Pull requests welcome.”
Near announced @elysian_labs’ first product, Auren: “a paradigm shift in human/AI interaction with a goal to improve the lives of both humans and AI.” It pushed human-AI interaction in directions I hadn’t considered.
Even Ethan Mollick himself was spinning up projects: “Just shipped a new project. Vibefounding in action—from idea to working prototype in a weekend. The tools have changed everything.”
The AI landscape moves so fast that your competitive analysis is outdated before you finish reading it. In 11 days, multiple projects emerged that either overlapped with Civqo or made its core thesis less compelling.
3. Opportunity Cost is Now Measurable in Days
Old calculus: “If I abandon this project, I’ve wasted 6 months.”
New calculus: “If I continue this project, I’m spending days I could use on something where I have actual domain expertise.”
When projects take months, sunk cost fallacy kicks in hard. When projects take days, you can be rational. The 11 days I spent on Civqo taught me enough to know I shouldn’t spend day 12.
So I’m pulling the plug. Eleven days in. No regrets.
The New Startup Calculus
Here’s what vibefounding changes:
Speed of Validation
Before: Build MVP (3 months) → Get feedback → Iterate or pivot
Now: Build production app (1 week) → Get feedback → Iterate or kill
You can now validate with a real product, not a prototype. Civqo wasn’t a landing page with a waitlist. It was functional software with billing. I could have charged customers on day 7.
Portfolio Approach
Before: Pick one idea, commit for 18 months, hope it works
Now: Try 3-4 ideas in the time it used to take to build one, keep what sticks
I’m not abandoning startups. I’m abandoning the model where you pick one bet and ride it. Civqo was attempt #1 of 2026. There will be more.
Death is a Feature
Before: Killing a project meant admitting defeat
Now: Killing a project means clearing the queue for the next one
I killed Civqo on day 11 with zero regret. The code exists. The learnings exist. If someone wants to fork it and add game design expertise, they can. For me, it’s time to move on.
Verticals Under Attack
Look at what’s happening across the AI landscape:
- Code visualization: Multiple projects emerged in the same week
- AI agents: New frameworks daily, each slightly better than the last
- Developer tools: The “best” tool changes monthly
- Infrastructure: What was cutting-edge in December is table stakes in January
No vertical is safe. No moat lasts more than a quarter. The only sustainable advantage is velocity—the ability to move faster than the landscape shifts.
Vibefounding isn’t just about starting fast. It’s about killing fast when the landscape tells you to. It’s about treating projects as experiments, not commitments. It’s about optimizing for learning velocity, not sunk cost.
The 11-Day Rule
Here’s my new heuristic: If you can’t articulate why you’re uniquely positioned to win after 11 days of building, kill it.
Not 11 days of thinking. Not 11 days of market research. 11 days of actually building the thing, feeling the friction, seeing what emerges.
Civqo taught me I’m not a game designer in 11 days. That knowledge would have taken 6 months in the old model. The acceleration isn’t just in building—it’s in learning what not to build.
The Uncomfortable Truth
Here’s what I didn’t want to admit on day 1: I started Civqo because I could, not because I should.
The ability to build fast creates its own momentum. “I can ship a city visualization platform in a week” becomes “I should ship a city visualization platform in a week.” The tool creates the desire.
This is the dark side of vibefounding. The barrier is so low that you start projects you shouldn’t. You build because building is now nearly free, not because the project deserves to exist.
Civqo deserved to exist as an experiment. It didn’t deserve to exist as a company. The difference took me 11 days to see.
What I’m Taking Forward
1. Expertise-First Selection
Next time, I start with: “What do I know deeply that others don’t?” Not: “What can Claude Code build quickly?”
For me, that’s embodied AI. ContinuonAI has months of accumulated context—cognitive architecture research, hardware bring-up scripts, safety protocols. That’s where I have actual edge.
2. Kill Fast, Learn Faster
The goal isn’t to avoid failures. It’s to fail in days instead of months. Civqo was a successful failure: I learned what I needed to learn in minimal time.
3. The Runtime Insight
The most valuable thing I took from Civqo was understanding the Actor Runtime problem—the missing layer between “notebook demo” and “production system.”
That insight is now core to ContinuonAI’s architecture. Brain B runs in the Actor Runtime. The runtime generates training data. The closure of Civqo directly informed the design of something better.
What I’m Returning To: ContinuonAI
While Civqo was a detour, ContinuonAI has been running in the background for months. It’s an embodied AI project: a cognitive architecture on a Raspberry Pi 5 that can learn on-device, run without the cloud, and treat safety as executable code rather than a slide deck.
But there’s been a gap in the architecture. Something I couldn’t articulate until I read Ashpreet Bedi’s post about the AI stack.
AI Engineering Has a Runtime Problem
Claude Code shipped two years after function calling. Models have outpaced the application layer. We have frameworks to build agents, observability to trace them, evals to test them. But nothing to run them. AI engineering has a runtime problem.
Here’s the AI stack as it exists today:
┌───────────────────────────────────────────────────────────────┐
│ AI Application The product we're building │
├───────────────────────────────────────────────────────────────┤
│ Control Plane Admin UI to manage and deploy │
├───────────────────────────────────────────────────────────────┤
│ Runtime ◄── MISSING Execution layer: state, isolation, │
│ streaming, recovery, scale │
├───────────────────────────────────────────────────────────────┤
│ Frameworks Code that orchestrates agents │
├───────────────────────────────────────────────────────────────┤
│ Observability Logging, tracing, debugging │
├───────────────────────────────────────────────────────────────┤
│ Models LLMs that power agents │
├───────────────────────────────────────────────────────────────┤
│ Infrastructure Compute: GPUs, cloud, networking │
└───────────────────────────────────────────────────────────────┘
Every layer has tooling. Except the runtime.
The runtime is what turns a Python script into a working product. It manages state across sessions. It streams responses without breaking. It recovers when containers crash mid-reasoning. It provides request-level isolation so User A’s context never touches User B’s memory.
This is hard. Really hard. And every AI team builds it from scratch.
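To make those responsibilities concrete, here’s a minimal sketch of the state, recovery, and isolation pieces. The names (`SessionStore`, `checkpoint`, `resume`, `stream_with_checkpoints`) are my own illustrative choices, not any existing runtime’s API.

```python
import json
from pathlib import Path
from typing import Any, Iterator

class SessionStore:
    """Per-user, per-session state that survives process restarts."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _path(self, user_id: str, session_id: str) -> Path:
        # Request-level isolation: every (user, session) pair gets its own file.
        return self.root / user_id / f"{session_id}.json"

    def checkpoint(self, user_id: str, session_id: str, state: dict[str, Any]) -> None:
        path = self._path(user_id, session_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(state))

    def resume(self, user_id: str, session_id: str) -> dict[str, Any]:
        # Crash recovery: reload the last checkpoint, or start clean.
        path = self._path(user_id, session_id)
        return json.loads(path.read_text()) if path.exists() else {"messages": []}

def stream_with_checkpoints(store: SessionStore, user_id: str, session_id: str,
                            tokens: Iterator[str], every: int = 20) -> Iterator[str]:
    """Stream a response while checkpointing partial output, so a crash mid-stream
    loses at most `every` tokens instead of the whole turn."""
    state = store.resume(user_id, session_id)
    state.setdefault("partial", [])
    for i, token in enumerate(tokens, start=1):
        state["partial"].append(token)
        yield token
        if i % every == 0:
            store.checkpoint(user_id, session_id, state)
    state.setdefault("messages", []).append("".join(state.pop("partial")))
    store.checkpoint(user_id, session_id, state)
```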
Why the Runtime is the Apex Moment
Traditional web apps are stateless request-response cycles. Scale by adding servers. Solved infrastructure.
Agents break this completely.
An agent isn’t a request-response cycle. It’s a long-running, stateful process. It thinks, calls a tool, waits, reasons, responds. A single session can span minutes, hours, or days—paused while waiting, resumed on input, cancelled if the user walks away.
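One way to make that lifecycle explicit is a state machine the runtime enforces. The states and transitions below are my illustration, not a spec:

```python
from enum import Enum, auto

class SessionState(Enum):
    THINKING = auto()          # model is reasoning or generating
    WAITING_ON_TOOL = auto()   # a tool call is in flight
    WAITING_ON_USER = auto()   # blocked until the user replies
    PAUSED = auto()            # explicitly suspended; state checkpointed
    CANCELLED = auto()         # user walked away or aborted
    DONE = auto()

# Legal transitions for a long-running session; anything else is a runtime bug.
TRANSITIONS = {
    SessionState.THINKING: {SessionState.WAITING_ON_TOOL, SessionState.WAITING_ON_USER,
                            SessionState.DONE, SessionState.CANCELLED},
    SessionState.WAITING_ON_TOOL: {SessionState.THINKING, SessionState.PAUSED,
                                   SessionState.CANCELLED},
    SessionState.WAITING_ON_USER: {SessionState.THINKING, SessionState.PAUSED,
                                   SessionState.CANCELLED},
    SessionState.PAUSED: {SessionState.THINKING, SessionState.CANCELLED},
    SessionState.CANCELLED: set(),
    SessionState.DONE: set(),
}

def transition(current: SessionState, target: SessionState) -> SessionState:
    """Reject illegal transitions instead of silently corrupting session state."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```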
The runtime is where real-world actions happen. It’s where training data gets generated. It’s where the agent’s beliefs meet reality and get corrected.
For embodied AI, this is everything. A robot brain that only exists in a notebook is a demo. A robot brain that runs in a proper runtime—with state persistence, crash recovery, and isolation—is a product that generates data.
Brain B: The Actor Runtime for Embodied AI
ContinuonAI has two brains:
| Brain | Complexity | Purpose |
|---|---|---|
| Brain B | ~500 LOC | Simple, teachable, runs in production |
| Brain A | ~50,000 LOC | Complex neural network, trained offline |
The insight: Brain B runs in the Actor Runtime. Brain A gets trained by the episodes Brain B generates.
┌─────────────────────────────────────────────────────────────────┐
│ ACTOR RUNTIME │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ State Management: Sessions persist across restarts │ │
│ │ Streaming: Durable SSE/WebSocket with backpressure │ │
│ │ Recovery: Checkpoint mid-conversation, resume seamless │ │
│ │ Isolation: User contexts never bleed │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ BRAIN B (~500 LOC) │ │
│ │ • Conversation engine (natural language) │ │
│ │ • Teaching system (learn behaviors from demos) │ │
│ │ • Sandbox gates (permission validation) │ │
│ │ • Event store (append-only action log) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ RLDS Episodes (training data) │
└─────────────────────────────────────────────────────────────────┘
│
│ overnight training
▼
┌─────────────────────────────────────────────────────────────────┐
│ BRAIN A (~50,000 LOC) │
│ • HOPE architecture (wave-particle hybrid dynamics) │
│ • Continuous Memory System (L0-L2 hierarchy) │
│ • LoRA training (bounded parameter updates) │
│ • Validation gates (safety checks before promotion) │
└─────────────────────────────────────────────────────────────────┘
│
│ validated weights
▼
Skills propagate back to Brain B
The training loop (sketched in code after the list):
- Brain B generates episodic memories while running in the Actor Runtime
- Episodes export as RLDS (observation-action pairs)
- Brain A trains overnight on curated episodes
- Validated weights update procedural skills
- Skills propagate to all Brain B instances
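Here’s a sketch of the first three steps. Real RLDS is a TensorFlow Datasets convention; I’m approximating episodes as JSONL to match the `events.jsonl` / `rlds_episodes/` layout described below, and `nightly_training_run` is a placeholder for the actual LoRA job.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class Step:
    observation: dict
    action: dict
    reward: float
    is_terminal: bool

def export_rlds_episode(steps: list[Step], out_dir: Path) -> Path:
    """Write one episode as JSON lines of observation-action pairs (RLDS-style, simplified)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    index = len(list(out_dir.glob("episode_*.jsonl")))
    path = out_dir / f"episode_{index:06d}.jsonl"
    with path.open("w") as f:
        for step in steps:
            f.write(json.dumps(asdict(step)) + "\n")
    return path

def nightly_training_run(episode_dir: Path) -> dict:
    """Placeholder for the overnight job: curate episodes, fine-tune Brain A, report back."""
    episodes = sorted(episode_dir.glob("episode_*.jsonl"))
    curated = [p for p in episodes if p.stat().st_size > 0]  # stand-in for real curation
    # LoRA fine-tuning of Brain A on `curated` would run here; out of scope for this sketch.
    return {"episodes_used": len(curated)}
```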
The runtime isn’t just infrastructure. It’s the apex moment—the point where the simple brain meets reality and generates the data that makes the complex brain smarter.
Memory = Filesystem
Brain B’s memory system maps directly to the filesystem, aligned with the CMS (Continuous Memory System) three-level model:
┌─────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Procedural Memory (L2) │ │
│ │ → agents.md (agent behavior definitions) │ │
│ │ → capabilities.json (tool configurations) │ │
│ │ → guardrails.json (safety constraints) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Semantic Memory (L1) │ │
│ │ → skills/ (learned behaviors) │ │
│ │ → knowledge/ (domain facts) │ │
│ │ → behaviors.json (taught patterns) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Episodic Memory (L0) │ │
│ │ → conversations/ (session transcripts) │ │
│ │ → events.jsonl (append-only log) │ │
│ │ → rlds_episodes/ (training data) │ │
│ │ → checkpoints/ (state snapshots) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
The CMS hierarchy with decay rates:
| Level | Memory Type | Decay | Purpose |
|---|---|---|---|
| L2 | Procedural | 0.999 | How to act (stable skills) |
| L1 | Semantic | 0.99 | What things mean (knowledge) |
| L0 | Episodic | 0.9 | What happened (sessions) |
Memories consolidate upward: episodic patterns become semantic knowledge, and stable semantic knowledge (rarely) gets promoted to procedural skills. Files don’t literally decay, but retention policies simulate it—episodic data archives after 30 days while procedural skills require human approval to modify.
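A sketch of how those rates and rules could be applied, assuming decay acts as a per-day multiplier on a salience score (the table gives the rates; the mechanism and archive thresholds here are illustrative):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MemoryLevel:
    name: str
    decay: float                       # per-day multiplier on a memory's salience score
    archive_after_days: Optional[int]  # None = never auto-archived
    requires_human_approval: bool      # True = a person signs off on changes

# Decay rates from the table above; retention rules mirror the policy described in the text.
CMS_LEVELS = {
    "L2": MemoryLevel("procedural", 0.999, None, True),
    "L1": MemoryLevel("semantic", 0.99, None, False),
    "L0": MemoryLevel("episodic", 0.9, 30, False),
}

def decayed_salience(level: str, initial: float, age_days: int) -> float:
    """Salience after age_days: initial * decay^age_days."""
    return initial * (CMS_LEVELS[level].decay ** age_days)

def should_archive(level: str, age_days: int) -> bool:
    limit = CMS_LEVELS[level].archive_after_days
    return limit is not None and age_days >= limit
```

At these rates an episodic memory loses roughly half its salience in a week (0.9^7 ≈ 0.48), while a procedural skill still holds about 97% after a month (0.999^30 ≈ 0.97), which is the intuition behind the hierarchy.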
Training Through Games: RobotGrid
To generate clean training data, I’m building a simple game called RobotGrid—a multi-level puzzle game that:
- Generates RLDS episodes for Brain A training
- Implements world modeling with surprise metrics
- Provides semantic search over game state history
- Enables next-action prediction training
The game has progressive difficulty tiers:
| Tier | Levels | Mechanics | Brain Capabilities Tested |
|---|---|---|---|
| 1 - Tutorial | 1-3 | Basic movement | Intent classification |
| 2 - Keys | 4-6 | Keys, doors | State tracking, planning |
| 3 - Hazards | 7-10 | Lava, sandbox | Safety gates, risk assessment |
| 4 - Puzzles | 11-15 | Boxes, buttons | Multi-step planning |
| 5 - Challenge | 16-20 | Combined, timed | Full capability integration |
The game integrates with the three-timescale architecture:
| Timescale | Game Component | Brain Integration |
|---|---|---|
| Fast (τ=10ms) | Collision detection, sandbox denial | Safety reflex patterns |
| Mid (τ=100ms) | Action execution, state updates | Working memory context |
| Slow (τ=1s+) | Episode export, world model training | Cloud-based RLDS training |
Every game session becomes training data. Every action gets logged with predicted state, actual state, and surprise metric. The simple, repeatable nature of the game means Brain A can learn from thousands of clean episodes—not messy real-world data.
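Here’s a sketch of what one logged step could look like. The surprise metric below is just Euclidean distance between predicted and actual state vectors, an illustrative choice rather than the actual metric.

```python
import json
import math
import time
from pathlib import Path

def surprise(predicted: dict[str, float], actual: dict[str, float]) -> float:
    """Illustrative surprise metric: Euclidean distance between predicted and actual state."""
    keys = predicted.keys() | actual.keys()
    return math.sqrt(sum((predicted.get(k, 0.0) - actual.get(k, 0.0)) ** 2 for k in keys))

def log_action(events_path: Path, action: str,
               predicted: dict[str, float], actual: dict[str, float]) -> None:
    """Append one RobotGrid step to the append-only event log."""
    record = {
        "ts": time.time(),
        "action": action,
        "predicted_state": predicted,
        "actual_state": actual,
        "surprise": surprise(predicted, actual),
    }
    with events_path.open("a") as f:
        f.write(json.dumps(record) + "\n")

# Example: the agent predicted it would reach (3, 4) but a wall stopped it at (3, 3).
log_action(Path("events.jsonl"), "move_north",
           predicted={"x": 3, "y": 4}, actual={"x": 3, "y": 3})
```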
Why This Architecture
Three principles drove the design:
1. Simple brains generate better training data than complex ones.
Brain B is ~500 lines of code. It’s inspectable, debuggable, predictable. When it makes a mistake, I can trace exactly what happened. When it succeeds, the success is clean data—not lucky inference from a black-box network.
2. The runtime is the bridge between demo and production.
Every AI tutorial ends with a notebook. Production is left as an exercise for the reader. By building the Actor Runtime as a first-class component, Brain B becomes deployable from day one. It can run on a Raspberry Pi in my garage, crash, recover, and keep generating episodes.
3. Validated promotion prevents catastrophic forgetting.
Brain A doesn’t just learn from any episode. It learns from validated episodes that passed Brain B’s sandbox gates. If validation rate drops or Lyapunov energy diverges, the system halts and rolls back. No silent degradation.
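A sketch of that gate, with placeholder thresholds; what exactly counts as “drops” and “diverges” isn’t pinned down here.

```python
from dataclasses import dataclass

@dataclass
class TrainingReport:
    validation_rate: float        # fraction of held-out validated episodes handled correctly
    lyapunov_energy: list[float]  # energy trace over training; should stay bounded

# Placeholder thresholds, for illustration only.
MIN_VALIDATION_RATE = 0.95
MAX_ENERGY_GROWTH = 1.5

def promote_or_rollback(report: TrainingReport, baseline_rate: float) -> str:
    """Gate new Brain A weights before they propagate to Brain B instances."""
    energy = report.lyapunov_energy
    diverging = len(energy) >= 2 and energy[-1] > MAX_ENERGY_GROWTH * energy[0]
    dropped = (report.validation_rate < baseline_rate
               or report.validation_rate < MIN_VALIDATION_RATE)
    if dropped or diverging:
        return "rollback"  # halt: keep the previously validated weights
    return "promote"       # weights propagate to all Brain B instances
```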
Next Steps
- Finish the Actor Runtime in ContinuonXR with full state persistence, crash recovery, and request isolation
- Wire Claude Code hooks so every session generates RLDS episodes automatically
- Run Brain B on the Pi 5 and let it generate training data while I sleep
- Train Brain A overnight and measure if validation rate improves without cloud calls
- Ship RobotGrid levels and start collecting clean game episodes
The hypothesis: if Brain B runs reliably in the Actor Runtime, generating clean episodic data, Brain A will get smarter autonomously. The runtime isn’t just infrastructure—it’s the training loop.
11 days on Civqo. 182 commits. Zero regrets.
Day 12 starts now: making the Actor Runtime the apex moment for ContinuonAI.