Romulus: Building an AI system that remembers.
From a persistent personal assistant to a five-cohort command structure — a case study in what one engineer can do when context never resets.
Who Is Romulus
He came online February 3, 2026. Named after the legend of Rome's founding — not as a gimmick, but to clarify what I was trying to build: something with a name, an identity, and a purpose, rather than another stateless chat window.
The problem was straightforward. I work as a senior PM by day and build my own products by night. The gap between "interesting idea" and "shipped prototype" was filled with re-explained context, lost threads, and morning sessions where I had to brief myself on whatever I'd figured out the week before.
Romulus started as an experiment in persistent AI memory: a system that reads from a shared knowledge base (an Obsidian vault) on every session and writes back after. First use case: a morning brief delivered to a private Discord channel at 8 AM — weather, subway timings, product signals from the market. Nothing glamorous. Just a baseline to verify the system was actually running.
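The read-before/write-after loop is the core of the design. A minimal sketch, assuming a flat vault of Markdown files and a daily-log naming convention; the real system's retrieval and file layout are richer:

```python
from datetime import date
from pathlib import Path

def load_context(vault: Path) -> str:
    """Read every note before the session starts, so context never resets."""
    notes = sorted(vault.glob("*.md"))
    return "\n\n".join(p.read_text() for p in notes)

def write_back(vault: Path, summary: str) -> Path:
    """Append the session's outcome to today's daily log for the next session."""
    log = vault / f"{date.today().isoformat()}.md"
    with log.open("a") as f:
        f.write(summary + "\n")
    return log
```

Everything else in the system, including the morning brief, sits on top of this loop: load context, do the work, write the outcome back.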
The Throughput Wall
A decade of shipping between day jobs taught me the same thing at every startup: ideas aren't the bottleneck. It's throughput — the time between seeing a problem clearly and having a working prototype.
The current generation of AI tools has a specific weakness: they're stateless. Every new conversation starts from zero. Context that took a session to build evaporates the moment it's over. For solopreneurs working in stolen hours, this means constant overhead just to stay aligned with their own thinking.
Design Decisions
Memory
83 files. 12 core memory documents. 29 project notes. 31 daily logs. 11 key decisions.
Chat history is ephemeral. Obsidian files are not. The system retrieves relevant past work before every session — comparing context, flagging contradictions, loading project state. After 69 days, the knowledge graph compounds. What was an empty vault is now a living map of 400+ tasks, shipped products, failures, and lessons.
The system stores context in two places, by design. Obsidian holds the human-readable layer: project notes, daily logs, decisions — things I can open, edit, and reason about at any time. SQLite holds the machine-readable layer: 36 entities, 33 relationships, task histories, build reports — structured data that Caesar's workers load as context before they start. The Obsidian vault provides qualitative memory (what happened, decisions made, lessons learned). The SQLite graph provides quantitative memory (relationships, dependencies, outcomes). Together they form a knowledge base that's both human-inspectable and machine-actionable.
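The machine-readable side divides cleanly into two tables. A minimal sketch, assuming a simple entities/relationships schema (hypothetical column names; the production tables carry more metadata):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    kind TEXT NOT NULL               -- project, cohort, decision, ...
);
CREATE TABLE relationships (
    src      INTEGER REFERENCES entities(id),
    dst      INTEGER REFERENCES entities(id),
    relation TEXT NOT NULL           -- depends_on, built_by, ...
);
""")

def neighbors(conn, name: str) -> list[tuple[str, str]]:
    """Everything directly related to an entity: a worker's pre-load context."""
    return conn.execute("""
        SELECT r.relation, e2.name
        FROM entities e1
        JOIN relationships r ON r.src = e1.id
        JOIN entities e2     ON e2.id  = r.dst
        WHERE e1.name = ?
    """, (name,)).fetchall()
```

A worker that is about to touch a project runs one `neighbors` query and gets the dependency picture before it writes a line of code.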
What Broke — Day 11
BUG: The morning brief shipped with stop ID R16. The feed returned plausible-looking data: a 12-minute travel time from Prospect Ave to Canal St. It looked right.
R16 is Times Square. We were tracking the wrong station entirely for 11 days.
Fix: Downloaded the full static GTFS dataset, cross-referenced stops.txt, found R34N (Prospect Ave northbound) and R23N (Canal St northbound). Patched the script. Correct travel time: 26 minutes.
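The cross-reference step is trivial once the static dataset is local: search stops.txt by station name instead of trusting a remembered ID. A sketch, using an illustrative three-row sample (the real MTA file has thousands of rows):

```python
import csv
import io

def find_stops(stops_txt: str, name_fragment: str) -> list[tuple[str, str]]:
    """Search GTFS stops.txt by station name to verify a stop ID."""
    frag = name_fragment.lower()
    return [(row["stop_id"], row["stop_name"])
            for row in csv.DictReader(io.StringIO(stops_txt))
            if frag in row["stop_name"].lower()]

# Illustrative sample, not the full dataset:
SAMPLE = """stop_id,stop_name
R16,Times Sq-42 St
R34N,Prospect Av
R23N,Canal St
"""
```

`find_stops(SAMPLE, "prospect")` returns `[('R34N', 'Prospect Av')]`. Verifying by name rather than by ID would have caught the Times Square mix-up on day one instead of day eleven.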
The Extension: Legion
Running one AI with persistent memory was useful. But it still required one person to initiate each step: research, spec, build, deploy, analyze. The bottleneck shifted from context to bandwidth.
Legion extends the architecture from one system to five — each specialized in a different function, sharing the same memory layer, orchestrated through a single routing system. The model: decompose a prompt, route it to the right cohort, chain the outputs together automatically.
Caesar — intelligence. Research, competitive analysis, market signals.
Augustus — execution. Architecture review, build, and deployment.
Vespasian — revenue. Pricing models, unit economics, exit scenarios.
Trajan — distribution. GTM strategy, content planning, launch sequencing.
Romulus — orchestrator. Task decomposition, routing, quality gates.
A critical mechanic: every transition between cohorts is gated. Caesar's research brief doesn't automatically flow to Augustus — it gets reviewed first. Augustus doesn't deploy without a review cycle. Vespasian doesn't run unless the build succeeded. These gates aren't manual approvals — they're programmed checkpoints that validate outputs against structured criteria before the next phase triggers. This is what distinguishes Legion from a chained prompt: each stage has its own quality bar, and the chain breaks if a stage fails.
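A gate, in this sense, is just a predicate evaluated between stages. A minimal sketch with hypothetical stage and gate names; the real criteria are structured review checklists rather than one-line lambdas:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Gate:
    """A programmed checkpoint: validates one stage's output before the next runs."""
    name: str
    check: Callable[[dict], bool]

def run_chain(stages: list[Callable[[dict], dict]],
              gates: list[Gate],
              payload: dict) -> dict:
    """Run stages in order; a failed gate breaks the chain immediately."""
    for stage, gate in zip(stages, gates):
        payload = stage(payload)
        if not gate.check(payload):
            raise RuntimeError(f"gate '{gate.name}' failed; chain stopped")
    return payload
```

For example, a research stage that never produces `key_findings` never reaches the build stage: the chain raises at the research-review gate instead of passing malformed output downstream.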
Before: you → one tool, sequential.
After: you → five cohorts, parallel.
[Diagram: worker roles within each cohort. Romulus: router, gate-enforcer, synthesizer. Caesar: analyst, competitor, synthesizer, monitor. Augustus: builder, reviewer, deployer. Vespasian: unit-economist, exit-modeler. Trajan: content-writer, partnerships.]
[Metrics: The Pipeline · Caesar — Research Speed · Memory Compounding · The Synthesizer Fix · VOIDBREAKER — One-Shot Build]
What Broke — Legion Era
TIMEOUT: Augustus uses Claude Code CLI to build native Swift apps. The sandbox has a 15-minute timeout. Scaffolding a real iOS project with SwiftUI, Speech framework, and AI integration takes 20+ minutes. Architectural limitation, not a fixable bug. Workaround: generate key files directly and scaffold manually.
Discord limits self-bots. The cohort bots can't reliably @mention each other for handoffs. Workaround: database-driven event routing with polling.
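The workaround is an outbox pattern over the shared database: instead of one bot @mentioning another, it writes an event row, and every bot polls for rows addressed to it. A sketch assuming a minimal events table (hypothetical column names; the production schema carries more metadata):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE events (
    id      INTEGER PRIMARY KEY,
    target  TEXT NOT NULL,       -- which cohort bot should pick this up
    payload TEXT NOT NULL,
    handled INTEGER DEFAULT 0
)""")

def emit(conn, target: str, payload: str) -> None:
    """Hand off to another cohort by writing an event row, not a mention."""
    conn.execute("INSERT INTO events (target, payload) VALUES (?, ?)",
                 (target, payload))

def poll(conn, me: str) -> list[str]:
    """Each bot polls for its own unhandled events on a timer."""
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE target = ? AND handled = 0",
        (me,)).fetchall()
    conn.executemany("UPDATE events SET handled = 1 WHERE id = ?",
                     [(rid,) for rid, _ in rows])
    return [payload for _, payload in rows]
```

Marking rows handled inside the same poll keeps a handoff from being processed twice, even if two polling cycles overlap on the same cohort.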
The memory alias matcher for project notes was too loose. "Asteroid game" matched "party game" via the keyword "game," loading Forbidden.md as context for an unrelated build. Fix: word-boundary regex and expanded stop words.
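The fix amounts to matching on whole distinctive words rather than raw substrings. A sketch with a hypothetical stop-word list; the production matcher's list is longer:

```python
import re

# Generic words that should never count as a match on their own (illustrative):
STOP_WORDS = {"game", "app", "the", "a", "an"}

def aliases_match(query: str, alias: str) -> bool:
    """Match project aliases on shared distinctive words, not substrings."""
    def keywords(s: str) -> set[str]:
        return {w for w in re.findall(r"\b\w+\b", s.lower())
                if w not in STOP_WORDS}
    return bool(keywords(query) & keywords(alias))
```

With this version, "asteroid game" no longer matches "party game" (the only shared word is a stop word), but still matches any alias that shares a distinctive keyword like "asteroid".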
What's Next
The current version works for research and build workflows. The next phase is connecting the pipeline to live business data and automating distribution.
What We've Learned
The system works because it remembers. Everything else — the cohorts, the pipeline, the routing — is infrastructure built on top of persistent memory. Without the Obsidian vault as a shared knowledge base, none of this would be possible. The lesson: if your AI system doesn't have memory, it doesn't have a foundation.
The most important design decisions were about what to restrict, not what to enable. Limiting each cohort to its specific question. Restricting synthesizers to KEY FINDINGS only. Keeping humans in the approval loop for build specs. Constraints make agents productive.
The system has been running for 69 days. It ships on schedule, runs every morning, and knows every project I've touched. The next test is scaling it — not in complexity, but in impact. Can this architecture support multiple concurrent projects? Can it connect directly to business metrics? Those are the questions for the next phase.
What We Shipped
These projects either shipped through the Legion pipeline or were built alongside Romulus. All of them benefited from persistent memory and the research-to-build loop.