
Loop Engineering Is More Than a Concept — I Built a Cross-Session State Handoff System

Loop Engineering is more than a concept. I built Session Baton — a cross-session state handoff system with three-tier handoff, Anti-Ouroboros gate, and one-line PyPI install.

江中喬

17 June 2026 • 3 min read

Boris Cherny (Claude Code lead) said: "I don't prompt Claude anymore. My job is to write loops."

This blew up in the AI coding world. Addy Osmani systematized it into five building blocks. Everyone's talking about Loop Engineering.

But most discussions stay at the concept level. I spent a week building it.

The Problem: Your AI Sessions Are Flat

Most people use AI coding agents like this:

Session 1: prompt → work → done
Session 2: prompt → work → done (doesn't remember Session 1)
Session 3: prompt → work → done (doesn't remember Session 1 or 2)

Every session starts from scratch. You manually re-explain context. The agent forgets what mistakes it made, what decisions were taken, what's pending.

"But what about CLAUDE.md and session memory?"

Sure. But those hold fuzzy-search knowledge (the warm tier): you search "last deployment" and get a three-month-old record. What you actually need is precise state: "I deployed X last session; is it still alive?"

Not a Cron Job — It's a Loop

Adding a timer to run AI on schedule is a cron job. A loop is fundamentally different:

	Cron Job	Loop
State	No memory	Knows what happened last time
Decision	Does the same thing every time	Decides based on last result
Verification	None	Tracks effect of last action
Escalation	None	Calls human only when stuck

The output of each loop iteration becomes the input of the next. It spirals upward instead of repeating flatly.
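A minimal sketch of that difference, assuming a toy deploy task. `LoopState`, `cron_step`, `loop_step`, `record`, and the `max_failures` cutoff are hypothetical names for illustration, not Session Baton's actual code. The cron-style step ignores history, while the loop step takes the previous result as its input:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopState:
    """What one iteration hands to the next (hypothetical fields)."""
    last_action: Optional[str] = None
    last_succeeded: Optional[bool] = None
    failures_in_a_row: int = 0


def cron_step() -> str:
    # A cron job has no memory: it does the same thing on every run.
    return "deploy"


def loop_step(state: LoopState, max_failures: int = 2) -> str:
    # A loop decides based on what happened last time.
    if state.last_action is None:
        return "deploy"
    if state.last_succeeded:
        return "continue"
    if state.failures_in_a_row >= max_failures:
        return "escalate_to_human"  # call a human only when stuck
    return "retry"


def record(state: LoopState, action: str, succeeded: bool) -> LoopState:
    # The output of this iteration becomes the input of the next.
    failures = 0 if succeeded else state.failures_in_a_row + 1
    return LoopState(action, succeeded, failures)


state = LoopState()
for outcome in (False, False):  # two failed iterations in a row
    action = loop_step(state)
    state = record(state, action, outcome)
print(loop_step(state))  # escalate_to_human
```

The only thing separating the two functions is the `state` argument, which is the whole point: remove it and the loop degrades back into a timer.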

Three-Tier Handoff: Impact, Decision, Experience

While designing the system, I discovered that cross-session handoff involves three fundamentally different lifecycles:

Impact — Short-lived
"I deployed X, expected the health endpoint to return 200."
Next session auto-verifies. Pass → close. Fail → escalate.

Decision — Medium-lived
"Switched to --update-env-vars because --set-env-vars clears existing variables (root cause of last week's production incident)."
Carries rationale + evidence. Lives until superseded.

Experience — Long-lived
"Third time assuming Docker architecture without reading docker-compose.yml."
Single occurrence = anecdote. Repeated occurrences = pattern. At threshold → propose upgrade to enforced rule.
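The three lifecycles can be sketched as one record type with different handling per tier. This is an illustration under assumptions, not the schema from the repo: `BatonItem`, its fields, the status strings, and the default threshold of 3 are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Tier(Enum):
    IMPACT = "impact"          # short-lived: verified next session, then closed
    DECISION = "decision"      # medium-lived: lives until superseded
    EXPERIENCE = "experience"  # long-lived: counted toward patterns


@dataclass
class BatonItem:
    tier: Tier
    text: str
    status: str = "open"               # open | closed | escalated
    rationale: Optional[str] = None    # carried by decisions
    pattern_key: Optional[str] = None  # groups repeated experiences


def verify_impact(item: BatonItem, check: Callable[[], bool]) -> BatonItem:
    """Next session re-checks the expected outcome: pass closes, fail escalates."""
    item.status = "closed" if check() else "escalated"
    return item


def patterns_at_threshold(items: List[BatonItem], threshold: int = 3) -> List[str]:
    """Return pattern keys seen at least `threshold` times (proposals only)."""
    counts = Counter(
        i.pattern_key for i in items
        if i.tier is Tier.EXPERIENCE and i.pattern_key
    )
    return sorted(key for key, n in counts.items() if n >= threshold)


impact = BatonItem(Tier.IMPACT, "Deployed X; expect health endpoint to return 200")
verify_impact(impact, check=lambda: False)  # stand-in for a real health check
print(impact.status)  # escalated

notes = [
    BatonItem(Tier.EXPERIENCE, "Assumed Docker architecture",
              pattern_key="docker-assumption")
    for _ in range(3)
]
print(patterns_at_threshold(notes))  # ['docker-assumption']
```

Note that `patterns_at_threshold` only returns candidates. Turning a candidate into an enforced rule is a separate step.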

Anti-Ouroboros: LLM Cannot Self-Promote Its Own Rules

Here's a critical governance concern: pattern-to-rule graduation means LLM output influences future LLM behavior.

Without a gate, this becomes a self-reinforcing loop: the LLM observes "I keep making this mistake" → auto-writes a rule → the rule changes all future behavior. That sounds efficient, but it could permanently encode incorrect observations.

So I added an Anti-Ouroboros Gate:

Every baton item carries a source_tier: llm_derived or human_confirmed
llm_derived patterns cannot auto-graduate to rules
A human must confirm (changing the tier to human_confirmed) before anything is written to rules/
This is part of the ACA (Agent Civilization Architecture) protocol
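The gate described above can be sketched in a few lines. Only the `source_tier` field and its two values come from the list above. `Pattern`, `confirm`, `graduate`, `GateError`, and the in-memory `rules` list (standing in for the rules/ directory) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List

LLM_DERIVED = "llm_derived"
HUMAN_CONFIRMED = "human_confirmed"


class GateError(Exception):
    """Raised when an unconfirmed pattern tries to become a rule."""


@dataclass
class Pattern:
    text: str
    source_tier: str = LLM_DERIVED  # everything the model writes starts here


def confirm(pattern: Pattern, reviewer: str) -> Pattern:
    # Only this explicit human step changes the tier.
    if not reviewer.strip():
        raise ValueError("a named human reviewer is required")
    return Pattern(pattern.text, HUMAN_CONFIRMED)


def graduate(pattern: Pattern, rules: List[str]) -> None:
    # The gate: llm_derived patterns never reach the rule set on their own.
    if pattern.source_tier != HUMAN_CONFIRMED:
        raise GateError(f"blocked: {pattern.text!r} is {pattern.source_tier}")
    rules.append(pattern.text)


rules: List[str] = []
candidate = Pattern("Read docker-compose.yml before assuming architecture")

try:
    graduate(candidate, rules)  # the model tries to promote its own pattern
except GateError as err:
    print(err)

graduate(confirm(candidate, reviewer="maki"), rules)
print(len(rules))  # 1
```

The design choice worth noticing is that `graduate` has no code path that upgrades the tier itself. The model can propose as often as it likes, but the tier change lives in a separate function that requires a named reviewer.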

The Most Overlooked Loop: Your Daily Conversations

When most people hear Loop Engineering, they think "which cron job can I upgrade?"

But the most powerful loop isn't an automation script — it's your daily working conversations with AI.

Think about it:

Session start: load last session's decisions and progress
Working: write code, solve problems, hit walls
Session end: distill lessons into rules
Next session: constrained by the rules from the previous round

This is a meta-loop — a loop that rewrites its own rules.

The loop Osmani describes modifies code. A meta-loop modifies the rules that modify code. The former is efficiency. The latter is compounding capability.

Session Baton: Open Source Implementation

I open-sourced the system: github.com/MakiDevelop/session-baton

Published on PyPI, one-line install:

pip install session-baton
python -m session_baton
# Server runs on http://127.0.0.1:9101

106 lines of Python. FastAPI + SQLite. Runs standalone or plugs into any existing memory system (memhall, mem0, Letta compatible).

Read baton at session start, write at session end. Full schema, skill templates, and spec included in the repo.
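For a feel of the read-at-start, write-at-end cycle, here is a standalone sketch using only Python's standard-library sqlite3 module. It does not use the session-baton package or its HTTP API, and the `baton` table, its columns, and the function names are assumptions. The real schema is in the repo.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    # ":memory:" keeps the example self-contained; use a file path to persist.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS baton ("
        " id INTEGER PRIMARY KEY,"
        " tier TEXT NOT NULL,"
        " text TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'open')"
    )
    return conn


def write_baton(conn: sqlite3.Connection, items: List[Tuple[str, str]]) -> None:
    """Session end: persist (tier, text) pairs for the next session."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO baton (tier, text) VALUES (?, ?)", items)


def read_baton(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Session start: load everything still open, oldest first."""
    rows = conn.execute(
        "SELECT tier, text FROM baton WHERE status = 'open' ORDER BY id"
    )
    return rows.fetchall()


conn = open_store()
write_baton(conn, [("impact", "Deployed X; expect health endpoint to return 200")])
print(read_baton(conn))
```

With a file path instead of `:memory:`, the rows written at the end of one process are what the next process reads at startup.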

Industry Status

After scouting X/Twitter, GitHub, HN, Reddit, and arXiv, this is what I found:

Compounding Knowledge Loop (MindStudio): dumps summary at session end. No action-outcome verification.
bmo (ngrok): cross-session telemetry. Tracks tool success rates, not expected outcomes.
Stability Tier (arXiv): three-tier memory architecture. Academic paper, no engineering implementation.
Meta HyperAgents: failure → rule proposal pipeline. Requires human review, no automatic threshold.

Cross-session closed-loop verification + pattern-to-rule graduation + three-tier structured handoff: in that search I found no public end-to-end implementation that combines all three.

As far as I can tell, Session Baton is the first.

What's Next

I've just started dogfooding Session Baton. Over the next few weeks I'll be asking:

Does the baton actually spiral upward? Or is it just another log nobody reads?
Is pattern-to-rule graduation practical in real use?
How many "broke something last time but didn't notice" cases does action-outcome verification catch?

I'll share findings as they come.

Session Baton: chiba.tw/baton
PyPI: pypi.org/project/session-baton
ACA Protocol: chiba.tw/acap
Loop Engineering knowledge base: chiba.tw/loop-engineering/

I'm Maki (Chung-Chiao Chiang), an AI systems builder with roots in TPM and product management. I design multi-agent collaboration architectures and built ACA — the first open protocol for AI agent governance.