For millions of years mankind lived just like animals. Then something happened which unleashed the power of our imagination.

Your AI Agent Governance Is Just a Suggestion

ai-governance, claude-code, agent-security, mechanical-enforcement, hooks, autonomous-agents, ouroboros

Behavioral rules for AI agents are text in the context window. Under pressure (deep in a fix loop, resolving conflicting instructions, running low on context) the model rationalizes around them. This isn’t a hypothetical failure mode. It’s documented.

Documented Failures

These are real incidents from a single workspace running Claude Code with CLAUDE.md governance rules over several months of daily use.

  • Root-owned .git files: a merge helper ran as sudo and left root-owned files across the working tree. Subsequent git commands failed with permission errors. CLAUDE.md said “use make fix-ownership”; the agent ran sudo chown instead.
  • Fix-loops: the agent edited the same file 5+ times without diagnosing the root cause. Each edit was individually reasonable. The pattern was not.
  • Hardcoded host paths: the agent wrote /home/joe/ paths in container code instead of /home/dev/. The behavioral rule about correct path prefixes was in the CLAUDE.md. The agent read it.
  • Index corruption: non-atomic writes corrupted the experience store index. Recovery required manual intervention.
  • False-positive pytest: substring matching on test output reported tests passed when they hadn’t. The agent moved to the next task based on the false signal.
  • Status probe false flags: gh pr checks returns a non-zero exit code while checks are pending. The agent interpreted this as a failure and entered a retry loop.
  • Silent CSV failures: except (ValueError, TypeError): continue in a CSV parser. 0 out of 3,853 rows matched, and no error was raised. The broad except clause masked a column name mismatch.
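The silent-CSV incident reduces to one antipattern. A minimal sketch (column names and data are hypothetical) contrasting the broad except with an up-front schema check:

```python
import csv
import io

ROWS = "name,value\nalpha,1\nbeta,2\n"

def parse_silent(text: str) -> list:
    """Antipattern: a broad except swallows the schema mismatch.

    The code reads a misspelled column ("vaIue"), so .get() returns None,
    int(None) raises TypeError, and every row is skipped without a signal.
    """
    matched = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            matched.append(int(row.get("vaIue")))  # typo'd column name
        except (ValueError, TypeError):
            continue  # 0 rows match, no error raised
    return matched

def parse_loud(text: str) -> list:
    """Safer: validate the schema once, then let real errors surface."""
    reader = csv.DictReader(io.StringIO(text))
    missing = {"value"} - set(reader.fieldnames or [])
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return [int(row["value"]) for row in reader]
```

The silent version returns an empty list on the sample data; the loud version either parses every row or fails immediately with the actual cause.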

Every one of these had a behavioral rule that should have prevented it. The rules were present in the context. The agent processed them, weighed them against the immediate goal, and decided the situation warranted a different approach.

This Post as Evidence

The governance hook (enforce.py) blocked the agent three times during the writing of this post’s first draft:

  • cat (Bash tool): blocked by the banned-bash rule; the Read tool provides an audit trail.
  • find (Bash tool): blocked by the banned-bash rule; the Glob tool provides an audit trail.
  • sudo chown: blocked by a learned ban from a prior incident; replaced with make fix-ownership.

The agent didn’t choose to use the correct tools. The hook rejected the incorrect ones and the agent adapted. That’s the difference between a suggestion and a constraint.

Mechanical Enforcement Architecture

The system uses a single PreToolUse hook: a Python script that receives tool invocations as JSON on stdin and returns block decisions. Seven rules execute sequentially. First block wins.

  1. check_phase_allowed: the tool must be permitted in the current workflow phase.
  2. check_banned_bash: blocks shell commands that have dedicated tool equivalents (cat, grep, find, sed).
  3. check_no_sudo_chown: learned ban; sudo chown caused root-ownership cascades.
  4. check_path_rule: code references must use container paths, not host paths.
  5. check_tdd_discipline: tests must exist before implementation in applicable phases.
  6. check_ci_awareness: CI status checks must account for non-zero exit on pending checks.
  7. check_learned_bans: patterns extracted from prior session failures via closed-loop analysis.

The agent can think whatever it wants about whether cat is faster than the Read tool. The hook rejects the command before it executes. The agent gets a rejection reason and finds another approach.
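The dispatch loop behind this is small. A sketch assuming Claude Code's documented PreToolUse hook input fields (tool_name, tool_input) and exit-code-2 blocking; the two rule bodies here are simplified stand-ins, not the real seven checks:

```python
import json
import sys

BANNED_BASH = {"cat": "Read", "grep": "Grep", "find": "Glob", "sed": "Edit"}

def check_banned_bash(tool: str, params: dict):
    """Block shell commands that have dedicated tool equivalents."""
    if tool == "Bash":
        words = params.get("command", "").split()
        if words and words[0] in BANNED_BASH:
            return f"Banned bash: use the {BANNED_BASH[words[0]]} tool instead"
    return None

def check_no_sudo_chown(tool: str, params: dict):
    """Learned ban from the root-ownership incident."""
    if tool == "Bash" and "sudo chown" in params.get("command", ""):
        return "Learned ban: use `make fix-ownership`, not sudo chown"
    return None

RULES = [check_banned_bash, check_no_sudo_chown]  # ordered; first block wins

def evaluate(event: dict):
    """Run the rules in order; return the first rejection reason, or None."""
    tool = event.get("tool_name", "")
    params = event.get("tool_input", {})
    for rule in RULES:
        reason = rule(tool, params)
        if reason:
            return reason
    return None

def main():
    reason = evaluate(json.load(sys.stdin))
    if reason:
        # Exit code 2 blocks the tool call; stderr is fed back to the agent.
        print(reason, file=sys.stderr)
        sys.exit(2)
    sys.exit(0)
```

The agent sees only the rejection reason on stderr, which is exactly the feedback loop described above: rejected, then adapted.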

Phase State Machine

The agent operates in an eight-phase workflow. Each phase whitelists specific tool operations.

  • INTAKE (0): read files, search code.
  • REQUIREMENTS (1): read, search, draft requirements.
  • PLAN (2): read, search; no writes, no execution.
  • TEST_SPEC (3): read, write test files.
  • IMPLEMENTATION (4): read, write source files, run commands.
  • VERIFICATION (5): run pytest and ruff; no source edits.
  • DONE (6): reporting only.
  • MAINTENANCE (7): controlled maintenance operations.

An agent in PLAN cannot write code regardless of confidence. An agent in VERIFICATION cannot edit source files regardless of test failures โ€” it regresses to IMPLEMENTATION first. These aren’t suggestions the agent can reinterpret. The hook rejects the tool call.
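The gate itself is a whitelist lookup. A sketch using an IntEnum mirroring the phase table; the tool names in the whitelist are assumptions about the tool vocabulary, not the production configuration:

```python
from enum import IntEnum

class Phase(IntEnum):
    INTAKE = 0
    REQUIREMENTS = 1
    PLAN = 2
    TEST_SPEC = 3
    IMPLEMENTATION = 4
    VERIFICATION = 5
    DONE = 6
    MAINTENANCE = 7

# Hypothetical per-phase whitelists following the table above.
ALLOWED = {
    Phase.INTAKE: {"Read", "Grep", "Glob"},
    Phase.REQUIREMENTS: {"Read", "Grep", "Glob", "Write"},
    Phase.PLAN: {"Read", "Grep", "Glob"},          # no writes, no execution
    Phase.TEST_SPEC: {"Read", "Write", "Grep", "Glob"},
    Phase.IMPLEMENTATION: {"Read", "Write", "Edit", "Bash", "Grep", "Glob"},
    Phase.VERIFICATION: {"Read", "Bash"},          # run tests, no source edits
    Phase.DONE: {"Read"},
    Phase.MAINTENANCE: {"Read", "Edit", "Bash"},
}

def check_phase_allowed(phase: Phase, tool: str):
    """Return a rejection reason if the tool is not whitelisted, else None."""
    if tool not in ALLOWED.get(phase, set()):
        return f"{tool} is not permitted in phase {phase.name}"
    return None
```

A Write in PLAN or an Edit in VERIFICATION gets a rejection string regardless of what the model believes about the situation.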

Integrity Layer

The experience store uses CRC32 checksums on every event:

  • Each event is JSON-serialized with deterministic formatting
  • zlib.crc32() computes a 32-bit checksum, stored as an 8-character hex string wrapping the event
  • On read, the checksum is revalidated; corrupted events are flagged, not silently loaded
  • Append-only writes with fcntl file locking mean events can be added but never modified or deleted

This matters because the agent reads from this store to inform decisions. Corrupted experience data produces downstream decisions based on bad state: the kind of bug that’s invisible after the fact.
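A minimal sketch of the checksum cycle using zlib.crc32 and deterministic JSON; the on-disk format shown (checksum-prefixed lines) is an assumption about the wrapper layout:

```python
import fcntl
import json
import zlib

def wrap(event: dict) -> str:
    """Serialize deterministically, prefix an 8-character CRC32 hex checksum."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return f"{zlib.crc32(payload.encode()):08x} {payload}"

def unwrap(line: str):
    """Revalidate on read; a corrupted event is flagged (None), never loaded."""
    crc, _, payload = line.partition(" ")
    if f"{zlib.crc32(payload.encode()):08x}" != crc:
        return None
    return json.loads(payload)

def append(path: str, event: dict) -> None:
    """Append-only write under an exclusive fcntl lock (POSIX only)."""
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(wrap(event) + "\n")
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Any bit flip in either the checksum or the payload makes unwrap return None instead of a plausible-looking but wrong event.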

Pattern Detection

Some failure modes aren’t about individual actions. They’re about patterns.

Fix-loop detection. Counts edits to the same file within a session, warning at the configured warn threshold and escalating at the critical threshold; both values are loaded from config, not hardcoded. Editing runner.py five times in one session is almost never productive, even if each individual edit is reasonable. An agent in a fix loop won’t reliably recognize it. An edit counter will.
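The counter is purely mechanical. A sketch with hypothetical default thresholds standing in for the configured values:

```python
from collections import Counter

# Hypothetical defaults; the real values are loaded from config.
WARN_THRESHOLD = 3
CRITICAL_THRESHOLD = 5

class FixLoopDetector:
    def __init__(self, warn: int = WARN_THRESHOLD, critical: int = CRITICAL_THRESHOLD):
        self.warn = warn
        self.critical = critical
        self.edits = Counter()  # per-file edit counts for this session

    def record_edit(self, path: str):
        """Count an edit; return 'warn' or 'critical' once a threshold is hit."""
        self.edits[path] += 1
        n = self.edits[path]
        if n >= self.critical:
            return "critical"
        if n >= self.warn:
            return "warn"
        return None
```

No judgment about whether the edits were reasonable, just a count crossing a line.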

Memory drift detection. Monitors line counts of governance files (CLAUDE.md, MEMORY.md) against configured budgets. Flags overages and identifies misrouted content: operational data (version numbers, CI status, timestamps) stuffed into files meant for rules and patterns.
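The budget side of the drift check is a line count against a limit. A sketch with hypothetical budgets standing in for the configured values:

```python
# Hypothetical budgets; the real values are loaded from config.
BUDGETS = {"CLAUDE.md": 200, "MEMORY.md": 150}

def check_memory_drift(line_counts: dict) -> list:
    """Flag governance files whose line counts exceed their budgets."""
    findings = []
    for name, budget in BUDGETS.items():
        count = line_counts.get(name, 0)
        if count > budget:
            findings.append(f"{name}: {count} lines exceeds budget of {budget}")
    return findings
```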

Neither detector blocks operations directly. They surface patterns for human review or a meta-governance layer to act on. Detection is mechanical: counters and thresholds, not self-assessment.

Authority Stack

  • Container boundaries (filesystem permissions, Docker isolation): the agent cannot access paths outside its container.
  • PreToolUse hook chain (7 sequential rules, first block wins): execution-boundary enforcement of banned commands, phase gates, and path rules.
  • Phase state machine (8-phase IntEnum with auto-advance): workflow cadence; prevents skipping requirements or tests.
  • Experience store (CRC32-checksummed, append-only, file-locked): tamper-evident audit trail.
  • Pattern detectors (fix-loop counters, memory drift budgets): surface behavioral anomalies mechanically.
  • CLAUDE.md (governance policy document): defines the rules, backed by hooks that enforce the critical ones.

Defense in depth. No single layer is sufficient. Stacked together, they create boundaries that hold when the agent is under pressure, confused, or wrong about what it should do.

Status

The enforcement hooks, phase state machine, experience store, and pattern detectors are available in ouroboros v4.0.0 on PyPI. Designed for Claude Code’s hook system. The principles apply to any agent framework with a pre-execution interception point.

Configuration details reflect a production environment at time of writing. Implementation specifics vary based on tooling versions, platform updates, and organizational requirements. Validate approaches against current documentation before deployment.