AI Agent Architecture: Why Single Prompt Files Fail at Scale

This is Part 3 of the Agent Memory Architecture series. We know we need high Context Coverage (Part 1), and we know it needs to be machine-readable Agent RAM (Part 2). But how do you actually write 25,000 lines of rules without over-engineering?
Building large software projects with AI agents requires a shift from upfront design to failure-driven codification. A recent study of a 108,000-line C# codebase—a real-time multiplayer simulation built using MonoGame and Arch ECS—proved that single manifest files fail to scale. Using Claude Code as the sole code generator across 283 sessions, developers achieved higher precision not by writing massive prompts upfront, but by codifying rules and specialized agents strictly in response to system failures. In this workflow, documentation acts as load-bearing infrastructure rather than passive reference material.
Stop trying to write the perfect system prompt on day one. It does not work.
Most developers start AI projects the same way. They dump every convention, rule, and architectural decision into a single manifest file. They write a massive rules document and expect the agent to read it perfectly every time.
Then the project hits 10,000 lines. The agent starts forgetting things. It hallucinates database structures. It ignores your formatting rules.
Evidence: The 108,000-line wall
Single-file context management fails because LLM attention degrades as the prompt grows. Researchers behind the Codified Context study found that maintaining a 108,000-line system required moving away from monolithic instructions toward a distributed three-tier memory architecture.
The researchers utilized 19 specialized agents and 34 distinct specification documents to manage the codebase across 283 sessions. The core insight: they designed zero of these components upfront.
Every new agent and every new specification was a reaction to a specific failure during the build process. When the context window became too noisy, they decoupled the logic.
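A distributed memory layout like this can be sketched as a small routing layer. This is an illustrative model, not the study's implementation: the hot/warm/cold tier names are an assumption (only a `warm` tier is visible in the study's file paths, e.g. `context/warm/ecs-expert.md`), and the idea is simply that core invariants always load while domain docs load on demand.

```python
from dataclasses import dataclass, field

@dataclass
class ContextStore:
    """Illustrative three-tier agent memory (tier names assumed)."""
    hot: list[str] = field(default_factory=list)        # always in the prompt
    warm: dict[str, str] = field(default_factory=dict)  # loaded per domain
    cold: dict[str, str] = field(default_factory=dict)  # only on explicit request

    def assemble(self, domains: list[str]) -> list[str]:
        """Build the prompt context: every hot rule plus matching warm docs."""
        context = list(self.hot)
        context += [doc for name, doc in self.warm.items() if name in domains]
        return context
```

The point of the split is visible in `assemble`: noisy domain knowledge never rides along unless the task touches that domain.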
From TDD to Failure-Driven Codification
The AI era requires a transition from Test-Driven Development (TDD) to Failure-Driven Codification (FDC). While TDD focuses on passing functional tests, FDC focuses on hardening the context that guides the agent.
Every time an agent fails, it reveals a hole in your context structure. When an agent hallucinates a variable, it is not a random glitch. It is a missing piece of infrastructure.
Instead of yelling at the model or manually fixing the code, you codify the failure. You write a specific rule or create a specialized agent. This ensures the mistake never requires re-explanation. You are building a permanent fix into the agent’s operating environment.
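A minimal sketch of that codification step, assuming the domain context files are modeled as an in-memory mapping (in practice each domain would be a markdown file on disk):

```python
def codify_failure(memory: dict[str, list[str]], domain: str, rule: str) -> bool:
    """Record a failure as a permanent rule in the domain's context document.

    Returns True if the rule was new, False if this lesson was already
    codified -- a mistake should only ever need to be written down once.
    """
    rules = memory.setdefault(domain, [])
    if rule in rules:
        return False
    rules.append(rule)
    return True
```

Calling it twice with the same rule is a no-op, which is the whole contract: once codified, the failure never requires re-explanation.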
How to implement Failure-Driven Codification
If you use tools like Cursor, Windsurf, or Claude Code, you can start building this way immediately.
Burn the master manifest. A single monolithic instruction file fails at scale. Break your context into domain-specific markdown files. Keep one for database schema, one for UI components, and one for API routing. Pass these files to the agent only when they are relevant to the task.
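The "pass files only when relevant" step can be as simple as a keyword-to-file routing table. A minimal sketch, with invented file names and keywords standing in for your own domains:

```python
# Hypothetical routing table: which context file rides along for which task.
DOMAIN_FILES = {
    "schema": "context/database-schema.md",
    "migration": "context/database-schema.md",
    "component": "context/ui-components.md",
    "endpoint": "context/api-routing.md",
    "route": "context/api-routing.md",
}

def relevant_files(task: str) -> list[str]:
    """Return the context files to attach for this task, without duplicates."""
    hits = [path for kw, path in DOMAIN_FILES.items() if kw in task.lower()]
    return list(dict.fromkeys(hits))  # preserve order, drop repeats
```

Keyword matching is crude; the design choice that matters is that irrelevant files never enter the prompt, so the agent's attention budget goes to the task at hand.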
Treat bugs as missing documentation. When an agent fails, you do not just fix the code. You commit the code fix and the memory update simultaneously.
Here is what a Failure-Driven Codification commit looks like in practice:
```diff
# 1. The Code Fix (src/Systems/MovementSystem.cs)
- public float speed = 5.0f; // WRONG: State stored in System class
+ // State properly moved to MovementComponent

# 2. The Context Codification (context/warm/ecs-expert.md)
@@ -14,2 +14,3 @@
- Systems must iterate over entities with specific components.
+ - FAILURE AVERT: Systems must be 100% stateless. Never store variables like `speed` in the System class. Always use Components.
```

Spawn specialized agents for recurring tasks. If you keep reminding the model how to format an API payload, stop. Write a prompt template exclusively for that API. Turn it into a dedicated agent or a specific command alias.
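A command alias can be as small as a stored prompt template. A sketch, where the endpoint and field names are entirely invented for illustration:

```python
# Hypothetical alias: a stored template replaces the payload-format
# reminder you keep retyping into every session.
ORDERS_PAYLOAD_ALIAS = (
    "Format the request body for POST /orders as JSON with exactly these "
    "keys: order_id (string), items (list of objects with sku and qty), "
    "currency (an ISO 4217 code).\n"
    "Task: {task}"
)

def expand_alias(template: str, task: str) -> str:
    """Expand a command alias into the full prompt sent to the agent."""
    return template.format(task=task)
```

Once the reminder lives in the template, the per-session prompt shrinks to just the task, and the formatting rule can no longer be forgotten or drift between sessions.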
We usually treat documentation as a passive record of what we built. In an agentic workflow, documentation is load-bearing infrastructure. It is the actual operating system your agents run on.
Do not over-engineer it before you start. Let the failures tell you what needs to be written down.