When an agent makes a mistake, the instinct is to add more instructions. More rules. More examples. More context. This is backwards.
The output is wrong. A convention is missed. A file gets clobbered. Something breaks.
A new rule appears in the instruction file. "Never do X." "Always check Y." The motivation is reasonable.
Every token of instruction is a token unavailable for actual work. The agent has less room to think.
The agent makes a new error. The human adds more rules. The file grows. Both sides work harder and achieve less.
Each iteration makes the instruction file larger, the agent less focused, and the output worse. This loop is the default trajectory of every project that manages context through accumulation rather than curation.
A 2026 ETH Zurich study evaluated 138 task instances across 12 repositories. LLM-generated instruction files reduced agent success rates by 3% and increased costs by over 20%.
Developer-written instructions fared better — but still carried a significant cost penalty. The pattern is consistent: untargeted upfront context is expensive at best and harmful at worst.
Even well-written context hurts when it is loaded indiscriminately.
AI labs are pushing two-million-token windows and near-instant prompt caching. The technical constraint is relaxing. The human constraint is not.
A human cannot effectively govern, debug, or curate a two-million-token instruction file. When the agent hallucinates inside that file, the human has no way to determine which of the thousands of loaded instructions contributed.
Bigger windows without better structure just make the failure harder to diagnose.
AI agents reason within a finite context window. Every token of instruction is a token unavailable for the actual work — reading code, understanding state, solving the problem.
Overload the window with instructions and the agent has less room to think. The capability is still there. It is being diluted — not by reducing what the agent can do, but by forcing it to spend its capacity navigating irrelevant context instead of reasoning about the problem.
Most projects are running their agents at a deficit before the first line of code is read.
The lean approach reverses the degradation loop. Structure replaces accumulation. Curation replaces fear.
The Framework →