Context rot: why 1M tokens isn't 1M useful tokens
The longer a Claude Code session runs, the less useful its context becomes. Understanding the degradation, measuring the dumb zone, and reaching for directed /compact and rewind instead of piling on corrections.
The problem: a context that rots
You've had a Claude Code session open for three hours. 600,000 tokens accumulated. Claude starts forgetting conventions you mentioned at the start, suggests solutions already tried and rejected, mixes up files between two features. You restart. Magic: it's lucid again.
That's context rot. The term was popularized by Thariq (Claude Code team) and Dex (MLOps). The idea: model quality doesn't degrade linearly with context size. It collapses in stages, in a zone the community calls the dumb zone.
The dumb zone, in numbers
On the 1M token window (Opus 4.7), observed useful quality varies by profile:
| User profile / task | Dumb zone starts around |
|---|---|
| Claude Code beginner, vague prompts | 30% of context filled |
| Experienced user, structured prompts | 60% of context filled |
| Simple task (single file edit) | 60% |
| Complex task (multi-file refactor) | 30 to 40% |
On the standard 200K window, translate: the dumb zone hits between 60K and 150K tokens. Boris Cherny said it publicly: "Claude loses precision well before the hard limit. The advertised window isn't the usable window."
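The percentage-to-token translation is plain arithmetic on the advertised window. A minimal sketch, using the profile percentages from the table above (the profile names are my own labels, not anything Claude Code exposes):

```python
# Convert the dumb-zone percentages from the table above into token
# thresholds for a given advertised window. The percentages are the
# article's observations; this is just the arithmetic.

DUMB_ZONE_START = {
    "beginner_vague_prompts": 0.30,
    "experienced_structured": 0.60,
    "simple_single_file": 0.60,
    "complex_multi_file": 0.30,  # lower bound of the 30-40% range
}

def dumb_zone_tokens(window_size: int, profile: str) -> int:
    """Token count at which quality starts to collapse for a profile."""
    return int(window_size * DUMB_ZONE_START[profile])

# On the standard 200K window:
print(dumb_zone_tokens(200_000, "beginner_vague_prompts"))  # 60000
print(dumb_zone_tokens(200_000, "experienced_structured"))  # 120000
```

On the 1M window the same percentages put the onset between 300K and 600K tokens, which is why filling the window is not the same as using it.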
Three symptoms to spot it
Before Claude starts hallucinating, it sends you signals:
- It suggests a solution already rejected. You said no 100,000 tokens ago. It forgot.
- It mixes up files. It imports a function from another module for no reason, or cites a variable name that doesn't exist.
- It loops on a detail. It rephrases the same answer over and over instead of moving forward.
If you see two of those three signals within an hour, your context has entered the dumb zone.
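The "two signals within an hour" rule can be written down as a heuristic. A hypothetical sketch, nothing here is a Claude Code API; the signal names and the one-hour window come straight from the list above:

```python
# Hypothetical detector for the "two of three signals within an hour"
# rule. You would log signals by hand as you notice them.

from datetime import datetime, timedelta

SIGNALS = {"repeats_rejected_solution", "mixes_up_files", "loops_on_detail"}

def in_dumb_zone(events: list[tuple[datetime, str]]) -> bool:
    """events: (timestamp, signal_name) pairs. True if two *distinct*
    symptoms land within any one-hour window."""
    relevant = sorted((t, s) for t, s in events if s in SIGNALS)
    for i, (t0, s0) in enumerate(relevant):
        seen = {s0}
        for t1, s1 in relevant[i + 1:]:
            if t1 - t0 > timedelta(hours=1):
                break
            seen.add(s1)
            if len(seen) >= 2:
                return True
    return False
```

Note the `distinct` requirement: Claude looping on the same detail twice is one signal, not two.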
Why automatic /compact isn't enough
Claude Code automatically compresses context when it nears the cap. The summary it generates on its own is generic: it preserves open files, the latest diff, the latest question. It throws away half of your business reasoning.
Directed /compact does much better:
    /compact I'm starting phase 3 of the auth refactor. Keep only:
    1. The API contract defined at session start
    2. The 3 architectural decisions validated (JWT, refresh token, rotation)
    3. The list of files to modify in phase 3
    Forget all the failed attempts from phases 1 and 2.
You give Claude a spec of what to remember. The summary that comes out is five times more useful.
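If you compact often, the spec is worth templating. A sketch of a hypothetical helper (the function and its structure are my own; /compact itself is the real command) that assembles a directed prompt from a keep-list:

```python
# Hypothetical helper that builds a directed /compact prompt from a
# goal, a keep-list, and a forget clause, following the template above.

def directed_compact(goal: str, keep: list[str], forget: str) -> str:
    lines = [f"/compact {goal}. Keep only:"]
    lines += [f"{i}. {item}" for i, item in enumerate(keep, start=1)]
    lines.append(f"Forget {forget}.")
    return "\n".join(lines)

prompt = directed_compact(
    goal="I'm starting phase 3 of the auth refactor",
    keep=[
        "The API contract defined at session start",
        "The 3 architectural decisions validated (JWT, refresh token, rotation)",
        "The list of files to modify in phase 3",
    ],
    forget="all the failed attempts from phases 1 and 2",
)
print(prompt)
```

Paste the output into the session; the point is that the keep-list forces you to decide what matters before the summarizer decides for you.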
Rewind beats correct
Boris formalized a rule that changes everything: when Claude goes the wrong way, don't correct, rewind.
Concretely: double-Esc to go back to the previous turn and rephrase your initial prompt, instead of adding "no, do it like this instead". Correcting leaves the failed attempt in the context, plus your frustration, plus the new instruction. Three turns wasted. Rewind erases the wrong path, as if it never happened.
| Approach | Token cost | Context quality |
|---|---|---|
| Correct: "No, do X instead of Y" | +800 to +2000 tokens, 3 messages in history | Polluted by the wrong path |
| Rewind (double-Esc + rephrase) | 0 tokens added, history stays clean | As if the error never happened |
Rewind doesn't erase your work. It erases a failed attempt. Your commits, files, file tree: untouched.
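The table's cost gap compounds over a session. A back-of-the-envelope sketch using the article's +800 to +2000 tokens per correction (midpoint ~1400); the 3,000-token size of a failed attempt left in history is my own illustrative assumption, not a measurement:

```python
# Illustrative context growth: correcting leaves the wrong path plus the
# correction in history; rewinding erases it. Numbers are assumptions.

CORRECTION_COST = 1400  # midpoint of the article's +800 to +2000 range
WRONG_ATTEMPT = 3000    # tokens of failed attempt left in history (assumption)

def context_after(wrong_turns: int, strategy: str, base: int = 50_000) -> int:
    """Approximate context size after `wrong_turns` dead ends."""
    if strategy == "correct":
        return base + wrong_turns * (WRONG_ATTEMPT + CORRECTION_COST)
    if strategy == "rewind":
        return base  # wrong paths erased; history stays at baseline
    raise ValueError(strategy)

print(context_after(3, "correct"))  # 63200
print(context_after(3, "rewind"))   # 50000
```

Three dead ends handled by correction cost ~13K tokens of permanently polluted history under these assumptions; handled by rewind, they cost nothing.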
Test-time compute: one agent creates the bug, another finds it
Boris shared a surprising pattern: to find your own bugs, separate the contexts. One agent codes the feature, another reviews it. Same model, different contexts.
Why does it work? The first agent is anchored in its reasoning, decisions, biases. The second arrives blank, doesn't have the first one's "conviction", and sees what's wrong. It's cheaper than asking the first one to re-read itself.
    # 1. Code the feature in one session
    /agent code-implementer add rate limiting to the auth middleware

    # 2. Review in ANOTHER session, with only the diff as context
    claude --new-session
    /agent code-reviewer review only the diff of commit abc123
The second agent sees the diff without all the saga that led to it. That's exactly what makes it effective.
Myth: "agentic search > RAG"
To handle massive codebases, many teams tried to index their code in vector databases to "augment" Claude's context. Anthropic tried, then dropped it.
The reason, per Boris: code drifts faster than the index. Permissions are complex (who can see what). And mostly, Claude is better when it actively searches the repo (with grep, find, ls) than when you serve it vectorized chunks. Agentic search keeps context control; RAG dilutes it.
Practical consequence: let Claude explore. Point it to the right folders via CLAUDE.md. Don't try to "load everything" at session start.
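To see why active search is context-cheap, here is a toy stand-in for what Claude does with grep: return only the matching lines with their locations, never whole files or pre-vectorized chunks. This is my own illustration, not Anthropic's implementation:

```python
# Toy agentic search: walk a repo, return (path, line_no, line) for each
# match. A few dozen matched lines cost far less input context than the
# files that contain them.

import os

def agentic_grep(root: str, needle: str, ext: str = ".py"):
    """Yield (path, line_no, line) for every line containing `needle`."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(ext):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                for no, line in enumerate(f, start=1):
                    if needle in line:
                        yield path, no, line.rstrip()
```

The agent decides what to search for next based on what it just found; a RAG pipeline decided what to retrieve before the question was even asked.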
Anti-rot habits
A few reflexes that delay the dumb zone:
- Start each session with a clear goal. One session = one mission. No zigzag between features.
- Compact proactively with a directed hint at 50% fill.
- Prefer several short sessions over a marathon session. A new session is a fresh brain.
- For tasks > 2 hours, use Plan Mode: the plan stays compact even after 50 subtasks.
- No copy-paste of full files in the prompt if Claude can read them with Read. Read is lazy, copy-paste eats input context.
- Use /rewind on every wrong path. No additive correction.
Next steps
- Context management and 200K window for the basics of context handling
- Power Tips for Esc Esc, /rewind, ultrathink shortcuts
- Extended Thinking and Plan Mode for long sessions
- Multi-agent orchestration to structure test-time compute