Equipping Claude Code
10. Context
Understand how Claude Code manages context and the token window, plus techniques to keep conversations productive and efficient.
The context window is Claude's "working memory" during a session — everything the model can "see" when generating a response. In current Claude models, it is 200,000 tokens (~500 pages of text). However, in practice, the usable space is considerably smaller.
Why isn't the real limit 200K?
Part of the 200K tokens is consumed by fixed elements that Claude Code loads automatically in every interaction:
| Component | Approx. consumption | What it is |
|---|---|---|
| Auto-compact buffer | ~33K tokens (16.5%) | Fixed reserve for summarization and response generation |
| System tools | ~17K tokens (8.4%) | Internal tool definitions (Read, Edit, Bash, etc.) |
| Memory files | ~7K tokens (3.7%) | Project CLAUDE.md + global + auto-memory |
| System prompt | ~3K tokens (1.3%) | Claude Code internal instructions |
| Plugins, skills, and agents | ~2K+ tokens | Each installed plugin/skill adds overhead |
Adding it all up, roughly 60K tokens (~30%) are consumed before you even type your first prompt. That is also why auto-compact triggers when usage reaches ~83%, not 100%. In practice, the space actually available for your conversation is around 160-165K tokens.
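The budget arithmetic above can be sketched in a few lines. This is an illustration only: the overhead values are the approximations from the table, not fixed constants, and real numbers vary by project, installed plugins, and Claude Code version.

```python
# Rough context-budget arithmetic for Claude Code's 200K-token window.
# All overhead figures are the approximations from the table above;
# actual values differ per installation.
WINDOW = 200_000

overhead = {
    "auto_compact_buffer": 33_000,   # reserve for summarization and response generation
    "system_tools": 17_000,          # Read, Edit, Bash, etc.
    "memory_files": 7_000,           # project CLAUDE.md + global + auto-memory
    "system_prompt": 3_000,          # Claude Code internal instructions
    "plugins_skills_agents": 2_000,  # grows with each installed plugin/skill
}

total_overhead = sum(overhead.values())
compact_trigger = int(WINDOW * 0.83)  # auto-compact fires around this usage level

print(f"fixed overhead:       ~{total_overhead:,} tokens "
      f"({total_overhead / WINDOW:.0%} of the window)")
print(f"auto-compact trigger: ~{compact_trigger:,} tokens used")
```

Running this reproduces the section's figures: about 62K tokens of fixed overhead (~31% of the window) and an auto-compact trigger near 166K tokens.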