Claude Context Window Explained — How Much Can Claude Remember? (2026)
Understand Claude's context window, token limits for Opus 4.6, Sonnet 4.6, and Haiku 4.5, and practical tips for working within context limits.
One of the most common questions people have about Claude is: "How much can it remember?" The answer involves something called the context window. This guide explains what it is, how it affects your experience, and how to work with it effectively.
What Is a Context Window?
A context window is the total amount of text Claude can process in a single conversation turn. It includes everything: the system prompt, the entire conversation history (every message from you and every response from Claude), any documents or files you have shared, and Claude's current response.
Think of it as Claude's working memory. Everything that fits in the context window, Claude can see and reference. Anything outside it might as well not exist for that particular interaction.
The context window is measured in tokens, which are the fundamental units language models work with.
What Are Tokens?
A token is roughly three-quarters of a word in English. Some common rules of thumb:
- 1 token is approximately 4 characters
- 100 tokens is approximately 75 words
- 1,000 tokens is approximately 750 words
- A typical page of text is about 300-400 tokens
So when we say a model has a 200,000-token context window, that is roughly 150,000 words — about the length of two full novels.
Tokens are not just words, though. Punctuation, spaces, code syntax, and special characters all consume tokens. Code tends to use more tokens per concept than prose because of all the brackets, semicolons, and variable names.
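The rules of thumb above can be turned into a quick estimator. This is a sketch based only on the ~4 characters per token heuristic; real tokenizers split text differently (code and non-English text often tokenize less efficiently), so treat the result as a ballpark, not an exact count.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb.

    Real tokenizers vary, so this is a ballpark figure, not an exact count.
    """
    return max(1, len(text) // 4)

# A typical page of prose is ~350 words; "word " is 5 characters each.
page = "word " * 350
print(estimate_tokens(page))  # 437 by this heuristic, matching the 300-400/page rule
```

For exact counts, use your API provider's token-counting endpoint rather than a heuristic.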
Context Windows by Claude Model (2026)
Here are the current context window sizes for Claude's models:
Claude Opus 4.6
- Context window: 200,000 tokens (input)
- Extended context: Up to 1,000,000 tokens when the extended context feature is enabled
- Max output: 32,000 tokens per response
- Best for: Complex tasks with large codebases, long documents, and multi-step reasoning
Claude Sonnet 4.6
- Context window: 200,000 tokens (input)
- Max output: 16,000 tokens per response
- Best for: Most everyday tasks with a good balance of capability and speed
Claude Haiku 4.5
- Context window: 200,000 tokens (input)
- Max output: 8,000 tokens per response
- Best for: Quick tasks where speed matters more than depth
All three models share the same base 200K context window, but they differ in output length and how effectively they use that context.
What Happens When You Hit the Limit?
When your conversation approaches the context window limit, one of several things happens depending on the interface:
On claude.ai: Claude warns you that the conversation is getting long. You may need to start a new conversation. In some cases, older messages may be summarized to free up space.
On the API: The API returns an error if your input exceeds the context limit. You need to manage the conversation history yourself — trimming older messages or summarizing them.
In Claude Code: Claude Code manages context intelligently. It does not load your entire codebase into context at once. Instead, it reads files as needed, keeps relevant information, and lets go of files it no longer needs.
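For API use, "manage the conversation history yourself" typically means dropping or summarizing the oldest turns before each request. Here is a minimal sketch of the trimming approach, using the ~4 chars/token heuristic from above; the message shape and token budget are illustrative assumptions, and a production version would use the provider's real token counter.

```python
def trim_history(messages: list[dict], max_tokens: int = 150_000) -> list[dict]:
    """Drop the oldest messages until the estimated total fits the budget.

    Assumes messages look like {"role": ..., "content": "..."} and uses the
    ~4 characters/token heuristic; keeps the most recent turns intact.
    """
    def est(msg: dict) -> int:
        return len(msg["content"]) // 4

    trimmed = list(messages)
    while len(trimmed) > 1 and sum(est(m) for m in trimmed) > max_tokens:
        trimmed.pop(0)  # discard the oldest turn first
    return trimmed
```

A common refinement is to summarize the dropped turns into a single synthetic message instead of discarding them outright, trading a few tokens for continuity.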
Why Context Windows Matter in Practice
For Conversations
In a very long conversation, Claude has access to everything you discussed earlier. This is why Claude can reference something you said 50 messages ago. But if the conversation exceeds the context window, earlier messages fall out of view.
Practical impact: for long research sessions or extended coding sessions, starting a new conversation when the topic shifts helps keep Claude focused and within its context limits.
For Document Analysis
The 200K context window means Claude can process substantial documents in a single pass:
- A 100-page report
- A full legal contract
- Several chapters of a book
- A large codebase file
But if your document is extremely large (say, 500 pages), you may need to process it in sections.
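Processing a very large document in sections can be automated. The sketch below splits on paragraph boundaries so each chunk stays readable, again using the ~4 chars/token heuristic rather than a real tokenizer; the budget value is an assumption you would tune to leave room for instructions and the response.

```python
def chunk_text(text: str, max_tokens: int = 150_000,
               chars_per_token: int = 4) -> list[str]:
    """Split a long document into pieces that each fit a token budget.

    Splits on blank-line paragraph boundaries; uses the ~4 chars/token
    heuristic, so a single paragraph larger than the budget is kept whole.
    """
    budget = max_tokens * chars_per_token
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > budget:
            chunks.append(current)   # current chunk is full; start a new one
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, with the running summary of earlier chunks included if the sections depend on one another.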
For Coding
When using Claude Code, the context window determines how much of your codebase Claude can have in active memory at once. A 200K window holds roughly 50,000-80,000 lines of code — more than enough for most files, but not enough for an entire large project simultaneously.
Claude Code handles this by selectively reading files. It does not try to hold your entire project in memory. Instead, it reads the files relevant to your current task, works with them, and moves on. This is why Claude Code can work on projects of any size even though the context window is finite.
How to Work Effectively With Context Windows
1. Front-Load Important Information
Claude pays strong attention to the beginning and end of the context. If something is critical, make sure it appears early in the conversation or in the system prompt. Do not bury key requirements in the middle of a long message.
2. Be Concise With Inputs
Every token of input reduces the space available for Claude's response and for conversation history. If you are sharing a document for analysis, consider sharing only the relevant sections rather than the entire document.
3. Start Fresh When Needed
If a conversation has gone on for a long time and Claude seems to be losing track of earlier context, start a new conversation. Summarize the key points from the previous conversation in your first message.
4. Use Projects for Persistent Context
On claude.ai, Projects let you attach files and instructions that persist across conversations. This means you do not waste context window space re-explaining your project every time you start a new conversation.
5. Use CLAUDE.md for Coding
With Claude Code, a CLAUDE.md file provides persistent project context without consuming conversation tokens on repetitive instructions.
Tokens and Pricing
If you use Claude through the API, you pay per token — both for input (what you send to Claude) and output (what Claude generates). Understanding tokens helps you estimate costs:
- A simple question and answer might use 500 tokens total
- Analyzing a 10-page document might use 5,000-8,000 input tokens
- A long coding session might use 50,000-100,000 tokens over multiple turns
The context window size is the maximum — you are not charged for unused space, only for the tokens you actually use.
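The arithmetic is simple enough to sketch. The per-million-token prices below are placeholders for illustration only, not real rates; check the provider's current pricing page before relying on any estimate.

```python
# Hypothetical per-million-token prices for illustration only.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the assumed prices above."""
    return (input_tokens * PRICE_PER_MTOK["input"]
            + output_tokens * PRICE_PER_MTOK["output"]) / 1_000_000

# Analyzing a 10-page document: ~6,000 input tokens, ~1,000 output tokens
print(f"${estimate_cost(6_000, 1_000):.4f}")  # prints $0.0330
```

Note that output tokens are usually priced several times higher than input tokens, which is why concise prompts matter less for cost than capping verbose responses.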
Common Misconceptions
"Claude remembers everything forever"
Claude does not have persistent memory across conversations (unless you use Projects or the memory feature). Each conversation starts fresh. The context window only applies within a single conversation.
"Bigger context window means smarter"
The context window is about how much text Claude can process at once, not how intelligent it is. A model with a 200K context window is not inherently smarter than one with a 100K window; it simply handles more input at a time.
"I should always use the maximum context"
More context is not always better. Sending Claude 200K tokens when your question needs 2K can actually be counterproductive. Relevant, focused context produces better results than dumping everything in.
"Output tokens and input tokens are the same thing"
The context window is primarily about input. Output has its own separate limits (32K for Opus, 16K for Sonnet, 8K for Haiku). Your input and Claude's output together cannot exceed the context window, but the output limits are typically the binding constraint.
Practical Tips
- For document analysis: Break very large documents into logical sections and analyze each separately
- For coding: Let Claude Code manage file reading rather than pasting entire files into the conversation
- For long conversations: Periodically summarize the current state and start fresh
- For the API: Implement conversation history management that trims or summarizes older messages
Going Further
Understanding context windows helps you get better results from every interaction with Claude. For more practical tips on working with Claude effectively, check our complete guide, our prompt library for tested patterns, and our cheat sheet for quick reference.