What is an LLM context window?

It's the maximum amount of text (measured in tokens) a model can consider at once — your prompt, any documents you include, and the model's own reply all share it.

How many pages is a 1M-token context window?

Roughly 1,500 pages of text (≈750,000 words). That's enough for a large document set in a single prompt — but cost and accuracy considerations still apply.

Should I always pick the largest context window?

No. Larger windows cost more, and accuracy can drop on very long inputs. In many cases, retrieving only the relevant chunks (retrieval-augmented generation, RAG) works better than loading everything into context.

LLM Context Window Comparison — tokens to pages for every model

Gemini 2.5 Flash · Google · 1,048,576 tokens · ~1,573 pages

Gemini 2.5 Pro · Google · 1,048,576 tokens · ~1,573 pages

GPT-5 · OpenAI · 272,000 tokens · ~408 pages

Mistral Large · Mistral · 262,144 tokens · ~393 pages

Claude Haiku 4.5 · Anthropic · 200,000 tokens · ~300 pages

o3 · OpenAI · 200,000 tokens · ~300 pages

Claude Sonnet 4.5 · Anthropic · 200,000 tokens · ~300 pages

Claude Opus 4.5 · Anthropic · 200,000 tokens · ~300 pages

DeepSeek-V3 · DeepSeek · 131,072 tokens · ~197 pages

GPT-4o mini · OpenAI · 128,000 tokens · ~192 pages

Llama 3.3 70B (Groq) · Meta / Groq · 128,000 tokens · ~192 pages

GPT-4o · OpenAI · 128,000 tokens · ~192 pages

Largest here: Gemini 2.5 Flash (tied with Gemini 2.5 Pro) at 1,048,576 tokens, which is roughly 786,432 words, or about 1,573 pages of text in a single prompt.

Context = max input tokens. The human-readable estimates assume ~0.75 words per token and ~500 words per page. Updated 2026-05-22 (LiteLLM table).
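The page estimates above follow from simple arithmetic. As a minimal sketch, assuming the same rough ratios this page uses (~0.75 words per token, ~500 words per page), the conversion could look like this. The function names are illustrative and not part of any library.

```python
WORDS_PER_TOKEN = 0.75  # rough average assumed by this page
WORDS_PER_PAGE = 500    # page size assumed by this page


def tokens_to_words(tokens: int) -> int:
    """Estimate how many words a given number of tokens holds."""
    return round(tokens * WORDS_PER_TOKEN)


def tokens_to_pages(tokens: int) -> int:
    """Estimate how many ~500-word pages a given number of tokens holds."""
    return round(tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE)


# Reproduces the figures quoted for the largest window in the list.
print(tokens_to_words(1_048_576))  # 786432
print(tokens_to_pages(1_048_576))  # 1573
print(tokens_to_pages(200_000))    # 300
```

Real token-to-word ratios vary with language, formatting, and tokenizer, so treat the output as an order-of-magnitude guide.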

How to use it.

1. Read the bars in tokens and pages

Each bar shows a model's maximum input in tokens, with a plain-English page estimate (~500 words per page) so the number means something.

2. Match the window to your input

If you feed whole contracts or codebases, you need a large window (or a chunking strategy). For short prompts, a small window is fine and often cheaper.
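One way to check the fit is sketched below. It estimates tokens from a word count using the page's ~0.75 words-per-token ratio, lists which windows are large enough, and splits an oversized text into chunks. The window sizes are copied from the list on this page; the helper names are hypothetical, and a real tokenizer would give more accurate counts than this word-based guess.

```python
import math

# Context windows in tokens, copied from the comparison on this page.
CONTEXT_WINDOWS = {
    "Gemini 2.5 Pro": 1_048_576,
    "GPT-5": 272_000,
    "Claude Sonnet 4.5": 200_000,
    "GPT-4o": 128_000,
}

WORDS_PER_TOKEN = 0.75  # rough ratio assumed by this page


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (not a real tokenizer)."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def models_that_fit(token_count: int) -> list[str]:
    """Names of models whose window can hold token_count tokens of input."""
    return [name for name, window in CONTEXT_WINDOWS.items() if window >= token_count]


def chunk_by_tokens(text: str, max_tokens: int) -> list[str]:
    """Split text into word-based chunks of roughly max_tokens tokens each."""
    words = text.split()
    words_per_chunk = max(1, int(max_tokens * WORDS_PER_TOKEN))
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


document = "word " * 150_000          # stand-in for a ~300-page contract
needed = estimate_tokens(document)    # about 200,000 tokens
print(needed, models_that_fit(needed))
print(len(chunk_by_tokens(document, 128_000)))  # chunks needed for a 128k window
```

The chunker splits on word boundaries only. Splitting on paragraphs or sections usually keeps related text together, at the cost of less even chunk sizes.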

3. Remember output eats the window too

Context is shared between your input and the model's reply. A 200k window doesn't mean 200k of input if you also want a long answer.
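To make the trade-off concrete, here is a small sketch that subtracts the tokens reserved for the reply from the window to get the usable input budget. It assumes a model whose window is shared between input and output, as described above. Providers may define their limits differently, so check the documentation for the model you use. The function name is illustrative.

```python
def usable_input_tokens(context_window: int, reserved_output: int) -> int:
    """Tokens left for the prompt after reserving room for the reply.

    Assumes input and output share one window, as this step describes.
    """
    if context_window < 0 or reserved_output < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, context_window - reserved_output)


# A 200k window with 8k tokens reserved for the answer.
print(usable_input_tokens(200_000, 8_000))   # 192000
# Reserving a long 64k-token answer leaves noticeably less room.
print(usable_input_tokens(200_000, 64_000))  # 136000
```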

4. Bigger isn't always better

Very long contexts can degrade accuracy and raise cost. Often retrieval (RAG) beats stuffing everything into one giant prompt.
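The idea behind retrieval can be shown with a toy example: score each chunk against the question, then keep only the best-matching chunks that fit a token budget. This sketch uses plain word overlap as the score, which is an assumption made for illustration. Production RAG systems typically rank chunks with embeddings or a search index instead. All names here are hypothetical.

```python
import math
import re

WORDS_PER_TOKEN = 0.75  # rough ratio assumed by this page


def words(text: str) -> list[str]:
    """Lowercase alphanumeric words, with punctuation stripped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a word count (not a real tokenizer)."""
    return math.ceil(len(words(text)) / WORDS_PER_TOKEN)


def overlap_score(query: str, chunk: str) -> int:
    """Number of distinct query words that also appear in the chunk."""
    return len(set(words(query)) & set(words(chunk)))


def retrieve(query: str, chunks: list[str], token_budget: int) -> list[str]:
    """Pick the best-matching chunks whose estimated tokens fit the budget."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        if overlap_score(query, chunk) == 0:
            break  # remaining chunks share no words with the query
        cost = estimate_tokens(chunk)
        if used + cost <= token_budget:
            selected.append(chunk)
            used += cost
    return selected


chunks = [
    "The termination clause requires 30 days notice.",
    "Payment is due within 45 days of invoice.",
    "Either party may request termination in writing.",
]
print(retrieve("termination notice", chunks, token_budget=25))
```

With a budget of 25 estimated tokens, the two termination-related chunks are returned and the payment chunk is left out of the prompt.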

