Sortable · input/output per 1M tokens · refreshed from LiteLLM
A current, sortable table of input and output prices (per 1M tokens) and context windows across the major models — refreshed from the open-source LiteLLM table.
| Model | Provider | Input / 1M ↑ | Output / 1M | Context |
|---|---|---|---|---|
| GPT-4o mini (cheapest input) | OpenAI | $0.15 | $0.6 | 128k (~192 pp) |
| DeepSeek-V3 | DeepSeek | $0.28 | $0.42 | 131.072k (~197 pp) |
| Gemini 2.5 Flash | Google | $0.3 | $2.5 | 1,048.576k (~1573 pp) |
| Mistral Large | Mistral | $0.5 | $1.5 | 262.144k (~393 pp) |
| Llama 3.3 70B (Groq) | Meta / Groq | $0.59 | $0.79 | 128k (~192 pp) |
| Claude Haiku 4.5 | Anthropic | $1 | $5 | 200k (~300 pp) |
| GPT-5 | OpenAI | $1.25 | $10 | 272k (~408 pp) |
| Gemini 2.5 Pro | Google | $1.25 | $10 | 1,048.576k (~1573 pp) |
| o3 | OpenAI | $2 | $8 | 200k (~300 pp) |
| GPT-4o | OpenAI | $2.5 | $10 | 128k (~192 pp) |
| Claude Sonnet 4.5 | Anthropic | $3 | $15 | 200k (~300 pp) |
| Claude Opus 4.5 | Anthropic | $5 | $25 | 200k (~300 pp) |
Updated 2026-05-22 from the LiteLLM model table. USD per 1M tokens.
Click any column header to sort by input price, output price, or context window. Default sort is cheapest input first.
Among the models above, output tokens cost between about 1.3 and 8.3 times as much as input tokens. If your workload generates long replies, output price dominates your bill, so sort by that column.
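As a rough illustration of when output price dominates, the sketch below splits one request's cost into its input and output parts, using the GPT-5 row from the table ($1.25 input, $10 output per 1M tokens). The function name `cost_split` and the token counts are hypothetical, chosen only for illustration.

```python
def cost_split(input_tokens, output_tokens, input_price, output_price):
    """Return (input_cost, output_cost) in USD; prices are USD per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost, output_cost

# GPT-5 row: $1.25 input, $10 output per 1M tokens.
# Hypothetical request: 2,000 input tokens, 1,000 output tokens.
in_cost, out_cost = cost_split(2_000, 1_000, 1.25, 10.0)
output_share = out_cost / (in_cost + out_cost)
print(f"input ${in_cost:.4f}, output ${out_cost:.4f}, output share {output_share:.0%}")
```

Even with twice as many input tokens as output tokens, output accounts for about 80% of this request's cost.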
A cheap model with a small context window may force you to chunk documents, adding complexity. The context column (with a rough page estimate) helps you weigh that.
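The page estimates in the context column appear to follow roughly 1.5 pages per 1,000 tokens (128k tokens ≈ 192 pp, 200k ≈ 300 pp). That ratio is inferred from the table rather than stated anywhere, so treat it as an assumption. Under it, a minimal sketch of checking how many chunks a document would need for a given context window might look like this; the function names and the `reserved_tokens` margin are hypothetical.

```python
import math

PAGES_PER_1K_TOKENS = 1.5  # inferred from the table (128k -> ~192 pp); an assumption

def pages_for(tokens):
    """Rough page estimate for a token count."""
    return round(tokens / 1000 * PAGES_PER_1K_TOKENS)

def chunks_needed(doc_tokens, context_tokens, reserved_tokens=4_000):
    """Chunks required if each chunk must fit the window minus a reserve
    for the prompt and the reply (reserved_tokens is a hypothetical margin)."""
    usable = context_tokens - reserved_tokens
    if usable <= 0:
        raise ValueError("reserve exceeds the context window")
    return math.ceil(doc_tokens / usable)

print(pages_for(128_000))                 # 192
print(chunks_needed(500_000, 128_000))    # 5
print(chunks_needed(500_000, 1_048_576))  # 1
```

A 500,000-token document would need five passes through a 128k-window model under these assumptions, but fits in a single call to a model with a 1,048,576-token window.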
Pricing tells you the rate; the cost calculator turns it into your actual monthly bill for a given workload.
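The calculator itself is not shown here, but the arithmetic behind a monthly estimate can be sketched as follows. The prices are copied from three rows of the table; the workload (requests per day, tokens per request, a 30-day month) and the function name `monthly_cost` are hypothetical.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (1.25, 10.00),
}

def monthly_cost(model, requests_per_day, input_tokens, output_tokens, days=30):
    """Estimated monthly bill in USD for a uniform workload."""
    input_price, output_price = PRICES[model]
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return per_request * requests_per_day * days

# Hypothetical workload: 1,000 requests/day, 1,500 input and 500 output tokens each.
for model in sorted(PRICES, key=lambda m: monthly_cost(m, 1_000, 1_500, 500)):
    print(f"{model}: ${monthly_cost(model, 1_000, 1_500, 500):,.2f}/month")
```

For this workload the estimate comes to roughly $15.75 per month on GPT-4o mini, $120 on Claude Haiku 4.5, and $206.25 on GPT-5. Real bills can differ because of caching discounts, batch pricing, or uneven traffic, none of which this sketch models.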
A free tool gives you a hypothesis. The 30-minute diagnostic is where we pressure-test it against your actual workflows — and decide whether the project is worth building, buying, or skipping.