LLM Pricing Comparison 2026

LLM pricing, compared — and actually up to date.

Sortable · input/output per 1M tokens · refreshed from LiteLLM

A current, sortable table of input and output prices (per 1M tokens) and context windows across the major models — refreshed from the open-source LiteLLM table.

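| Model | Provider | Input / 1M (USD) | Output / 1M (USD) | Context (tokens, approx. pages) |
|---|---|---|---|---|
| GPT-4o mini (cheapest input) | OpenAI | $0.15 | $0.60 | 128,000 (~192 pp) |
| DeepSeek-V3 | DeepSeek | $0.28 | $0.42 | 131,072 (~197 pp) |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | 1,048,576 (~1573 pp) |
| Mistral Large | Mistral | $0.50 | $1.50 | 262,144 (~393 pp) |
| Llama 3.3 70B (Groq) | Meta / Groq | $0.59 | $0.79 | 128,000 (~192 pp) |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200,000 (~300 pp) |
| GPT-5 | OpenAI | $1.25 | $10.00 | 272,000 (~408 pp) |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1,048,576 (~1573 pp) |
| o3 | OpenAI | $2.00 | $8.00 | 200,000 (~300 pp) |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128,000 (~192 pp) |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 200,000 (~300 pp) |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 200,000 (~300 pp) |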

Updated 2026-05-22 from the LiteLLM model table. USD per 1M tokens. Click a column to sort.

How to use it.

1. Sort by what matters to you

Click any column header to sort by input price, output price, or context window. Default sort is cheapest input first.
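If you would rather work with the numbers outside the page, the same sorting can be sketched in a few lines of Python. The `ModelPrice` type and `sort_rows` helper are illustrative names, not part of the tool; the rows are copied from the table above.

```python
from typing import NamedTuple


class ModelPrice(NamedTuple):
    model: str
    input_per_1m: float    # USD per 1M input tokens
    output_per_1m: float   # USD per 1M output tokens
    context_tokens: int    # context window in tokens


# A few rows copied from the table above.
ROWS = [
    ModelPrice("GPT-4o mini", 0.15, 0.60, 128_000),
    ModelPrice("DeepSeek-V3", 0.28, 0.42, 131_072),
    ModelPrice("Gemini 2.5 Flash", 0.30, 2.50, 1_048_576),
    ModelPrice("Claude Haiku 4.5", 1.00, 5.00, 200_000),
]


def sort_rows(rows, column, descending=False):
    """Return a new list of rows sorted by the named column."""
    return sorted(rows, key=lambda r: getattr(r, column), reverse=descending)


by_output = sort_rows(ROWS, "output_per_1m")
largest_context = sort_rows(ROWS, "context_tokens", descending=True)

print([r.model for r in by_output])   # cheapest output first
print(largest_context[0].model)       # largest context window
```

Among these four rows, sorting by output price puts DeepSeek-V3 first even though GPT-4o mini has the lowest input price, which is why the choice of column matters.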

2. Watch input vs output prices

Output tokens usually cost more than input tokens, and in most rows of the table several times more. If your workload generates long replies, output price dominates your bill, so sort by it.
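To see how quickly output price takes over, here is a minimal sketch that computes the share of one request's cost that comes from output tokens. The function name and the 1,000-in / 1,000-out request are assumptions chosen for illustration; the rates are the Claude Sonnet 4.5 row from the table.

```python
def output_share(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Fraction of a single request's cost attributable to output tokens."""
    input_cost = input_tokens / 1_000_000 * input_per_1m
    output_cost = output_tokens / 1_000_000 * output_per_1m
    total = input_cost + output_cost
    if total == 0:
        raise ValueError("request has no billable tokens")
    return output_cost / total


# Claude Sonnet 4.5 rates from the table: $3 input, $15 output per 1M tokens.
# Hypothetical request: 1,000 tokens in, 1,000 tokens out.
share = output_share(1_000, 1_000, 3.0, 15.0)
print(f"{share:.0%}")  # 83%
```

With equal token counts on both sides, output already accounts for roughly five sixths of the cost at these rates.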

3. Factor in context window

A cheap model with a small context window may force you to chunk documents, adding complexity. The context column (with a rough page estimate) helps you weigh that.
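As a rough sketch of that trade-off, the snippet below reproduces the page estimate and counts how many chunks a long document would need. The 1.5 pages per 1,000 tokens factor is inferred from the table's own figures (128k tokens shown as ~192 pp), not a documented constant, and `reserved_tokens` is a hypothetical allowance for instructions and the reply. Chunks are assumed not to overlap.

```python
import math

# Inferred from the table: 128,000 tokens is shown as ~192 pages.
PAGES_PER_1K_TOKENS = 1.5


def estimated_pages(context_tokens):
    """Rough page equivalent of a context window."""
    return round(context_tokens / 1000 * PAGES_PER_1K_TOKENS)


def chunks_needed(document_tokens, context_tokens, reserved_tokens=4_000):
    """Number of non-overlapping chunks needed to fit a document,
    leaving `reserved_tokens` free in every call (hypothetical allowance)."""
    usable = context_tokens - reserved_tokens
    if usable <= 0:
        raise ValueError("reserved_tokens leaves no room for the document")
    return math.ceil(document_tokens / usable)


print(estimated_pages(131_072))             # 197
print(chunks_needed(500_000, 128_000))      # 5
print(chunks_needed(500_000, 1_048_576))    # 1
```

A 500,000-token document fits in a single call on a 1M-token model but needs five calls on a 128k model, each of which also repeats the instructions.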

4. Turn it into your number

Pricing tells you the rate; the cost calculator turns it into your actual monthly bill for a given workload.
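The arithmetic behind that step is simple enough to sketch. This is not the calculator's actual implementation; `monthly_cost`, the 30-day month, and the workload figures are assumptions for illustration, while the rates come from the table.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_per_1m, output_per_1m, days=30):
    """Estimated monthly spend in USD for a fixed per-request token profile."""
    per_request = (input_tokens * input_per_1m
                   + output_tokens * output_per_1m) / 1_000_000
    return per_request * requests_per_day * days


# Hypothetical workload: 10,000 requests/day, 2,000 input and 500 output tokens each.
workload = dict(requests_per_day=10_000, input_tokens=2_000, output_tokens=500)

mini = monthly_cost(**workload, input_per_1m=0.15, output_per_1m=0.60)    # GPT-4o mini
gpt4o = monthly_cost(**workload, input_per_1m=2.50, output_per_1m=10.00)  # GPT-4o

print(f"GPT-4o mini: ${mini:,.2f}")   # GPT-4o mini: $180.00
print(f"GPT-4o:      ${gpt4o:,.2f}")  # GPT-4o:      $3,000.00
```

The same workload differs by more than an order of magnitude between the two rows, which is the kind of gap the rate table alone does not make obvious.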

Frequently asked questions.

Which LLM is cheapest in 2026?
For raw input price, small models lead: GPT-4o mini ($0.15 per 1M tokens), DeepSeek-V3 ($0.28), and Gemini 2.5 Flash ($0.30). DeepSeek-V3 has the lowest output price ($0.42). But 'cheapest' depends on your input/output ratio and on which model is capable enough for the task, so sort the table by the dimension that drives your cost.
Why does output cost more than input?
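Generating tokens is more compute-intensive than reading them: a prompt can be processed in parallel, while each output token is produced one step at a time. Providers price output accordingly. In the table above, most models charge roughly 3–8× more for output than for input, though DeepSeek-V3 (1.5×) and Llama 3.3 70B on Groq (about 1.3×) sit much closer to parity. Workloads with long outputs (summaries, drafts) are dominated by output cost.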
How often is this updated?
It's snapshotted from the open-source LiteLLM model table and refreshed regularly, so it tracks new model releases and price changes faster than most static comparison posts.
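For readers who want to do their own refresh: LiteLLM publishes its model table as a JSON file of per-token prices, so a row in this table is a unit conversion away. The sketch below assumes the field names `input_cost_per_token`, `output_cost_per_token`, and `max_input_tokens` as they appear in that file at the time of writing. `to_table_row` is a hypothetical helper, and `sample_entry` is a hand-written entry that mirrors the GPT-4o mini row, not a live download.

```python
def to_table_row(model_name, entry):
    """Convert one LiteLLM-style entry (USD per token) into USD per 1M tokens."""
    return {
        "model": model_name,
        "input_per_1m": round(entry["input_cost_per_token"] * 1_000_000, 4),
        "output_per_1m": round(entry["output_cost_per_token"] * 1_000_000, 4),
        # Fall back to max_tokens when max_input_tokens is absent (assumption).
        "context_tokens": entry.get("max_input_tokens") or entry.get("max_tokens"),
    }


# Hand-written sample mirroring the GPT-4o mini row above; not fetched data.
sample_entry = {
    "input_cost_per_token": 1.5e-07,
    "output_cost_per_token": 6e-07,
    "max_input_tokens": 128000,
}

row = to_table_row("gpt-4o-mini", sample_entry)
print(row)
```

Rounding to four decimals keeps floating-point noise out of the displayed prices; entries without per-token prices (some non-chat models) would need to be filtered out before calling the helper.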

Numbers looking promising?

A free tool gives you a hypothesis. The 30-minute diagnostic is where we pressure-test it against your actual workflows — and decide whether the project is worth building, buying, or skipping.