How are LLM API costs calculated?

You pay per token, separately for input (your prompt) and output (the model's reply). Cost = (input tokens ÷ 1M × input price) + (output tokens ÷ 1M × output price). Output tokens are usually 3–5× more expensive than input.
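As a minimal sketch, the formula above can be written as a small Python function. The function name, parameter names, and the example prices are assumptions made for illustration; they are not taken from any provider's price list.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one call; both prices are USD per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical prices: $1.00 per 1M input tokens, $4.00 per 1M output tokens.
cost = call_cost(input_tokens=2_000, output_tokens=500,
                 input_price_per_m=1.00, output_price_per_m=4.00)
print(f"${cost:.4f} per call")
```

With these made-up prices the 500 output tokens cost as much as the 2,000 input tokens, which is why long replies tend to dominate the bill.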

Why is one model so much cheaper than another for the same task?

Per-token prices vary by over 50× between flagship and small models. For high-volume, simple tasks a small model can do the job at a fraction of the cost — which is why comparing before you commit matters.

Are these prices current?

They're snapshotted from the open-source LiteLLM model table and refreshed regularly. Always confirm against the provider's own pricing page before making billing commitments.

LLM API Cost Calculator — compare GPT, Claude & Gemini pricing

Inputs: input tokens per call, output tokens per call, calls per month.

The cheapest model (GPT-4o mini) is 37× cheaper than the dearest (Claude Opus 4.5) for this workload.

Model	Provider	Per call	Per month
GPT-4o mini (cheapest)	OpenAI	$0.270	$0.270
DeepSeek-V3	DeepSeek	$0.364	$0.364
Llama 3.3 70B (Groq)	Meta / Groq	$0.748	$0.748
Gemini 2.5 Flash	Google	$0.800	$0.800
Mistral Large	Mistral	$0.800	$0.800
Claude Haiku 4.5	Anthropic	$2.00	$2.00
GPT-5	OpenAI	$3.25	$3.25
Gemini 2.5 Pro	Google	$3.25	$3.25
o3	OpenAI	$3.60	$3.60
GPT-4o	OpenAI	$4.50	$4.50
Claude Sonnet 4.5	Anthropic	$6.00	$6.00
Claude Opus 4.5	Anthropic	$10.00	$10.00

Costs are in USD, computed from per-1M-token prices in the LiteLLM model table (updated 2026-05-22). These are estimates only; verify against provider pricing before making billing decisions.

How to use it.

1. Estimate your tokens per call

Input tokens are your prompt + any context you send; output tokens are the model's reply. Rough rule: 1,000 tokens ≈ 750 words. Use the token counter if you want an exact figure from real text.
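The rough rule above can be turned into a quick estimator. This is only a sketch of the 1,000 tokens ≈ 750 words approximation; real tokenizers differ by model and by language, and the word counts below are hypothetical.

```python
def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from a word count (1,000 tokens per 750 words)."""
    # tokens ≈ words * 1000 / 750 = words * 4 / 3, rounded up with integer math.
    return (word_count * 4 + 2) // 3


prompt_words = 1_500   # hypothetical prompt length in words
reply_words = 300      # hypothetical reply length in words
print(estimate_tokens(prompt_words), estimate_tokens(reply_words))
```

Treat the result as a starting point for the calculator inputs, not as a measured count.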

2. Enter your monthly call volume

Enter how many times the workflow runs per month. The table multiplies the per-call cost by this number to show your estimated monthly bill.
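The multiplication is simple enough to check by hand. A short sketch follows, in which the per-call cost and the call volume are hypothetical values chosen for the example.

```python
def monthly_cost(per_call_cost: float, calls_per_month: int) -> float:
    """Estimated monthly cost in USD: per-call cost times monthly call volume."""
    return per_call_cost * calls_per_month


per_call = 0.004   # hypothetical per-call cost in USD
calls = 50_000     # hypothetical number of calls per month
print(f"${monthly_cost(per_call, calls):,.2f} per month")
```

With a volume of 1, the monthly figure equals the per-call figure.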

3. Compare across models

The table prices the identical workload on every model and sorts cheapest first. The cheapest-vs-dearest ratio at the top is usually the surprise — often 8–50× for the same task.
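A comparison of this kind can be sketched as follows. The model names and prices are placeholders invented for the example, not real provider prices, and the function is an assumption about how such a table could be built rather than the calculator's actual code.

```python
# Hypothetical models and prices in USD per 1M tokens: (input, output).
PRICES = {
    "small-model": (0.20, 0.80),
    "mid-model": (1.00, 5.00),
    "flagship-model": (5.00, 25.00),
}


def rank_models(input_tokens, output_tokens, calls_per_month, prices=PRICES):
    """Return (model, monthly_cost) pairs for one workload, cheapest first."""
    rows = []
    for model, (in_price, out_price) in prices.items():
        per_call = (input_tokens / 1_000_000 * in_price
                    + output_tokens / 1_000_000 * out_price)
        rows.append((model, per_call * calls_per_month))
    return sorted(rows, key=lambda row: row[1])


ranked = rank_models(input_tokens=2_000, output_tokens=500, calls_per_month=10_000)
for model, cost in ranked:
    print(f"{model:<16} ${cost:,.2f}")

# Ratio between the most and least expensive option for the same workload.
cheapest, dearest = ranked[0], ranked[-1]
ratio = dearest[1] / cheapest[1]
print(f"{cheapest[0]} is {ratio:.0f}x cheaper than {dearest[0]}")
```

Because every model is priced on the same token counts and call volume, the ratio reflects price differences only.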

4. Don't just pick the cheapest

Match the model to the job: cheap models are fine for classification and extraction; reasoning-heavy tasks may justify a pricier one. The calculator shows cost; capability is the other half of the decision.
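One way to combine both halves of the decision is to pick the cheapest model that clears a capability bar for the task. The sketch below assumes a made-up capability tier per model and a made-up minimum tier per task; neither comes from the calculator or from any provider.

```python
# Hypothetical catalogue: monthly cost for one workload plus an invented
# capability tier (1 = basic, 3 = strongest).
CANDIDATES = [
    {"name": "small-model", "monthly_cost": 8.0, "tier": 1},
    {"name": "mid-model", "monthly_cost": 45.0, "tier": 2},
    {"name": "flagship-model", "monthly_cost": 225.0, "tier": 3},
]

# Assumed minimum tier each kind of task needs.
MIN_TIER = {"classification": 1, "extraction": 1, "multi-step reasoning": 3}


def pick_model(task: str, candidates=CANDIDATES) -> str:
    """Cheapest candidate whose tier meets the task's minimum tier."""
    capable = [c for c in candidates if c["tier"] >= MIN_TIER[task]]
    return min(capable, key=lambda c: c["monthly_cost"])["name"]


print(pick_model("classification"))
print(pick_model("multi-step reasoning"))
```

In practice the tier would come from your own evaluation on sample inputs, since cost figures alone say nothing about output quality.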

More free AI tools.

LLM Token Counter & Cost

Paste a prompt or document to count its tokens live, then see what it costs as input on every major model — and whether it even fits each model's context window.

LLM Pricing Comparison 2026

A current, sortable table of input and output prices (per 1M tokens) and context windows across the major models — refreshed from the open-source LiteLLM table.

Context Window Comparison

Compare every major model's context window — in tokens and in plain pages of text — so you can see what actually fits in a single prompt.

What will this LLM workload actually cost?