LLM API Cost Calculator

What will this LLM workload actually cost?

Snapshot pricing for 12 models · sorted cheapest first

Enter your tokens per call and monthly call volume to see what the same workload costs on each of the 12 listed models, sorted cheapest first.

For this workload, the dearest model (Claude Opus 4.5) costs about 37× as much as the cheapest (GPT-4o mini).

Model                 Provider     Per call  Per month
GPT-4o mini           OpenAI       $0.270    $0.270     (cheapest)
DeepSeek-V3           DeepSeek     $0.364    $0.364
Llama 3.3 70B (Groq)  Meta / Groq  $0.748    $0.748
Gemini 2.5 Flash      Google       $0.800    $0.800
Mistral Large         Mistral      $0.800    $0.800
Claude Haiku 4.5      Anthropic    $2        $2
GPT-5                 OpenAI       $3.25     $3.25
Gemini 2.5 Pro        Google       $3.25     $3.25
o3                    OpenAI       $3.6      $3.6
GPT-4o                OpenAI       $4.5      $4.5
Claude Sonnet 4.5     Anthropic    $6        $6
Claude Opus 4.5       Anthropic    $10       $10

Costs are computed from per-1M-token prices in USD, taken from the LiteLLM model table (updated 2026-05-22). Estimates only: verify against provider pricing before making billing decisions.

How to use it.

1. Estimate your tokens per call

Input tokens are your prompt plus any context you send; output tokens are the model's reply. Rough rule for English text: 1,000 tokens ≈ 750 words. Use the token counter if you want an exact figure from real text.
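
As a rough illustration of the 750-words-per-1,000-tokens rule, the sketch below estimates a token count from a word count. The function name `estimate_tokens` is hypothetical, and the ratio only approximates English prose; a real tokenizer will give different numbers, especially for code or non-English text.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the text: 1,000 tokens ≈ 750 words


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` from its whitespace-separated word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


prompt = "word " * 750          # stand-in for a 750-word prompt
print(estimate_tokens(prompt))  # 1000
```

Rounding up keeps the estimate slightly conservative, which is usually the safer direction when budgeting.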

2. Enter your monthly call volume

Enter how many times the workflow runs per month. The table multiplies the per-call cost by this number to estimate your monthly bill.

3. Compare across models

The table prices the identical workload on every model and sorts cheapest first. The cheapest-vs-dearest ratio at the top is usually the surprise — often 8–50× for the same task.
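
The comparison the table performs can be sketched in a few lines. The model names and per-1M-token prices below are illustrative placeholders, not the calculator's data, and `monthly_cost` is a hypothetical helper; substitute current prices from each provider before relying on the output.

```python
# Illustrative placeholder prices in USD per 1M tokens: (input, output).
PRICES = {
    "small-model": (0.15, 0.60),
    "mid-model": (2.50, 10.00),
    "large-model": (5.00, 25.00),
}


def monthly_cost(input_tokens, output_tokens, calls_per_month, input_price, output_price):
    """Per-call cost from per-1M-token prices, multiplied by monthly call volume."""
    per_call = input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price
    return per_call * calls_per_month


# Assumed workload: 1,000 input tokens, 200 output tokens, 1,000 calls per month.
costs = {
    name: monthly_cost(1_000, 200, 1_000, in_price, out_price)
    for name, (in_price, out_price) in PRICES.items()
}

# Sort cheapest first, as the table does.
ranked = sorted(costs.items(), key=lambda item: item[1])
for name, cost in ranked:
    print(f"{name:<12} ${cost:,.2f}/month")

# Cheapest-vs-dearest ratio for this workload.
ratio = ranked[-1][1] / ranked[0][1]
print(f"Dearest is {ratio:.0f}x the cheapest")
```

Because the ratio depends on the mix of input and output tokens, changing the workload can change both the ordering and the spread.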

4. Don't just pick the cheapest

Match the model to the job: cheap models are fine for classification and extraction; reasoning-heavy tasks may justify a pricier one. The calculator shows cost; capability is the other half of the decision.

Frequently asked questions.

How are LLM API costs calculated?
You pay per token, separately for input (your prompt) and output (the model's reply). Cost = (input tokens ÷ 1M × input price) + (output tokens ÷ 1M × output price). Output tokens usually cost 3–5× as much as input tokens, and on some models the gap is wider.
Why is one model so much cheaper than another for the same task?
Per-token prices vary by over 50× between flagship and small models. For high-volume, simple tasks a small model can do the job at a fraction of the cost — which is why comparing before you commit matters.
Are these prices current?
They're snapshotted from the open-source LiteLLM model table and refreshed regularly. Always confirm against the provider's own pricing page before making billing commitments.
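
The per-token formula from the first answer above can be checked with a short worked example. The function name `call_cost` and the prices used here are assumptions chosen for illustration, not figures from any provider.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price: float, output_price: float) -> float:
    """Cost of one call in USD; prices are USD per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Assumed example: 2,000 input tokens, 500 output tokens,
# at hypothetical prices of $3 (input) and $15 (output) per 1M tokens.
cost = call_cost(2_000, 500, input_price=3.0, output_price=15.0)
print(f"${cost:.4f} per call")  # $0.0135 per call
```

In this example the reply is a quarter the length of the prompt yet accounts for more than half of the cost ($0.0075 of $0.0135), which is why long outputs deserve attention when estimating.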

Numbers looking promising?

A free tool gives you a hypothesis. The 30-minute diagnostic is where we pressure-test it against your actual workflows — and decide whether the project is worth building, buying, or skipping.