Live pricing for 12 flagship models · sorted cheapest first
Enter your tokens per call and monthly volume, and see what the same workload costs across every major model — sorted cheapest first.
Cheapest (GPT-4o mini) is 37× cheaper than dearest (Claude Opus 4.5) for this workload.
| Model | Provider | Per call | Per month |
|---|---|---|---|
| GPT-4o mini | OpenAI | $0.270 | $0.270 |
| DeepSeek-V3 | DeepSeek | $0.364 | $0.364 |
| Llama 3.3 70B (Groq) | Meta / Groq | $0.748 | $0.748 |
| Gemini 2.5 Flash | Google | $0.800 | $0.800 |
| Mistral Large | Mistral | $0.800 | $0.800 |
| Claude Haiku 4.5 | Anthropic | $2 | $2 |
| GPT-5 | OpenAI | $3.25 | $3.25 |
| Gemini 2.5 Pro | Google | $3.25 | $3.25 |
| o3 | OpenAI | $3.6 | $3.6 |
| GPT-4o | OpenAI | $4.5 | $4.5 |
| Claude Sonnet 4.5 | Anthropic | $6 | $6 |
| Claude Opus 4.5 | Anthropic | $10 | $10 |
Underlying rates are priced per 1M tokens, in USD, from the LiteLLM model table (updated 2026-05-22). These are estimates only; verify against provider pricing before making billing decisions.
Input tokens are your prompt + any context you send; output tokens are the model's reply. Rough rule: 1,000 tokens ≈ 750 words. Use the token counter if you want an exact figure from real text.
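As a rough illustration of the 1,000 tokens ≈ 750 words rule, the sketch below estimates a token count from whitespace-separated words. The function name `estimate_tokens` is hypothetical, and the result is only an approximation; real tokenizers vary by model and language.

```python
def estimate_tokens(text: str) -> int:
    """Estimate tokens from a word count using the rough 1,000 tokens per 750 words rule.

    This is an approximation only, not a real tokenizer.
    """
    word_count = len(text.split())
    # 1,000 tokens per 750 words, rounded to the nearest whole token
    return round(word_count * 1000 / 750)


# A 750-word prompt comes out at roughly 1,000 tokens under this rule
sample_prompt = "word " * 750
print(estimate_tokens(sample_prompt))
```

For billing-sensitive work, an exact count from the provider's own tokenizer is the safer figure.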
Monthly volume is how many times the workflow runs per month. The table multiplies the per-call cost by this number to estimate your monthly bill.
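The arithmetic behind a row can be sketched as follows, assuming rates are quoted per 1M tokens with separate input and output prices. The function names and the example rates ($3 input, $15 output per 1M tokens) are illustrative assumptions, not figures taken from the calculator's pricing table.

```python
def cost_per_call(input_tokens: int, output_tokens: int,
                  input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost of one call in USD, given rates quoted per 1M tokens."""
    input_cost = input_tokens * input_rate_per_m / 1_000_000
    output_cost = output_tokens * output_rate_per_m / 1_000_000
    return input_cost + output_cost


def cost_per_month(per_call: float, calls_per_month: int) -> float:
    """Monthly cost is the per-call cost multiplied by monthly volume."""
    return per_call * calls_per_month


# Illustrative workload: 1,000 input tokens, 200 output tokens, 1,000 calls per month
per_call = cost_per_call(1_000, 200, input_rate_per_m=3.0, output_rate_per_m=15.0)
monthly = cost_per_month(per_call, calls_per_month=1_000)
print(f"per call: ${per_call:.4f}, per month: ${monthly:.2f}")
```

Because output tokens are often priced several times higher than input tokens, a workload with long replies can cost much more than its prompt size alone suggests.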
The table prices the identical workload on every model and sorts cheapest first. The cheapest-vs-dearest ratio at the top is usually the surprise — often 8–50× for the same task.
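The sorting and the cheapest-vs-dearest ratio can be sketched like this. The monthly figures are a subset of the table above; the function name `rank_models` is hypothetical.

```python
def rank_models(monthly_costs: dict[str, float]) -> tuple[list[tuple[str, float]], float]:
    """Sort models cheapest first and return the dearest/cheapest cost ratio."""
    ranked = sorted(monthly_costs.items(), key=lambda item: item[1])
    cheapest_cost = ranked[0][1]
    dearest_cost = ranked[-1][1]
    return ranked, dearest_cost / cheapest_cost


# Monthly costs in USD for the same workload, taken from the table
costs = {
    "Claude Opus 4.5": 10.0,
    "GPT-4o": 4.5,
    "GPT-4o mini": 0.270,
    "Claude Haiku 4.5": 2.0,
}

ranked, ratio = rank_models(costs)
for model, cost in ranked:
    print(f"{model}: ${cost:.2f}")
print(f"Cheapest is about {ratio:.0f}x cheaper than dearest")
```

The ratio depends only on the relative costs, so it stays the same whatever monthly volume is entered, as long as every model is priced on the identical workload.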
Match the model to the job: cheap models are fine for classification and extraction; reasoning-heavy tasks may justify a pricier one. The calculator shows cost; capability is the other half of the decision.
A free tool gives you a hypothesis. The 30-minute diagnostic is where we pressure-test it against your actual workflows — and decide whether the project is worth building, buying, or skipping.