LLM API Cost Calculator — Compare GPT, Claude, Gemini
Quick answer: Enter input/output tokens per call and daily call volume. Pick which models to compare. See per-call, daily, monthly, and annual cost for each.
Compare LLM API costs across all major models — GPT-4o, GPT-5, Claude Opus 4, Sonnet 4.5, Haiku 4.5, Gemini 2 Pro, and Flash. Enter your tokens per call and call volume to see the daily, monthly, and annual cost on each model side-by-side. Pricing is current as of 2025.
Last reviewed: April 2026
Models to compare
Cheapest model: gemini-2-flash
At your usage, gemini-2-flash costs $7.98/mo. The most expensive is claude-opus-4 at $1,824.00/mo.
| Model | Per Call (USD) | Daily (USD) | Monthly (USD) | Annual (USD) |
|---|---|---|---|---|
| gpt-5 | $0.0069 | $6.88 | $209.00 | $2,509.38 |
| gpt-4o | $0.0088 | $8.75 | $266.00 | $3,193.75 |
| gpt-4o-mini | $0.0005 | $0.53 | $15.96 | $191.63 |
| claude-opus-4 | $0.06 | $60.00 | $1,824.00 | $21,900.00 |
| claude-sonnet-4-5 | $0.01 | $12.00 | $364.80 | $4,380.00 |
| claude-haiku-4-5 | $0.0032 | $3.20 | $97.28 | $1,168.00 |
| gemini-2-pro | $0.0044 | $4.38 | $133.00 | $1,596.88 |
| gemini-2-flash | $0.0003 | $0.26 | $7.98 | $95.81 |
Monthly Cost by Model
[Bar chart: monthly cost per model, sorted cheapest to most expensive. Cheaper models can be 10-100x less than top-tier ones.]
How to Use This LLM API Cost Calculator
1. Enter average input tokens per call (typical: 500-3000).
2. Enter average output tokens per call (typical: 200-1500).
3. Enter calls per day (your expected production volume).
4. Select which models to compare.
5. Read per-call, daily, monthly, and annual cost for each model.
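The arithmetic behind the steps above is straightforward: cost per call is input and output tokens multiplied by each provider's per-million-token price, and daily volume scales that up. A minimal sketch in Python — the per-million-token prices in `PRICES` are illustrative placeholders, not current list prices, so check each provider's pricing page before relying on the numbers:

```python
# Sketch of the calculator's arithmetic. Prices are $ per 1M tokens and are
# ASSUMED values for illustration -- verify against provider pricing pages.
PRICES = {  # model: (input $/1M tokens, output $/1M tokens)
    "claude-opus-4": (15.00, 75.00),
    "gemini-2-flash": (0.10, 0.40),
}

def cost_per_call(model, input_tokens, output_tokens):
    """Cost in USD for a single API call."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

def projections(model, input_tokens, output_tokens, calls_per_day):
    """Per-call, daily, monthly (~30.4 days), and annual cost in USD."""
    per_call = cost_per_call(model, input_tokens, output_tokens)
    daily = per_call * calls_per_day
    return {
        "per_call": per_call,
        "daily": daily,
        "monthly": daily * 30.4,
        "annual": daily * 365,
    }

# Example: 2,000 input / 400 output tokens at 1,000 calls per day.
print(projections("claude-opus-4", 2000, 400, 1000))
```

With these assumed prices, 2,000 input and 400 output tokens at 1,000 calls/day yields $0.06 per call and $60/day, consistent with the claude-opus-4 row in the table above.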
Frequently Asked Questions
- **What is a token?** A token is roughly 3-4 characters of text, or about 0.75 of a word in English. "Hello world" is ~2 tokens; the exact split depends on the model's tokenizer.
- **Why does output cost more than input?** Generating output requires the model to run a full forward pass for each token it produces, while input tokens are processed in a single pass. Typical ratio: output costs 2-5x more per token than input.
- **Which model is cheapest for routine tasks?** For summarization, classification, and simple chat, Haiku 4.5 or Gemini Flash are 10-20x cheaper than top-tier models with surprisingly competitive quality. For complex reasoning, Opus or GPT-5 still pull ahead.
- **Does prompt caching change these numbers?** Anthropic, OpenAI, and Google all offer prompt caching that reduces input cost by 50-90% for repeated prompt prefixes. If you're sending the same system prompt thousands of times, this can cut bills by 80%+. The calculator does NOT factor in caching; assume roughly 50% input-cost savings if you use it.
- **Is self-hosting cheaper than the API?** At low volumes, APIs are almost always cheaper than self-hosting. Break-even for self-hosting (e.g., Llama 3.1 70B on H100 GPUs) is roughly 100M+ tokens per day; below that, APIs win on both cost and operational simplicity.
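Since the calculator does not factor in prompt caching, the discount can be folded into the same per-call arithmetic by hand. A minimal sketch — the $15/$75 prices, the 1,500-token cached prefix, and the conservative 50% discount are all illustrative assumptions, not provider-specific figures:

```python
def cost_with_caching(input_tokens, output_tokens, in_price, out_price,
                      cached_prefix_tokens, cache_discount=0.5):
    """Per-call cost in USD with a discount applied to a cached prefix.

    Prices are $ per 1M tokens. cache_discount=0.5 models the conservative
    50% savings mentioned above; real discounts vary by provider (50-90%).
    """
    uncached = input_tokens - cached_prefix_tokens
    input_cost = (uncached * in_price
                  + cached_prefix_tokens * in_price * (1 - cache_discount))
    return (input_cost + output_tokens * out_price) / 1_000_000

# 2,000 input / 400 output tokens, with a 1,500-token system prompt
# served from cache (illustrative prices, not current list prices):
full = cost_with_caching(2000, 400, 15.0, 75.0, cached_prefix_tokens=0)
cached = cost_with_caching(2000, 400, 15.0, 75.0, cached_prefix_tokens=1500)
print(full, cached)
```

Note that because output tokens dominate the bill at these assumed prices, halving the cost of a 1,500-token cached prefix trims the per-call total by under 20% here; caching pays off most when long, repeated prompts dwarf the output.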
Embed this calculator on your website:

```html
<iframe src="https://calqpro.com/calculators/llm-api-cost-calculator" width="100%" height="600" frameborder="0" title="CalQpro Calculator" loading="lazy"></iframe>
<p>Powered by <a href="https://calqpro.com">CalQpro</a></p>
```