Free tool

LLM API cost calculator

Tell us how much you send and receive. We will estimate your monthly cost across every model, ranked from cheapest to most expensive.

Input tokens per request

Prompt + system message + context. One token is roughly ¾ of an English word.

Output tokens per request

What the model generates back.

Requests per month

Assume prompt caching (input cached)

Reuses repeated context at the cached input rate where a provider offers it.

List prices per 1M tokens, verified 2026-06-26. Estimates exclude image/audio tokens, batch discounts and free tiers.

How to estimate your LLM API cost

Every token-based API bills you twice: once for the input (everything you send — the prompt, system message and any context or documents) and once for the output (what the model writes back). Output is usually 2–5× more expensive than input, so the single biggest lever on your bill is how much the model generates.

The formula is simple, with both prices quoted per 1M tokens:

monthly cost = requests × ((input_tokens × input_price) + (output_tokens × output_price)) / 1,000,000
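As a rough illustration, the formula translates directly into a few lines of Python. The function name, the workload and the prices below are hypothetical placeholders chosen for easy arithmetic, not figures from any provider's price list.

```python
def monthly_cost(requests, input_tokens, output_tokens, input_price, output_price):
    """Estimate monthly spend.

    input_tokens / output_tokens are per request;
    input_price / output_price are list prices per 1M tokens.
    """
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return requests * per_request


# Hypothetical workload: 100,000 requests a month, 2,000 tokens in, 500 tokens out,
# priced at 1.00 per 1M input tokens and 4.00 per 1M output tokens (made-up prices).
cost = monthly_cost(
    requests=100_000,
    input_tokens=2_000,
    output_tokens=500,
    input_price=1.00,
    output_price=4.00,
)
print(f"Estimated monthly cost: {cost:.2f}")
```

In this made-up case the 500 output tokens cost as much as the 2,000 input tokens, which is why output length tends to dominate the bill.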

Four levers keep the bill down:

Cap output length. Setting a sensible max_tokens stops runaway generations from blowing the budget.
Use prompt caching. If you resend the same system prompt or document every call, caching can cut that input cost by 75–90%. Toggle it above to see the effect.
Right-size the model. A cheap model that succeeds beats an expensive one, but a cheap model that fails and forces retries can end up costing more than the expensive model would have. Measure success rate, not just sticker price.
Batch when you can. Most providers offer a Batch API at roughly half price for non-urgent jobs.
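The tips above can be sketched as small adjustments to the per-request cost. Everything here is an illustrative assumption: the helper names, the prices, the cached share of the input, the discount rates and the success rates are hypothetical, and the retry model simply assumes independent attempts repeated until one succeeds (1 / success_rate attempts on average). Real discounts, and whether they can be combined, vary by provider.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price,
                 cached_fraction=0.0, cache_discount=0.0):
    """Cost of one request; prices are per 1M tokens.

    cached_fraction: share of input tokens billed at the cached rate (0..1).
    cache_discount:  fractional price cut on those cached tokens (0..1).
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = fresh * input_price + cached * input_price * (1.0 - cache_discount)
    return (input_cost + output_tokens * output_price) / 1_000_000


def cost_per_success(cost_per_attempt, success_rate):
    """Expected spend per usable answer when failures are retried until success."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def batch_cost(cost, batch_discount=0.5):
    """Apply a batch discount; 0.5 mirrors the 'roughly half price' rule of thumb."""
    return cost * (1.0 - batch_discount)


# Hypothetical prices per 1M tokens: 1.00 input / 4.00 output.
baseline = request_cost(2_000, 500, 1.00, 4.00)

# Cap output length: max_tokens=200 bounds the worst-case output charge.
capped = request_cost(2_000, 200, 1.00, 4.00)

# Prompt caching: assume 75% of the input repeats and is discounted by 90%.
cached = request_cost(2_000, 500, 1.00, 4.00, cached_fraction=0.75, cache_discount=0.90)

# Right-sizing: a half-price model that succeeds 40% of the time
# versus the baseline model succeeding 95% of the time.
cheap = cost_per_success(request_cost(2_000, 500, 0.50, 2.00), success_rate=0.40)
strong = cost_per_success(baseline, success_rate=0.95)

print(f"baseline per request:      {baseline:.5f}")
print(f"with output capped:        {capped:.5f}")
print(f"with caching:              {cached:.5f}")
print(f"batched baseline:          {batch_cost(baseline):.5f}")
print(f"cheap model per success:   {cheap:.5f}")
print(f"strong model per success:  {strong:.5f}")
```

With these made-up numbers the half-price model ends up costing more per usable answer than the pricier one, which is the reason to measure success rate before choosing on sticker price.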

Want the per-model numbers without the math? See the full LLM API price comparison or the cheapest LLM APIs guide.