v0.1.25 · 144 models · 22 providers

Know what you're paying for
LLM API calls

A zero-dependency Python CLI and library for comparing LLM API pricing across all major providers. No API key required.

★ Star on GitHub Install
📝
New post: How to Compare LLM API Costs Across 15 Providers with One Command
Mistral Large 3 dropped 75%. The best-value flagship is $0.50/Mtok. Here's the tool. →
bash
$ llm-prices list --provider OpenAI --sort input
Model Provider Input/Mtok Output/Mtok Context
-----------------------------------------------------------------------
gpt-4.1-nano OpenAI $ 0.1000 $ 0.4000 1047k Fastest, cheapest GPT-4.1
gpt-4o-mini OpenAI $ 0.1500 $ 0.6000 128k Small, fast, cheap
gpt-4.1-mini OpenAI $ 0.4000 $ 1.6000 1047k 1M context, cost-efficient
gpt-3.5-turbo OpenAI $ 0.5000 $ 1.5000 16k
gpt-4.1 OpenAI $ 2.0000 $ 8.0000 1047k 1M context flagship
gpt-4o OpenAI $ 2.5000 $ 10.0000 128k Latest multimodal flagship
...
$ llm-prices compare gpt-4o claude-sonnet-4-6 gemini-2.5-flash --in 5000 --out 1000
Comparison: 5,000 input tokens, 1,000 output tokens
Model Provider Input Output Total
-----------------------------------------------------------------------
gemini-2.5-flash Google $0.0008 $0.0006 $0.0014
gpt-4o OpenAI $0.0125 $0.0100 $0.0225 (16.5x)
claude-sonnet-4-6 Anthropic $0.0150 $0.0150 $0.0300 (22.0x)
$ llm-prices budget 1.00 --in 1000 --out 500
Budget: $1.0000 | Tokens per call: 1,000 in / 500 out
Model Provider Cost/call Calls
-----------------------------------------------------------
llama-3.1-8b Groq $0.000090 11,111
gemini-1.5-flash-8b Google $0.000112 8,888
gpt-4o OpenAI $0.007500 133
claude-opus-4-7 Anthropic $0.052500 19

Everything you need for LLM cost planning

🔍

list

Browse all 144 models with pricing, filterable by provider or name. Sort by input cost, output cost, or context window.

🧮

calc

Calculate exact cost for a specific number of input and output tokens on any model. JSON output for scripting.
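The arithmetic behind calc is simple per-Mtok math. A minimal sketch (not the tool's actual code), using the gpt-4o prices from the table above:

```python
GPT_4O_INPUT_PER_MTOK = 2.50    # $ per million input tokens (from the table above)
GPT_4O_OUTPUT_PER_MTOK = 10.00  # $ per million output tokens

def cost(in_tokens: int, out_tokens: int) -> float:
    """Dollar cost of one gpt-4o call at the per-Mtok prices above."""
    return (in_tokens * GPT_4O_INPUT_PER_MTOK
            + out_tokens * GPT_4O_OUTPUT_PER_MTOK) / 1_000_000

print(f"${cost(10_000, 2_000):.4f}")  # → $0.0450
```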

⚖️

compare

Side-by-side cost comparison of multiple models for the same token usage. Shows relative cost multiples.
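One way the relative multiples could be derived, as a hedged sketch: total each model's cost, then divide by the cheapest total. Prices come from the tables above; the real tool's rounding may differ slightly.

```python
PRICES = {  # model: (input $/Mtok, output $/Mtok) — taken from the sample output above
    "gemini-2.5-flash": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4-6": (3.00, 15.00),
}

def total(model: str, in_tok: int, out_tok: int) -> float:
    pi, po = PRICES[model]
    return (in_tok * pi + out_tok * po) / 1_000_000

totals = {m: total(m, 5_000, 1_000) for m in PRICES}
cheapest = min(totals.values())
for model, t in sorted(totals.items(), key=lambda kv: kv[1]):
    mult = "" if t == cheapest else f" ({t / cheapest:.1f}x)"
    print(f"{model:<20} ${t:.4f}{mult}")
```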

💵

budget

Given a dollar budget and typical token counts, see how many API calls each model allows. Great for cost planning.
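The budget math is just floor division, as this sketch shows (assumed logic, mirroring the sample output above — not the tool's source):

```python
def calls_within_budget(budget: float, cost_per_call: float) -> int:
    """How many whole API calls fit in a dollar budget."""
    return int(budget // cost_per_call)

# gpt-4o at 1,000 in / 500 out:
# 1000 * $2.50/Mtok + 500 * $10.00/Mtok = $0.0075 per call
print(calls_within_budget(1.00, 0.0075))  # → 133
```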

๐Ÿ†

top

Show the N cheapest models for your exact token workload. llm-prices top 5 --in 5000 --out 1000. Supports --markdown output.

📋

Markdown & CSV export

--markdown outputs a GitHub-flavored table you can paste directly into your README or docs. --csv for spreadsheets.
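A GitHub-flavored table is just pipe-delimited rows under a `---` separator line. A minimal sketch of emitting one (illustrative only, with made-up rows — not the tool's formatter):

```python
header = ("Model", "Provider", "Total")
rows = [
    ("gemini-2.5-flash", "Google", "$0.0014"),
    ("gpt-4o", "OpenAI", "$0.0225"),
]

lines = ["| " + " | ".join(header) + " |",
         "|" + "|".join(" --- " for _ in header) + "|"]
lines += ["| " + " | ".join(r) + " |" for r in rows]
print("\n".join(lines))
```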

📦

Zero dependencies

Pure Python stdlib. No numpy, no requests, no bloat. pip install and it just works.

🔌

Library API

Import calculate_cost and MODELS directly in your Python code for programmatic cost checks.
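A self-contained stand-in for the kind of check this enables. The names calculate_cost and MODELS come from the blurb above, but their real signatures aren't documented here, so this is an assumed shape, not the actual llm_prices API:

```python
# Stand-in sketch — NOT the real llm_prices module; the actual
# calculate_cost / MODELS signatures may differ.
MODELS = {"gpt-4o": {"input": 2.50, "output": 10.00}}  # $/Mtok, from the table above

def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p = MODELS[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

print(f"${calculate_cost('gpt-4o', 10_000, 2_000):.4f}")  # → $0.0450
```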

Install

# pipx (recommended; works now)
pipx install git+https://github.com/benbencodes/llm-prices

# Homebrew (macOS/Linux)
brew tap benbencodes/tap
brew install llm-prices

# From source (pip install llm-prices coming soon to PyPI)
git clone https://github.com/benbencodes/llm-prices
pip install -e llm-prices

# Then:
llm-prices list
llm-prices calc gpt-4o --in 10000 --out 2000
llm-prices compare gpt-4.1 claude-sonnet-4-6 --in 5000 --out 1000
llm-prices budget 5.00 --in 2000 --out 500

Python 3.8+. No other dependencies.

144 models across 22 providers

Pricing data is baked into the package and updated with each release.

OpenAI (GPT-4o, GPT-4.1, o1, o3, o4-mini...)
Anthropic (Claude 3, Claude 4...)
Google (Gemini 1.5, 2.0, 2.5...)
Together AI (Qwen3, Kimi K2, DeepSeek...)
Fireworks AI (DeepSeek V4 Pro, V3, Kimi...)
Cerebras (ultra-fast silicon)
SambaNova (Llama 4, RDU inference)
Amazon Bedrock (Nova Micro/Lite/Pro/Premier)
Groq / Llama 4
Perplexity AI (Sonar)
Mistral AI
Cohere
DeepSeek
xAI / Grok
AI21 Labs (Jamba, 256k context)