The rise of powerful AI models like GPT, Gemini, Claude, Mistral, Llama, and others has opened doors for AI developers, entrepreneurs, and startups. But with so many options available, choosing the right model for your project can feel overwhelming, especially when considering cost.
This post lists the pricing of popular AI models, comparing their context windows and costs per million tokens to help you make informed decisions about your AI projects. Prices are taken from each provider's official website and are updated frequently to keep them accurate.
TL;DR
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| GPT-5.1/GPT-5 | 400K | $1.25 | $10.00 |
| Sora 2 | – | – | $0.10/second |
| Claude Sonnet 4.5 | 200K | $3.00 | $15.00 |
| Gemini 3 Pro | ≤200K | $2.00 | $12.00 |
| Gemini 2.5 Pro | ≤200K | $1.25 | $10.00 |
| Gemini 2.5 Flash | 1M | $0.30 (text/image/video) $1.00 (audio) | $2.50 |
| Grok 4 | 256,000 | $3.00 | $15.00 |
| DeepSeek-V3.2-Exp | 128K | $0.28 | $0.42 |
| Qwen-Max | 262,144 | $1.20 | $6.00 |
| MiniMax M2 | 204,800 | $0.30 | $1.20 |
| Kimi K2 Thinking | 262,144 | $0.60 | $2.50 |
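
Because every price above is quoted per 1M tokens, the cost of a single request is a simple proportion. The snippet below is a minimal sketch of that arithmetic; the function name `request_cost` and the example token counts are illustrative assumptions, not part of any provider's SDK.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the estimated USD cost of one request, given per-1M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# GPT-5 rates from the table above: $1.25 input and $10.00 output per 1M tokens.
# The token counts (10,000 in, 2,000 out) are made-up example values.
cost = request_cost(10_000, 2_000, 1.25, 10.00)
print(f"${cost:.4f}")  # prints $0.0325
```

Output tokens usually dominate the bill for generation-heavy workloads, so it is worth estimating both sides separately.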
Table Of Contents
- OpenAI GPT Models API Pricing
- OpenAI Reasoning Models API Pricing
- OpenAI Video Generation API Pricing
- OpenAI Image Generation API Pricing
- Claude 4 API Pricing
- Gemini API Pricing
- Gemini 2.5 Flash Native Audio API Pricing
- Gemini 2.5 Flash Image Preview Pricing
- Gemini 2.5 TTS Pricing
- Google Imagen 4 & 3 API Pricing
- Gemini Embedding API Pricing
- Google Veo 3.1 API Pricing
- Grok API Pricing
- DeepSeek API Pricing
- Qwen API Pricing
- Mistral (Premier Models) API Pricing
- Mistral (Open Models) API Pricing
- Llama 4 & 3 API Pricing
- GLM API Pricing
- Kimi K2 API Pricing
- Minimax M2 API Pricing
- Minimax Hailuo API Pricing
- PPLX API Pricing
- Cohere API Pricing
OpenAI GPT Models API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| gpt-5.1 | 400K | $1.25 | $10.00 |
| gpt-5 | 400K | $1.25 | $10.00 |
| gpt-5-mini | 400K | $0.25 | $2.00 |
| gpt-5-nano | 400K | $0.05 | $0.40 |
| gpt-5-pro | 400K | $15.00 | $120.00 |
| gpt-5-codex | – | $1.25 | $10.00 |
| gpt-5-codex-mini | – | $1.50 | $6.00 |
| gpt-5-search-api | – | $1.25 | $10.00 |
| gpt-4.1 | 1M | $2.00 | $8.00 |
| gpt-4.1-mini | 1M | $0.40 | $1.60 |
| gpt-4.1-nano | 1M | $0.10 | $0.40 |
| gpt-4o | 128K | $2.50 | $10.00 |
| gpt-4o-mini | 128K | $0.15 | $0.60 |
| gpt-realtime | 128K | $4.00 | $16.00 |
| gpt-realtime-mini | 128K | $0.60 | $2.40 |
| gpt-4o-realtime-preview | 128K | $5.00 | $20.00 |
| gpt-4o-mini-realtime-preview | 128K | $0.60 | $2.40 |
| gpt-4o-audio-preview | 128K | $2.50 | $10.00 |
| gpt-4o-mini-audio-preview | 128K | $0.15 | $0.60 |
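
To see how the tiers within one model family compare, it can help to project a monthly bill for a fixed workload. The following is a rough sketch using the gpt-5, gpt-5-mini, and gpt-5-nano rates from the table above; the `monthly_costs` helper and the workload numbers are hypothetical.

```python
# (input, output) USD prices per 1M tokens, copied from the table above.
PRICES = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}


def monthly_costs(requests_per_month: int, input_tokens: int,
                  output_tokens: int) -> dict[str, float]:
    """Return estimated monthly USD cost per model, cheapest first."""
    costs = {}
    for model, (input_price, output_price) in PRICES.items():
        per_request = (input_tokens * input_price
                       + output_tokens * output_price) / 1_000_000
        costs[model] = per_request * requests_per_month
    return dict(sorted(costs.items(), key=lambda item: item[1]))


# Assumed workload: 100,000 requests/month, 1,000 input and 500 output tokens each.
for model, cost in monthly_costs(100_000, 1_000, 500).items():
    print(f"{model}: ${cost:,.2f}")
```

For this assumed workload the estimate comes to roughly $25 for gpt-5-nano, $125 for gpt-5-mini, and $625 for gpt-5 per month, before any caching or batch discounts a provider may offer.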
OpenAI Reasoning Models API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| o4-mini | 200K | $1.10 | $4.40 |
| o4-mini-deep-research | 200K | $2.00 | $8.00 |
| o3-pro | 200K | $20.00 | $80.00 |
| o3 | 200K | $2.00 | $8.00 |
| o3-deep-research | 200K | $10.00 | $40.00 |
| o3-mini | 200K | $1.10 | $4.40 |
| o1-pro | 200K | $150.00 | $600.00 |
| o1 | 200K | $15.00 | $60.00 |
| o1-mini | 128K | $1.10 | $4.40 |
OpenAI Video Generation API Pricing
| Model | Price per second |
|---|---|
| Sora 2 | $0.10 |
| Sora 2 Pro | $0.30 |
| Sora 2 Pro (Portrait: 1024 x 1792. Landscape: 1792 x 1024) | $0.50 |
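
Video models are billed per second of generated footage rather than per token. Here is a small sketch of that calculation using the three rates above; the dictionary keys (in particular `sora-2-pro-high-res`) are hypothetical labels chosen for this example, not official model identifiers.

```python
# USD per second of generated video, from the table above.
# The keys are illustrative labels only.
SORA_PRICE_PER_SECOND = {
    "sora-2": 0.10,
    "sora-2-pro": 0.30,
    "sora-2-pro-high-res": 0.50,  # the 1024 x 1792 / 1792 x 1024 row
}


def video_cost(model: str, seconds: float) -> float:
    """Return the estimated USD cost of a clip of the given length."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return SORA_PRICE_PER_SECOND[model] * seconds


# An assumed 8-second clip on each tier.
for name in SORA_PRICE_PER_SECOND:
    print(f"{name}: ${video_cost(name, 8):.2f}")
```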
OpenAI Image Generation API Pricing
| Model | Input/1M Tokens | Output/1M Tokens |
|---|---|---|
| GPT-image-1 | $10.00 | $40.00 |
| GPT-image-1-mini | $2.50 | $8.00 |
Claude 4 API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Claude Opus 4.1 | 200K | $15.00 | $75.00 |
| Claude Sonnet 4.5 | ≤ 200K | $3.00 | $15.00 |
| Claude Sonnet 4.5 | > 200K | $6.00 | $22.50 |
| Claude Haiku 4.5 | 200K | $1.00 | $5.00 |
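
Claude Sonnet 4.5 has two price tiers depending on prompt length, so a cost estimate needs to pick the right row first. The sketch below assumes that the tier is chosen by the number of input tokens and that the whole request (input and output) is then billed at that tier's rates; check the provider's documentation before relying on that assumption. The function name `sonnet_cost` is hypothetical.

```python
# Claude Sonnet 4.5 rates from the table above (USD per 1M tokens).
TIER_THRESHOLD = 200_000
STANDARD = {"input": 3.00, "output": 15.00}      # prompts of 200K tokens or fewer
LONG_CONTEXT = {"input": 6.00, "output": 22.50}  # prompts over 200K tokens


def sonnet_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost of one request.

    ASSUMPTION: the entire request is billed at a single tier,
    selected by the size of the input.
    """
    rates = LONG_CONTEXT if input_tokens > TIER_THRESHOLD else STANDARD
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000


print(f"${sonnet_cost(100_000, 1_000):.4f}")  # standard tier
print(f"${sonnet_cost(300_000, 1_000):.4f}")  # long-context tier
```

Under this assumption, crossing the 200K threshold doubles the input rate, so trimming a prompt to stay under it can matter more than the raw token savings suggest.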
Gemini API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Gemini 3 Pro | >200K | $4.00 | $18.00 |
| Gemini 3 Pro | ≤200K | $2.00 | $12.00 |
| Gemini 2.5 Pro | >200K | $2.50 | $15.00 |
| Gemini 2.5 Pro | ≤200K | $1.25 | $10.00 |
| Gemini 2.5 Flash | 1M | $0.30 (text/image/video) $1.00 (audio) | $2.50 |
| Gemini 2.5 Flash-Lite | 1M | $0.10 (text/image/video) $0.50 (audio) | $0.40 |
| Gemini 2.0 Flash | 1M | $0.10 (text/image/video) $0.70 (audio) | $0.40 |
| Gemini 2.0 Flash-Lite | 1M | $0.075 | $0.30 |
| Gemini 1.5 Pro | >128K | $2.50 | $10.00 |
| Gemini 1.5 Pro | ≤128K | $1.25 | $5.00 |
| Gemini 1.5 Flash | >128K | $0.15 | $0.60 |
| Gemini 1.5 Flash | ≤128K | $0.075 | $0.30 |
| Gemini 1.5 Flash-8B | >128K | $0.075 | $0.30 |
| Gemini 1.5 Flash-8B | ≤128K | $0.0375 | $0.15 |
| Gemini 1.0 Pro | 32K | $0.50 | $1.50 |
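
Several Gemini models charge a different input rate for audio than for text, image, or video, so a mixed-modality request has to be priced per modality. The following is a minimal sketch using the Gemini 2.5 Flash-Lite row above; the `ModalityRates` class, the `mixed_input_cost` function, and the token counts are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModalityRates:
    """USD prices per 1M tokens, split by input modality."""
    text_image_video: float
    audio: float
    output: float


# Gemini 2.5 Flash-Lite rates from the table above.
FLASH_LITE = ModalityRates(text_image_video=0.10, audio=0.50, output=0.40)


def mixed_input_cost(rates: ModalityRates, text_tokens: int,
                     audio_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost of a request that mixes text and audio input."""
    total = (text_tokens * rates.text_image_video
             + audio_tokens * rates.audio
             + output_tokens * rates.output)
    return total / 1_000_000


# Assumed request: 50,000 text tokens, 20,000 audio tokens, 5,000 output tokens.
print(f"${mixed_input_cost(FLASH_LITE, 50_000, 20_000, 5_000):.4f}")
```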
Gemini 2.5 Flash Native Audio API Pricing
| Model | Free Tier | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Gemini 2.5 Flash Native Audio | Not available | $0.50 (text) $3.00 (audio / video) | $2.00 (text) $12.00 (audio) |
Gemini 2.5 Flash Image Preview Pricing
| Model | Free Tier | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Gemini 2.5 Flash Image Preview | Not available | $0.30 (text / image) | $0.039 per image |
Gemini 2.5 TTS Pricing
| Model | Free Tier | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Gemini 2.5 Flash Preview TTS | Free of charge | $0.50 (text) | $10.00 (audio) |
| Gemini 2.5 Pro Preview TTS | Not available | $1.00 (text) | $20.00 (audio) |
Google Imagen 4 & 3 API Pricing
| Model | Paid Tier, per Image in USD |
|---|---|
| Imagen 4 Fast | $0.02 |
| Imagen 4 Standard | $0.04 |
| Imagen 4 Ultra | $0.06 |
| Imagen 3 | $0.03 |
Gemini Embedding API Pricing
| Model | Paid Tier, per 1M tokens in USD |
|---|---|
| gemini-embedding-001 | $0.15 |
Google Veo 3.1 API Pricing
| Model | Paid Tier, per second in USD |
|---|---|
| Veo 3.1 Standard | $0.40 |
| Veo 3.1 Fast | $0.15 |
Grok API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Grok 4 Fast | 2M | $0.20 | $0.50 |
| Grok 4 | 256,000 | $3.00 | $15.00 |
| grok-code-fast-1 | 256,000 | $0.20 | $1.50 |
| Grok 3 | 131,072 | $3.00 | $15.00 |
| Grok 3 Mini | 131,072 | $0.30 | $0.50 |
| Grok 2 Vision | 32,768 | $2.00 | $10.00 |
| Grok 2 Image | – | $0.07/image | $0.07/image |
DeepSeek API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| DeepSeek-V3.2-Exp (Non-thinking) | 128K | $0.28 | $0.42 |
| DeepSeek-V3.2-Exp (Thinking Mode) | 128K | $0.28 | $0.42 |
Qwen API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Qwen-Max | 262,144 | $1.60 | $6.40 |
| Qwen 3 (Plus) | 1,000,000 | $0.40 | $1.20 |
| Qwen 3 (Turbo) | 1,000,000; 131,072 (Thinking Mode) | $0.05 | $0.20 |
| Qwen-Flash | 1,000,000 | $0.05 | $0.40 |
| Qwen-Coder | 1,000,000 | $0.30 | $1.50 |
Mistral (Premier Models) API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Mistral Large | 128K | $2.00 | $6.00 |
| Pixtral Large | 128K | $2.00 | $6.00 |
| Mistral Saba | 128K | $0.20 | $0.60 |
| Mistral Medium 3 | 128K | $0.40 | $2.00 |
| Magistral Medium | 128K | $2.00 | $5.00 |
| Devstral Medium | 128K | $0.40 | $2.00 |
| Codestral | 32K | $0.30 | $0.90 |
| Document AI & OCR | – | – | OCR: $1/1000 pages Annotations: $3/1000 pages |
| Voxtral Mini Transcribe | – | – | Audio Input/min $0.002 |
| Mistral Embed | 32K | $0.10 | – |
| Mistral Moderation 24.11 | 32K | $0.10 | – |
| Magistral Small | 128K | $0.50 | $1.50 |
| Codestral Embed | 128K | $0.15 | – |
Mistral (Open Models) API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Pixtral Large | 128K | $2.00 | $6.00 |
| Pixtral 12B | 128K | $0.15 | $0.15 |
| Mistral Nemo | 128K | $0.15 | $0.15 |
| Mistral Small 3.2 | 128K | $0.10 | $0.30 |
| Magistral Small | 128K | $0.50 | $1.50 |
| Devstral Small | 128K | $0.10 | $0.30 |
| Voxtral Mini | – | $0.001 (audio) $0.04 (text) | $0.04 |
| Voxtral Small | – | $0.001 (audio) $0.04 (text) | $0.03 |
| Ministral 8B 24.10 | 32K | $0.10 | $0.10 |
| Mixtral 8x7B | 32K | $0.70 | $0.70 |
| Mixtral 8x22B | 64K | $2.00 | $6.00 |
Llama 4 & 3 API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Llama 4 Scout | 10M | $0.11 | $0.34 |
| Llama 4 Maverick | 1M | $0.20 | $0.60 |
| Llama 3.3 70B Versatile | 128K | $0.59 | $0.79 |
| Llama 3.3 70B SpecDec | 8192 | $0.59 | $0.99 |
| Llama 3.3 70B Instruct | 128K | $0.23 | $0.40 |
| Llama 3.3 70B Instruct-Turbo | 128K | $0.13 | $0.40 |
| Llama 3.2 90B Vision-Instruct | 128K | $0.35 | $0.40 |
| Llama 3.2 11B Vision-Instruct | 128K | $0.055 | $0.055 |
| Llama 3.1 405B | 128K | $1.79 | $1.79 |
| Llama 3.1 70B | 128K | $0.35 | $0.40 |
| Llama 3.1 8B | 128K | $0.09 | $0.09 |
GLM API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| GLM-4.6 | 128K | $0.60 | $2.20 |
| GLM-4.5 | 128K | $0.60 | $2.20 |
| GLM-4.5v | 64K | $0.60 | $1.80 |
| GLM-4.5-X | 128K | $0.45 | $8.90 |
| GLM-4.5-Air | 128K | $0.20 | $1.10 |
| GLM-4.5-AirX | 128K | $1.10 | $4.50 |
| GLM-4.5-Flash | 128K | FREE | FREE |
Kimi K2 API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| kimi-k2-thinking | 262,144 | $0.60 | $2.50 |
| kimi-k2-thinking-turbo | 262,144 | $1.15 | $8.00 |
| kimi-k2-0905-preview | 262,144 | $0.60 | $2.50 |
| kimi-k2-0711-preview | 131,072 | $0.60 | $2.50 |
| kimi-k2-turbo-preview | 262,144 | $1.15 | $8.00 |
MiniMax M2 API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| MiniMax M2 | 204,800 | $0.30 | $1.20 |
MiniMax Hailuo API Pricing
| Model | Unit Price |
|---|---|
| MiniMax-Hailuo-2.3-Fast | $0.19 per 768P, 6s video |
| MiniMax-Hailuo-2.3-Fast | $0.32 per 768P, 10s video |
| MiniMax-Hailuo-2.3-Fast | $0.33 per 1080P, 6s video |
| MiniMax-Hailuo-2.3 | $0.28 per 768P, 6s video |
| MiniMax-Hailuo-2.3 | $0.56 per 768P, 10s video |
| MiniMax-Hailuo-2.3 | $0.49 per 1080P, 6s video |
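
Hailuo is priced per clip rather than per second, which makes it awkward to compare with per-second video pricing. One way to line them up is to divide each clip price by its length. The sketch below does that for the rows above; the `effective_rates` helper is a hypothetical name used only for this example.

```python
# (model, resolution, clip length in seconds, USD price per clip), from the table above.
HAILUO_CLIPS = [
    ("MiniMax-Hailuo-2.3-Fast", "768P", 6, 0.19),
    ("MiniMax-Hailuo-2.3-Fast", "768P", 10, 0.32),
    ("MiniMax-Hailuo-2.3-Fast", "1080P", 6, 0.33),
    ("MiniMax-Hailuo-2.3", "768P", 6, 0.28),
    ("MiniMax-Hailuo-2.3", "768P", 10, 0.56),
    ("MiniMax-Hailuo-2.3", "1080P", 6, 0.49),
]


def effective_rates(clips=HAILUO_CLIPS) -> dict[tuple[str, str, int], float]:
    """Map (model, resolution, seconds) to the effective USD price per second."""
    return {(model, resolution, seconds): price / seconds
            for model, resolution, seconds, price in clips}


for (model, resolution, seconds), rate in effective_rates().items():
    print(f"{model} {resolution} {seconds}s: ${rate:.3f}/second")
```

This only normalizes the listed clip prices; it does not imply that clips of other lengths are available or billed pro rata.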
PPLX API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| pplx-70b-online | 4K | $1.00 | $1.00 |
| pplx-7b-online | 4K | $0.20 | $0.20 |
Cohere API Pricing
| Model | Context Window | Input/1M Tokens | Output/1M Tokens |
|---|---|---|---|
| Command A | 256K | $2.50 | $10.00 |
| Command R+ | 128K | $2.50 | $10.00 |
| Command R | 128K IN/4K OUT | $0.15 | $0.60 |
| Command R7B | 128K | $0.0375 | $0.15 |
See Also:
- What Is The Max Token Limit In OpenAI ChatGPT
- What Are The Rate Limits For OpenAI API?
- Compare AI Costs: Free LLM API Price Calculator
Changelog:
11/18/2025
- Added Gemini 3 Pro
11/14/2025
- Added GPT-5.1
- Updated Qwen models
- Updated Kimi models
- Added MiniMax M2
11/08/2025
- Added codex-mini
10/15/2025
- Updated for Claude Haiku 4.5 & Veo 3.1.
10/07/2025
- Added Sora 2 and more OpenAI APIs.
09/29/2025
- Updated for Claude 4.5 and DeepSeek-V3.2
09/19/2025
- Updated prices
08/21/2025
- Updated for DeepSeek-V3.1
08/20/2025
- Added GLM-4.5 and Kimi K2 models.
08/14/2025
- Added Imagen 4 Fast
08/07/2025
- Added GPT-5
08/05/2025
- Added Claude Opus 4.1
07/15/2025
- Added Voxtral model family
07/14/2025
- Added Gemini Embedding
07/11/2025
- Added Grok 4
06/26/2025
- Added o3-deep-research and o4-mini-deep-research
06/25/2025
- Added Imagen 4 Ultra and Imagen 4 Standard
06/20/2025
- Added Gemini 2.5 Flash-Lite
06/10/2025
- Added o3-pro
- Added Mistral Magistral models.
- OpenAI dropped the price of o3 by 80%
05/22/2025
- Updated for Claude 4
05/21/2025
- Updated Mistral models
- Added Gemini 2.5 Flash Native Audio
05/08/2025
- Added Mistral Medium 3
04/23/2025
- Added OpenAI Image Generation
04/18/2025
- Added Gemini 2.5 Flash
- Updated Cohere models
04/16/2025
- Added o4-mini
04/14/2025
- Added gpt-4.1 family
04/11/2025
- Updated DeepSeek
- Added Google Imagen 3 and Veo 2.
04/10/2025
- Added Qwen models
- Added Grok 3
04/05/2025
- Added Gemini 2.5 Pro
03/19/2025
- Added o1-pro
02/28/2025
- Added GPT-4.5
02/24/2025
- Added Claude 3.7
02/21/2025
- Added Grok
02/06/2025
- Added Gemini 2.0 Flash and Gemini 2.0 Flash-Lite
02/02/2025
- Added o3-mini
01/31/2025
- Added DeepSeek v3 and DeepSeek R1.
12/18/2024
- o1 in the API comes with support for function calling, developer messages, Structured Outputs, and vision capabilities.
12/07/2024
- Added Llama 3.3
11/05/2024
- Added Claude Haiku 3.5
11/03/2024
- Added Gemini 1.5 Flash-8B
10/04/2024
- Added Llama 3.2
- Added gpt-4o-realtime-preview
09/25/2024
- Updated Google Gemini
09/13/2024
- Added OpenAI’s latest model: o1.
08/07/2024
- Added gpt-4o-2024-08-06, the latest gpt-4o snapshot that supports Structured Outputs
07/25/2024
- Updated Mistral Large 2
07/24/2024
- Added Llama 3.1 405B
07/20/2024
- Added GPT-4o-mini
- Updated prices
07/13/2024
- Updated
- Added Cohere’s Command API









