AI LLM API Pricing 2025: GPT-5.1, Gemini 3, Claude 4.5, and More

The latest API pricing for popular AI models like GPT-5.1, Sora 2, o4-mini, o3, Claude 4.5, Gemini 3 Pro, Grok 4.1, and more.

The rise of powerful AI models like GPT, Gemini, Claude, Mistral, Llama, and others has opened doors for AI developers, entrepreneurs, and startups. But with so many options available, choosing the right model for your project can feel overwhelming, especially when considering cost.

This post lists the pricing of popular AI models, comparing their context windows and costs per million tokens to help you make informed decisions about your AI projects. Prices come from each provider's official website, and we update the post frequently to keep them accurate.

TL;DR

Model | Context Window | Input/1M Tokens | Output/1M Tokens
GPT-5.1/GPT-5 | 400K | $1.25 | $10.00
Sora 2 | - | $0.10/second | -
Claude Sonnet 4.5 | 200K | $3.00 | $15.00
Gemini 3 Pro | 200K | $2.00 | $12.00
Gemini 2.5 Pro | 200K | $1.25 | $10.00
Gemini 2.5 Flash | 1M | $0.15 (text/image/video); $1.00 (audio) | Non-thinking: $0.60; Thinking: $3.50
Grok 4 | 256,000 | $3.00 | $15.00
DeepSeek-V3.2-Exp | 128K | $0.28 | $0.42
Qwen-Max | 262,144 | $1.20 | $6.00
MiniMax M2 | 204,800 | $0.30 | $1.20
Kimi K2 Thinking | 262,144 | $0.60 | $2.50
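
To compare models for a concrete workload, multiply your expected token counts by the per-million rates. The following is a minimal sketch, not an official calculator: the prices are copied from a subset of the TL;DR rows above, the function names `request_cost` and `rank_by_cost` are made up for illustration, and it assumes flat per-token billing with no caching discounts, batch discounts, or long-context surcharges.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the TL;DR table.
# Only a subset of rows is included; extend the dict as needed.
PRICES = {
    "GPT-5.1": (Decimal("1.25"), Decimal("10.00")),
    "Claude Sonnet 4.5": (Decimal("3.00"), Decimal("15.00")),
    "Gemini 3 Pro": (Decimal("2.00"), Decimal("12.00")),
    "DeepSeek-V3.2-Exp": (Decimal("0.28"), Decimal("0.42")),
    "MiniMax M2": (Decimal("0.30"), Decimal("1.20")),
    "Kimi K2 Thinking": (Decimal("0.60"), Decimal("2.50")),
}

MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at flat per-million-token rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


def rank_by_cost(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: request_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 1M output tokens.
    for model, cost in rank_by_cost(1_000_000, 1_000_000):
        print(f"{model}: ${cost:.2f}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift. Output tokens usually dominate the bill, so rank with the input/output ratio of your real traffic rather than a 1:1 split.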

OpenAI GPT Models API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
gpt-5.1 | 400K | $1.25 | $10.00
gpt-5 | 400K | $1.25 | $10.00
gpt-5-mini | 400K | $0.25 | $2.00
gpt-5-nano | 400K | $0.05 | $0.40
gpt-5-pro | 400K | $15.00 | $120.00
gpt-5-codex | - | $1.25 | $10.00
gpt-5-codex-mini | - | $1.50 | $6.00
gpt-5-search-api | - | $1.25 | $10.00
gpt-4.1 | 1M | $2.00 | $8.00
gpt-4.1-mini | 1M | $0.40 | $1.60
gpt-4.1-nano | 1M | $0.10 | $0.40
gpt-4o | 128K | $2.50 | $10.00
gpt-4o-mini | 128K | $0.15 | $0.60
gpt-realtime | 128K | $4.00 | $16.00
gpt-realtime-mini | 128K | $0.60 | $2.40
gpt-4o-realtime-preview | 128K | $5.00 | $20.00
gpt-4o-mini-realtime-preview | 128K | $0.60 | $2.40
gpt-4o-audio-preview | 128K | $2.50 | $10.00
gpt-4o-mini-audio-preview | 128K | $0.15 | $0.60
From OpenAI

OpenAI Reasoning Models API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
o4-mini | 200K | $1.10 | $4.40
o4-mini-deep-research | 200K | $2.00 | $8.00
o3-pro | 200K | $20.00 | $80.00
o3 | 200K | $2.00 | $8.00
o3-deep-research | 200K | $10.00 | $40.00
o3-mini | 200K | $1.10 | $4.40
o1-pro | 200K | $150.00 | $600.00
o1 | 200K | $15.00 | $60.00
o1-mini | 128K | $1.10 | $4.40
From OpenAI

OpenAI Video Generation API Pricing

Model | Price per second
Sora 2 | $0.10
Sora 2 Pro | $0.30
Sora 2 Pro (Portrait: 1024 x 1792. Landscape: 1792 x 1024) | $0.50
From OpenAI
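
Video models are billed by clip length rather than by token, so the arithmetic is simpler. Here is a minimal sketch using the Sora rates listed above. The tier keys (`sora-2`, `sora-2-pro`, `sora-2-pro-hires`) are hypothetical labels for this example, not confirmed API model IDs, and the sketch assumes cost scales linearly with seconds with no rounding or minimum charge.

```python
from decimal import Decimal

# USD per second of generated video, from the Sora table.
# Keys are illustrative labels, not official model identifiers.
VIDEO_PRICE_PER_SECOND = {
    "sora-2": Decimal("0.10"),
    "sora-2-pro": Decimal("0.30"),
    "sora-2-pro-hires": Decimal("0.50"),  # 1024 x 1792 or 1792 x 1024
}


def video_cost(tier, seconds, clips=1):
    """Return the USD cost of generating `clips` videos of `seconds` each."""
    if seconds <= 0 or clips <= 0:
        raise ValueError("seconds and clips must be positive")
    return VIDEO_PRICE_PER_SECOND[tier] * seconds * clips


if __name__ == "__main__":
    print(video_cost("sora-2", 8))                      # one 8-second clip
    print(video_cost("sora-2-pro-hires", 12, clips=3))  # three 12-second clips
```

At these rates, a single 8-second Sora 2 clip costs $0.80, and three 12-second Sora 2 Pro clips at the higher resolution cost $18.00.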

OpenAI Image Generation API Pricing

Model | Input/1M Tokens | Output/1M Tokens
GPT-image-1 | $10.00 | $40.00
GPT-image-1-mini | $2.50 | $8.00
From OpenAI

Claude 4 API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Claude Opus 4.1 | 200K | $15.00 | $75.00
Claude Sonnet 4.5 | ≤ 200K | $3.00 | $15.00
Claude Sonnet 4.5 | > 200K | $6.00 | $22.50
Claude Haiku 4.5 | 200K | $1.00 | $5.00
From Anthropic
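
Claude Sonnet 4.5 has two price tiers depending on prompt size, which changes how you estimate a request. The sketch below is one way to model it, using the rates from the table above. It assumes the tier is chosen by the number of input tokens and that the higher rate then applies to every token in the request; check the provider's documentation before relying on that. The function name `sonnet_45_cost` is made up for illustration.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens

# USD per 1M tokens as (input, output), from the Claude table.
STANDARD_RATES = (Decimal("3.00"), Decimal("15.00"))    # prompt <= 200K
LONG_CONTEXT_RATES = (Decimal("6.00"), Decimal("22.50"))  # prompt > 200K


def sonnet_45_cost(input_tokens, output_tokens):
    """Estimate the USD cost of one Claude Sonnet 4.5 request.

    Assumption: the whole request is billed at the long-context rates
    once the prompt exceeds the threshold.
    """
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        input_price, output_price = LONG_CONTEXT_RATES
    else:
        input_price, output_price = STANDARD_RATES
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


if __name__ == "__main__":
    print(sonnet_45_cost(100_000, 2_000))  # standard tier
    print(sonnet_45_cost(300_000, 2_000))  # long-context tier
```

Under this assumption, a 100K-token prompt with 2K output tokens costs $0.33, while a 300K-token prompt with the same output costs $1.845. The same pattern applies to the Gemini Pro models in the next section, which also list a separate rate above 200K.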

Gemini API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Gemini 3 Pro | >200K | $4.00 | $18.00
Gemini 3 Pro | 200K | $2.00 | $12.00
Gemini 2.5 Pro | >200K | $2.50 | $15.00
Gemini 2.5 Pro | 200K | $1.25 | $10.00
Gemini 2.5 Flash | 1M | $0.30 (text/image/video); $1.00 (audio) | $2.50
Gemini 2.5 Flash-Lite | 1M | $0.10 (text/image/video); $0.50 (audio) | $0.40
Gemini 2.0 Flash | 1M | $0.10; $0.70 (audio) | $0.40
Gemini 2.0 Flash-Lite | 1M | $0.075 | $0.30
Gemini 1.5 Pro | >128K | $2.50 | $10.00
Gemini 1.5 Pro | 128K | $1.25 | $5.00
Gemini 1.5 Flash | >128K | $0.15 | $0.60
Gemini 1.5 Flash | 128K | $0.075 | $0.30
Gemini 1.5 Flash-8B | >128K | $0.075 | $0.30
Gemini 1.5 Flash-8B | 128K | $0.0375 | $0.15
Gemini 1.0 Pro | 32K | $0.50 | $1.50
From Google

Gemini 2.5 Flash Native Audio API Pricing

Model | Free Tier | Input/1M Tokens | Output/1M Tokens
Gemini 2.5 Flash Native Audio | Not available | $0.50 (text); $3.00 (audio / video) | $2.00 (text); $12.00 (audio)
From Google

Gemini 2.5 Flash Image Preview Pricing

Model | Free Tier | Input/1M Tokens | Output/1M Tokens
Gemini 2.5 Flash Image Preview | Not available | $0.30 (text / image) | $0.039 per image
From Google

Gemini 2.5 TTS Pricing

Model | Free Tier | Input/1M Tokens | Output/1M Tokens
Gemini 2.5 Flash Preview TTS | Free of charge | $0.50 (text) | $10.00 (audio)
Gemini 2.5 Pro Preview TTS | Not available | $1.00 (text) | $20.00 (audio)
From Google

Google Imagen 4 & 3 API Pricing

Model | Paid Tier, per Image in USD
Imagen 4 Fast | $0.02
Imagen 4 Standard | $0.04
Imagen 4 Ultra | $0.06
Imagen 3 | $0.03
From Google

Gemini Embedding API Pricing

Model | Paid Tier, per 1M tokens in USD
gemini-embedding-001 | $0.15
From Google

Google Veo 3.1 API Pricing

Model | Paid Tier, per second in USD
Veo 3.1 Standard | $0.40
Veo 3.1 Fast | $0.15
From Google

Grok API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Grok 4 Fast | 2M | $0.20 | $0.50
Grok 4 | 256,000 | $3.00 | $15.00
grok-code-fast-1 | 256,000 | $0.20 | $1.50
Grok 3 | 131,072 | $3.00 | $15.00
Grok 3 Mini | 131,072 | $0.30 | $0.50
Grok 2 Vision | 32,768 | $2.00 | $10.00
Grok 2 Image | - | 0.07/image | 0.07/image
From xAI

DeepSeek API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
DeepSeek-V3.2-Exp (Non-thinking) | 128K | $0.28 | $0.42
DeepSeek-V3.2-Exp (Thinking Mode) | 128K | $0.28 | $0.42
From DeepSeek

Qwen API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Qwen-Max | 262,144 | $1.60 | $6.40
Qwen 3 (Plus) | 1,000,000 | $0.40 | $0.12
Qwen 3 (Turbo) | 1,000,000; 131,072 (Thinking Mode) | $0.05 | $0.20
Qwen-Flash | 1,000,000 | $0.05 | $0.40
Qwen-Coder | 1,000,000 | $0.30 | $1.50
From Qwen

Mistral (Premier Models) API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Mistral Large | 128K | $2.00 | $6.00
Pixtral Large | 128K | $2.00 | $6.00
Mistral Saba | 128K | $0.20 | $0.60
Mistral Medium 3 | 128K | $0.40 | $2.00
Magistral Medium | 128K | $2.00 | $5.00
Devstral Medium | 128K | $0.40 | $2.00
Codestral | 32K | $0.30 | $0.90
Document AI & OCR | - | OCR: $1/1000 pages; Annotations: $3/1000 pages | -
Voxtral Mini Transcribe | - | Audio Input/min: $0.002 | -
Mistral Embed | 32k | $0.10 | -
Mistral Moderation 24.11 | 32k | $0.10 | -
Magistral Small | 128K | $0.50 | $1.50
Codestral Embed | 128K | $0.15 | -
From Mistral

Mistral (Open Models) API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Pixtral Large | 128K | $2.00 | $6.00
Pixtral 12B | 128K | $0.15 | $0.15
Mistral Nemo | 128K | $0.15 | $0.15
Mistral Small 3.2 | 128K | $0.10 | $0.30
Magistral Small | 128K | $0.50 | $1.50
Devstral Small | 128K | $0.10 | $0.30
Voxtral Mini | - | $0.001 (audio); $0.04 (text) | $0.04
Voxtral Small | - | $0.001 (audio); $0.04 (text) | $0.03
Ministral 8B 24.10 | 32K | $0.10 | $0.10
Mixtral 8x7B | 32K | $0.70 | $0.70
Mixtral 8x22B | 64K | $2.00 | $6.00
From Mistral

Llama 4 & 3 API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Llama 4 Scout | 10M | $0.11 | $0.34
Llama 4 Maverick | 10M | $0.20 | $0.60
Llama 3.3 70B Versatile | 128K | $0.59 | $0.79
Llama 3.3 70B SpecDec | 8192 | $0.59 | $0.99
Llama 3.3 70b Instruct | 128K | $0.23 | $0.40
Llama 3.3 70b Instruct-Turbo | 128K | $0.13 | $0.40
Llama 3.2 90b Vision-Instruct | 128K | $0.35 | $0.40
Llama 3.2 11b Vision-Instruct | 128K | $0.055 | $0.055
Llama 3.1 405B | 128K | $1.79 | $1.79
Llama 3.1 70B | 128K | $0.35 | $0.40
Llama 3.1 8B | 128K | $0.09 | $0.09
From Groq

GLM API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
GLM-4.6 | 128K | $0.60 | $2.20
GLM-4.5 | 128K | $0.60 | $2.20
GLM-4.5v | 64K | $0.60 | $1.80
GLM-4.5-X | 128K | $0.45 | $8.90
GLM-4.5-Air | 128K | $0.20 | $1.10
GLM-4.5-AirX | 128K | $1.10 | $4.50
GLM-4.5-Flash | 128K | FREE | FREE
From z.ai

Kimi K2 API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
kimi-k2-thinking | 262,144 | $0.60 | $2.50
kimi-k2-thinking-turbo | 262,144 | $1.15 | $8.00
kimi-k2-0905-preview | 262,144 | $0.60 | $2.50
kimi-k2-0711-preview | 131K | $0.60 | $2.50
kimi-k2-turbo-preview | 262,144 | $1.15 | $8.00
From moonshot

Minimax M2 API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Minimax M2 | 204,800 | $0.30 | $1.20
From Minimax

Minimax Hailuo API Pricing

Model | Unit Price
MiniMax-Hailuo-2.3-Fast | $0.19 per 768P, 6s video
MiniMax-Hailuo-2.3-Fast | $0.32 per 768P, 10s video
MiniMax-Hailuo-2.3-Fast | $0.33 per 1080P, 6s video
MiniMax-Hailuo-2.3 | $0.28 per 768P, 6s video
MiniMax-Hailuo-2.3 | $0.56 per 768P, 10s video
MiniMax-Hailuo-2.3 | $0.49 per 1080P, 6s video
From Minimax

PPLX API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
pplx-70b-online | 4K | $1.00 | $1.00
pplx-7b-online | 4K | $0.20 | $0.20
From Perplexity

Cohere API Pricing

Model | Context Window | Input/1M Tokens | Output/1M Tokens
Command A | 256K | $2.50 | $10.00
Command R+ | 128K | $2.50 | $10.00
Command R | 128K IN/4K OUT | $0.15 | $0.60
Command R7B | 128K | $0.0375 | $0.15
From Cohere

Changelog:

11/18/2025

  • Added Gemini 3 Pro

11/14/2025

  • Added GPT-5.1
  • Updated Qwen models
  • Updated Kimi models
  • Added Minimax M2

11/08/2025

  • Added codex-mini

10/15/2025

  • Updated for Claude Haiku 4.5 and Veo 3.1.

10/07/2025

  • Added Sora 2 and more OpenAI APIs.

09/29/2025

  • Updated for Claude 4.5 and DeepSeek-V3.2

09/19/2025

  • Updated prices

08/21/2025

  • Updated for DeepSeek-V3.1

08/20/2025

  • Added GLM-4.5 and Kimi K2 models.

08/14/2025

  • Added Imagen 4 Fast

08/07/2025

  • Added GPT-5

08/05/2025

  • Added Claude Opus 4.1

07/15/2025

  • Added Voxtral model family

07/14/2025

  • Added Gemini Embedding

07/11/2025

  • Added Grok 4

06/26/2025

  • Added o3-deep-research and o4-mini-deep-research

06/25/2025

  • Added Imagen 4 Ultra and Imagen 4 Standard

06/20/2025

  • Added Gemini 2.5 Flash-Lite

06/10/2025

  • Added o3-pro
  • Added Mistral Magistral models.

06/10/2025

  • OpenAI dropped the price of o3 by 80%

05/22/2025

  • Updated for Claude 4

05/21/2025

  • Updated Mistral models

05/21/2025

  • Added Gemini 2.5 Flash Native Audio

05/08/2025

  • Added Mistral Medium 3

04/23/2025

  • Added OpenAI Image Generation

04/18/2025

  • Added Gemini 2.5 Flash
  • Updated Cohere models

04/16/2025

  • Added o4-mini

04/14/2025

  • Added gpt-4.1 family

04/11/2025

  • Updated DeepSeek

04/11/2025

  • Added Google Imagen 3 and Veo 2.

04/10/2025

  • Added Qwen models
  • Added Grok 3

04/05/2025

  • Added Gemini 2.5 Pro

03/19/2025

  • Added o1-pro

02/28/2025

  • Added GPT-4.5

02/24/2025

  • Added Claude 3.7

02/21/2025

  • Added Grok

02/06/2025

  • Added Gemini 2.0 Flash and Gemini 2.0 Flash-Lite

02/02/2025

  • Added o3-mini

01/31/2025

  • Added DeepSeek v3 and DeepSeek R1.

12/18/2024

  • o1 in the API comes with support for function calling, developer messages, Structured Outputs, and vision capabilities.

12/07/2024

  • Added Llama 3.3

11/05/2024

  • Added Claude Haiku 3.5

11/03/2024

  • Added Gemini 1.5 Flash-8B

10/04/2024

  • Added Llama 3.2
  • Added gpt-4o-realtime-preview

09/25/2024

  • Updated Google Gemini

09/13/2024

  • Added OpenAI’s latest model: o1.

08/07/2024

  • Added gpt-4o-2024-08-06, the latest gpt-4o snapshot that supports Structured Outputs

07/25/2024

  • Updated Mistral Large 2

07/24/2024

  • Added Llama 3.1 405B

07/20/2024

  • Added GPT-4o-mini
  • Updated prices

07/13/2024

  • Updated
  • Added Cohere’s Command API
