OpenAI API Rate Limits (2025 Update): GPT-5, Sora 2 & Free Tier

Full 2025 OpenAI rate limits by model and tier. Compare RPM, TPM, and batch limits for GPT-5, Sora 2, GPT Image 1, and free trial users with clear tables.

The OpenAI API applies rate limits to every account to keep the platform stable and fair.

These limits control how many requests and tokens you can send per minute, and they vary by model, tier, and payment history.

This page lists all current OpenAI API rate limits (2025), including free trial and Tier 1–5 users, with quick-reference charts for GPT-5, o1, o3, o4-mini, Sora 2, and GPT Image 1 models.

Quick Summary Table (2025)

| Model | Free Tier | Tier 1 | Tier 3 | Tier 5 | Notes |
| --- | --- | --- | --- | --- | --- |
| GPT-5 | Not supported | 500 RPM / 500K TPM | 5K RPM / 2M TPM | 15K RPM / 40M TPM | Flagship text model |
| GPT-5 Mini | Not supported | 500 RPM / 500K TPM | 5K RPM / 4M TPM | 30K RPM / 180M TPM | Compact variant |
| o1 | Not supported | 500 RPM / 30K TPM | 5K RPM / 800K TPM | 10K RPM / 30M TPM | Reasoning model |
| o3 | Not supported | 500 RPM / 30K TPM | 5K RPM / 800K TPM | 10K RPM / 150M TPM | Fast general model |
| o4-mini | Not supported | 1K RPM / 100K TPM | 5K RPM / 4M TPM | 30K RPM / 150M TPM | Vision-capable model |
| gpt-image-1 | Not supported | 5 img/min | 50 img/min | 250 img/min | Image generation model |
| sora-2 | Not supported | 25 RPM | 125 RPM | 375 RPM | Video generation model |

Understanding the Three Main Rate Limits

OpenAI measures your API use in three main ways. Hitting any one of these limits will cause your requests to be temporarily blocked.

  • RPM (Requests Per Minute): This is the total number of API requests you can make in one minute.
  • TPM (Tokens Per Minute): This is the total number of tokens your account can process in one minute. Tokens are small pieces of text, where 1,000 tokens is about 750 words. Both your input and the model’s output count towards this limit.
  • RPD (Requests Per Day): This is the maximum number of requests you can make over 24 hours.
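Since both input and output count toward TPM, it helps to estimate token usage before sending a request. A minimal sketch based on the ~750-words-per-1,000-tokens rule of thumb above (a rough heuristic only; OpenAI's tiktoken library gives exact counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from the ~750 words per 1,000 tokens rule of thumb."""
    words = len(text.split())
    return round(words / 0.75)  # tokens ~= words / 0.75
```

Remember to budget for the model's response as well as your prompt, since both draw from the same TPM pool.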

How OpenAI Rate Limits Work

OpenAI applies rate limits at the organization level. The limits vary based on your endpoint, account tier, and usage history. OpenAI tracks your usage and automatically raises your limits as you build a reliable payment history.

Current Rate Limits by Usage Tier

OpenAI automatically assigns you to usage tiers based on your payment history and API usage patterns. Higher tiers get better rate limits and access to newer models.

| Tier | Qualification | Max Credits |
| --- | --- | --- |
| Free | User must be in an allowed geography | $100 |
| Tier 1 | $5 paid | $100 |
| Tier 2 | $50 paid and 7+ days since first successful payment | $500 |
| Tier 3 | $100 paid and 7+ days since first successful payment | $1,000 |
| Tier 4 | $250 paid and 14+ days since first successful payment | $5,000 |
| Tier 5 | $1,000 paid and 30+ days since first successful payment | $200,000 |

Rate Limits For Pay-as-you-go Users (Tier 1 – Tier 5)

Each cell lists tiers in order T1 / T2 / T3 / T4 / T5; "–" means the limit is not published for that model.

| Model | RPM | RPD | TPM | Batch Queue Limit |
| --- | --- | --- | --- | --- |
| gpt-5 | 500 / 5,000 / 5,000 / 10,000 / 15,000 | – | 500,000 / 1,000,000 / 2,000,000 / 4,000,000 / 40,000,000 | 150,000 / 3,000,000 / 100,000,000 / 200,000,000 / 15,000,000,000 |
| gpt-5-mini | 500 / 5,000 / 5,000 / 10,000 / 30,000 | – | 500,000 / 2,000,000 / 4,000,000 / 10,000,000 / 180,000,000 | 5,000,000 / 20,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| gpt-5-nano | 500 / 5,000 / 5,000 / 10,000 / 30,000 | – | 200,000 / 2,000,000 / 4,000,000 / 10,000,000 / 180,000,000 | 2,000,000 / 20,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| gpt-5-pro | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| gpt-4.1 | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| gpt-4.1-mini | 500 / 5,000 / 5,000 / 10,000 / 30,000 | 10,000 (T1) | 200,000 / 2,000,000 / 4,000,000 / 10,000,000 / 150,000,000 | 2,000,000 / 20,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| gpt-4.1-nano | 500 / 5,000 / 5,000 / 10,000 / 30,000 | 10,000 (T1) | 200,000 / 2,000,000 / 4,000,000 / 10,000,000 / 150,000,000 | 2,000,000 / 20,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| o4-mini | 1,000 / 2,000 / 5,000 / 10,000 / 30,000 | – | 100,000 / 200,000 / 4,000,000 / 10,000,000 / 150,000,000 | 1,000,000 / 2,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| o4-mini-deep-research | 1,000 / 2,000 / 5,000 / 10,000 / 30,000 | – | 200,000 / 2,000,000 / 4,000,000 / 10,000,000 / 150,000,000 | 200,000 / 300,000 / 500,000 / 2,000,000 / 10,000,000 |
| o3-pro | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| o3 | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| o3-mini | 1,000 / 2,000 / 5,000 / 10,000 / 30,000 | – | 100,000 / 200,000 / 4,000,000 / 10,000,000 / 150,000,000 | 1,000,000 / 2,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| o3-deep-research | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 20,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 200,000 / 300,000 / 500,000 / 2,000,000 / 10,000,000 |
| o1-pro | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| o1 | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| o1-mini | 1,000 / 2,000 / 5,000 / 10,000 / 30,000 | – | 100,000 / 200,000 / 4,000,000 / 10,000,000 / 150,000,000 | 1,000,000 / 2,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| Sora 2 Pro (sora-2-pro) | 10 / 25 / 50 / 75 / 150 | – | – | – |
| Sora 2 (sora-2) | 25 / 50 / 125 / 200 / 375 | – | – | – |
| gpt-4o | 500 / 5,000 / 5,000 / 10,000 / 10,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| gpt-4o-mini | 500 / 5,000 / 5,000 / 10,000 / 30,000 | 10,000 (T1) | 200,000 / 2,000,000 / 4,000,000 / 10,000,000 / 150,000,000 | 2,000,000 / 20,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| gpt-4o-audio | 500 / 5,000 / 5,000 / 10,000 / 30,000 | – | 30,000 / 450,000 / 800,000 / 2,000,000 / 30,000,000 | 90,000 / 1,350,000 / 50,000,000 / 200,000,000 / 5,000,000,000 |
| gpt-4o-mini-audio | 500 / 5,000 / 5,000 / 10,000 / 30,000 | 10,000 (T1) | 200,000 / 2,000,000 / 4,000,000 / 10,000,000 / 150,000,000 | 2,000,000 / 20,000,000 / 40,000,000 / 1,000,000,000 / 15,000,000,000 |
| GPT-4o Realtime | 200 / 400 / 5,000 / 10,000 / 20,000 | 10,000 (T1) | 40,000 / 200,000 / 800,000 / 4,000,000 / 15,000,000 | – |
| GPT-4o Mini Realtime | 200 / 400 / 5,000 / 10,000 / 20,000 | 10,000 (T1) | 40,000 / 200,000 / 800,000 / 4,000,000 / 15,000,000 | – |
| gpt-image-1 | 5 / 20 / 50 / 100 / 250 img/min | – | 100,000 / 250,000 / 800,000 / 3,000,000 / 8,000,000 | – |
| gpt-image-1-mini | 5 / 20 / 50 / 150 / 250 img/min | – | 100,000 / 250,000 / 800,000 / 3,000,000 / 8,000,000 | – |
| dall-e-3 | 500 / 2,500 / 5,000 / 7,500 / 10,000 img/min | – | – | – |
| dall-e-2 | 500 / 2,500 / 5,000 / 7,500 / 10,000 img/min | – | – | – |
| GPT-4o mini TTS | 500 / 2,000 / 5,000 / 10,000 / 10,000 | – | 50,000 / 150,000 / 600,000 / 2,000,000 / 8,000,000 | – |
| tts-1 | 500 / 2,000 / 5,000 / 10,000 / 10,000 | – | – | – |
| TTS-1 HD | 500 / 2,000 / 5,000 / 10,000 / 10,000 | – | – | – |

Free Tier Rate Limits (2025)

Note: Free / trial API accounts may not have access to all models (e.g. GPT-5, o4-mini) or may be subject to stricter limits. The numbers here assume model availability when permitted by OpenAI.

Columns: TPM (tokens/min), RPM (requests/min), RPD (requests/day), TPD (tokens/day); "–" means not published.

Chat

| Model | TPM | RPM | RPD | TPD |
| --- | --- | --- | --- | --- |
| gpt-5 (Not supported) | 10,000 | 3 | 200 | 900,000 |
| gpt-5-mini (Not supported) | 60,000 | 3 | 200 | 200,000 |
| gpt-5-nano (Not supported) | 60,000 | 3 | 200 | 200,000 |
| gpt-5-pro (Not supported) | 30,000 | 3 | 200 | 90,000 |
| gpt-5-search-api (Not supported) | 3,000 | 3 | 200 | – |
| gpt-4.1 | 10,000 | 3 | 200 | 900,000 |
| gpt-4.1 (long context) | 60,000 | 3 | 200 | 200,000 |
| gpt-4.1-mini | 60,000 | 3 | 200 | 200,000 |
| gpt-4.1-mini (long context) | 120,000 | 3 | 200 | 400,000 |
| gpt-4.1-nano | 60,000 | 3 | 200 | 200,000 |
| gpt-4.1-nano (long context) | 120,000 | 3 | 200 | 400,000 |
| gpt-4o | 10,000 | 3 | 200 | 900,000 |
| gpt-4o-audio-preview | 150,000 | 3 | 200 | – |
| gpt-4o-search-preview | 3,000 | 3 | 200 | – |
| gpt-4o-transcribe | – | – | 200 | – |
| gpt-4o-mini | 60,000 | 3 | 200 | 200,000 |
| gpt-4o-mini-search-preview | 3,000 | 3 | 200 | – |
| gpt-4o-mini-transcribe | – | – | 200 | – |
| gpt-3.5-turbo | 40,000 | 3 | 200 | 200,000 |
| gpt-3.5-turbo-0125 | 40,000 | 3 | 200 | 200,000 |
| gpt-3.5-turbo-1106 | 40,000 | 3 | 200 | 200,000 |
| gpt-3.5-turbo-16k | 40,000 | 3 | 200 | 540,000 |
| gpt-3.5-turbo-instruct | 90,000 | 3 | 200 | 200,000 |
| gpt-3.5-turbo-instruct-0914 | 90,000 | 3 | 200 | 200,000 |
Text

| Model | TPM | RPM | RPD | TPD |
| --- | --- | --- | --- | --- |
| o1 | 150,000 | 3 | 200 | 90,000 |
| o1-mini | 150,000 | 3 | 200 | – |
| o3 | 100,000 | 3 | 200 | 90,000 |
| o3-mini | 1,000,000 | 1 | 50 | 200,000 |
| o4-mini | 100,000 | 3 | 200 | 90,000 |
| babbage-002 | 150,000 | 3 | 200 | – |
| davinci-002 | 150,000 | 3 | 200 | – |
| text-embedding-3-large | 40,000 | 100 | 2,000 | – |
| text-embedding-3-small | 40,000 | 100 | 2,000 | – |
| text-embedding-ada-002 | 40,000 | 100 | 2,000 | – |
Audio

| Model | TPM | RPM | RPD |
| --- | --- | --- | --- |
| gpt-4o-mini-tts | – | – | 200 |
| tts-1 | 150,000 | 3 | 200 |
| tts-1-1106 | 150,000 | 3 | 200 |
| tts-1-hd | 150,000 | 3 | 200 |
| tts-1-hd-1106 | 150,000 | 3 | 200 |
| whisper-1 | 150,000 | 3 | 200 |
Moderation

| Model | TPM | RPM | RPD |
| --- | --- | --- | --- |
| omni-moderation-2024-09-26 | 5,000 | 250 | 10,000 |
| omni-moderation-latest | 5,000 | 250 | 10,000 |
| text-moderation-stable | 150,000 | 3 | 200 |
| text-moderation-latest | 150,000 | 3 | 200 |
Fine-tuning Inference

| Model | TPM | RPM |
| --- | --- | --- |
| babbage-002 | 150,000 | 3 |
| davinci-002 | 150,000 | 3 |
| gpt-3.5-turbo-0125 | 40,000 | 3 |
| gpt-3.5-turbo-0613 | 40,000 | 3 |
| gpt-3.5-turbo-1106 | 40,000 | 3 |
| gpt-4-0613 | 40,000 | 3 |
| gpt-4o-2024-05-13 | 10,000 | 3 |
| gpt-4o-mini-2024-07-18 | 60,000 | 3 |
Fine-tuning Training

| Model | Active / Queued Jobs | Jobs Per Day |
| --- | --- | --- |
| babbage-002 | 3 | 48 |
| davinci-002 | 3 | 48 |
| gpt-3.5-turbo-0613 | 3 | 48 |
Image

| Model | Limits |
| --- | --- |
| DALL·E 2 | 150,000 TPM, 3 RPM, 200 RPD, 5 images per minute |
| DALL·E 3 | 150,000 TPM, 3 RPM, 200 RPD |
| gpt-image-1 (Not supported) | 3 RPM, 200 RPD |
| gpt-image-1-mini (Not supported) | 3 RPM, 200 RPD |
Video

| Model | Limits |
| --- | --- |
| sora-2 (Not supported) | 150,000 TPM, 200 RPD |
| sora-2-pro (Not supported) | 150,000 TPM, 200 RPD |
Other

Default limits for all other models: 150,000 TPM, 3 RPM, 200 RPD.

Rate Limits vs Token Limits: What’s the Difference?

Rate limits control API access over time. Token limits control individual request size. These are separate constraints that both affect your applications.

Rate limits reset on a time schedule. Token limits apply to each individual request and can’t be increased. For example, GPT-4 models have maximum context windows that you can’t exceed regardless of your rate limits.

You can work around rate limits by spacing out requests or upgrading your tier. Token limits require splitting large requests into smaller chunks.
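Splitting an oversized request can be sketched like this. The chunker below uses the rough heuristic of ~4 characters per token for English text; a real tokenizer such as OpenAI's tiktoken library gives exact boundaries:

```python
def chunk_text(text: str, max_tokens: int, chars_per_token: int = 4) -> list[str]:
    """Split `text` into pieces that each fit under a per-request token budget.

    Character-based heuristic (~4 chars/token for English); swap in a real
    tokenizer when exact counts matter.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Each chunk is then sent as its own request, so chunking trades one oversized call for several smaller ones that also consume RPM, which is worth keeping in mind near your request limit.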

See Also: What Is The Max Token Limit In OpenAI ChatGPT

What Happens When You Hit Rate Limits

When you exceed any rate limit, OpenAI returns a 429 “Too Many Requests” error. Further requests keep failing until the limit window resets.

The error message tells you which limit you hit and when you can try again. Here’s what a typical error looks like:

Rate limit reached for gpt-4 in organization org-abc123 on requests per min. 
Limit: 3 / min. Current: 4 / min. Please try again in 20s.

Your application needs to handle these errors gracefully to avoid breaking user experience.
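You can also see trouble coming before the 429 arrives: OpenAI attaches `x-ratelimit-*` headers to API responses showing your limits and remaining capacity. A minimal reader (header names follow OpenAI's documented scheme; values arrive as strings):

```python
def read_rate_limit_headers(headers: dict) -> dict:
    """Extract the x-ratelimit-* headers OpenAI attaches to API responses.

    Missing headers come back as None; values are strings as sent on the wire.
    """
    keys = (
        "x-ratelimit-limit-requests",
        "x-ratelimit-remaining-requests",
        "x-ratelimit-limit-tokens",
        "x-ratelimit-remaining-tokens",
    )
    return {k: headers.get(k) for k in keys}
```

Checking `remaining-requests` and `remaining-tokens` after each response lets you throttle proactively instead of reacting to errors.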

How to Increase Your Rate Limits

Rate limits increase automatically as you demonstrate reliable usage patterns. Consistent API usage and successful payments signal to OpenAI that you need higher limits.

You can also request manual increases for legitimate business needs. Contact OpenAI support with details about your use case and expected traffic patterns.

Enterprise customers get custom rate limits based on their specific requirements. This option works best for large-scale applications with predictable usage patterns.

Best Practices for Handling Rate Limits

Implement Exponential Backoff

When you hit a rate limit, wait before retrying. Start with a short delay and increase it with each failed attempt. This prevents hammering the API and gives limits time to reset.
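A minimal sketch of that retry loop, with full jitter so concurrent clients don't retry in lockstep (`RateLimitError` and `make_request` stand in for whatever your API client raises and calls):

```python
import random
import time

class RateLimitError(Exception):
    """Stands in for the 429 error your API client raises."""

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-indexed): exponential growth,
    capped, with full jitter."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))

def call_with_retries(make_request, max_retries: int = 5):
    """Run `make_request`, sleeping with exponential backoff on each 429."""
    for attempt in range(max_retries):
        try:
            return make_request()
        except RateLimitError:
            time.sleep(backoff_delay(attempt))
    raise RuntimeError("rate limited after all retries")
```

The cap keeps worst-case waits bounded, and the jitter spreads retries out so a burst of throttled clients doesn't hammer the API in unison.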

Switch to fallback models when your primary model hits limits. This keeps your application responsive even when preferred models are throttled.
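One way to sketch such a fallback chain (the model names and ordering here are illustrative assumptions, not a recommendation):

```python
# Hypothetical preference order: flagship first, cheaper variants as fallbacks.
FALLBACK_CHAIN = ["gpt-5", "gpt-5-mini", "gpt-4o-mini"]

def next_model(current: str):
    """Return the next model to try after `current` is throttled, or None if exhausted."""
    i = FALLBACK_CHAIN.index(current)
    return FALLBACK_CHAIN[i + 1] if i + 1 < len(FALLBACK_CHAIN) else None
```

Because each model has its own RPM/TPM pools, falling back to a different model draws on fresh capacity rather than the pool you just exhausted.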

Use Batch Processing

If real-time responses aren’t needed, use batch API processing to reduce API calls. Batch requests are more efficient and help you stay within rate limits.
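The Batch API takes a JSONL file where each line is one request. A sketch of building a line in OpenAI's documented input format (the `custom_id` lets you match results back to requests):

```python
import json

def build_batch_line(custom_id: str, model: str, prompt: str) -> str:
    """One JSONL line for a Batch API input file
    (custom_id / method / url / body, per OpenAI's documented format)."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })
```

You upload the assembled `.jsonl` file and create a batch job against it; batched work counts against the separate batch queue limits rather than your per-minute RPM/TPM.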

Monitor Your Usage

Track your API usage patterns to predict when you might hit limits. Most rate limit errors happen during usage spikes, so monitoring helps you plan capacity.

Cache Responses When Possible

Store API responses for repeated requests. This reduces your total API calls and helps you stay within limits during high-traffic periods.
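A minimal in-memory sketch of that idea; `fetch` is a placeholder for whatever function actually calls the API, and only runs on a cache miss:

```python
import hashlib
import json

_cache = {}

def cached_completion(model: str, prompt: str, fetch) -> str:
    """Return a cached response for identical (model, prompt) pairs.

    `fetch(model, prompt)` is your real API call; it is skipped on a cache hit,
    saving both RPM and TPM.
    """
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = fetch(model, prompt)
    return _cache[key]
```

In production you would add an expiry policy and a shared store such as Redis, but the principle is the same: identical requests should cost you one API call, not many.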

Request Tier Upgrades

If you consistently hit rate limits, contact OpenAI support to request a tier upgrade. Higher tiers get better limits and priority access.

Monitoring and Troubleshooting

Check your current usage and limits in the OpenAI dashboard. This shows your tier, current usage, and remaining capacity across all limit types.

Set up monitoring alerts before you hit limits. This gives you time to implement rate limit handling or request increases before users experience errors.

Log rate limit errors with enough detail to understand patterns. Track which endpoints, times of day, and usage patterns trigger limits most often.

Future-Proofing Your Application

Rate limits will change as OpenAI scales and adds new models. Build your application to handle different limit types and error responses gracefully.

Keep your rate limit handling code flexible. Hard-coding specific limits or retry strategies makes your application brittle when OpenAI updates their systems.

Plan for growth by designing systems that can handle rate limits at any scale. This includes user feedback, graceful degradation, and alternative approaches when limits are reached.

FAQs

Q: What’s the difference between free and paid rate limits?
A: Free accounts are heavily restricted, usually 3 requests per minute and limited token throughput. Paid users get higher limits depending on tier.

Q: Do higher tiers also unlock faster model responses?
A: No. Rate limits affect request volume, not model speed. Model response time depends on model type and input length.

Q: How can I increase my rate limit faster?
A: Maintain a consistent usage pattern and verified payment. Accounts that show stable traffic are automatically upgraded.

Q: Why am I still getting 429 errors even below my limits?
A: This can occur from organization-level throttling or simultaneous usage from multiple API keys under the same org.

Q: Are limits the same for all endpoints?
A: No. Each API (Chat, Embeddings, Audio, Images, etc.) has distinct limits defined by endpoint and model type.

Q: Can I request custom limits for enterprise use?
A: Yes. Large-scale or production-level users can apply for custom rate limits via OpenAI Support.

Changelog:

10/18/2025

  • Update rate limits

09/12/2025

  • gpt-5 and gpt-5-mini API rate limits are now more than doubled for T1-T4 tiers

08/07/2025

  • Updated for GPT-5

04/25/2025

  • Updated for gpt-4.1, o4, o3.
  • Removed old models like gpt-3.5

12/18/2024

  • Updated for o1

10/07/2024

  • Added gpt-4o-realtime-preview

10/02/2024

  • Update o1-preview & o1-mini

09/13/2024

  • Added o1-preview & o1-mini

08/07/2024

  • Updated for GPT-4o-mini
  • Clean up

05/14/2024

  • Updated for GPT-4o
