What Are The Rate Limits For OpenAI API?

The Rate Limits For GPT-4: 10K to 300K TPM (tokens per minute), depending on usage tier.

APIs like OpenAI’s allow developers to integrate powerful AI into their applications easily. However, to prevent abuse, APIs enforce rate limits on requests. Let’s look at how OpenAI’s rate limits work.

The OpenAI API enforces limits at the organization level based on the endpoint and account type. There are three key metrics:

  • RPM (requests per minute) – The maximum requests allowed per minute
  • RPD (requests per day) – The maximum requests allowed per day
  • TPM (tokens per minute) – The maximum tokens allowed to be sent per minute

What Is The Rate Limit For GPT-4

Note that your rate limit and spending limit (quota) are automatically adjusted based on a number of factors. As your usage of the OpenAI API goes up and you successfully pay the bill, OpenAI automatically increases your usage tier:

Quick Overview

| Tier | Qualification | Max credits | Request limits | Token limits |
|---|---|---|---|---|
| Free | User must be in an allowed geography | $100 | 3 RPM, 200 RPD | 10K TPM |
| Tier 1 | $5 paid | $100 | 500 RPM, 10K RPD | 20K TPM |
| Tier 2 | $50 paid and 7+ days since first successful payment | $250 | 5000 RPM | 40K TPM |
| Tier 3 | $100 paid and 7+ days since first successful payment | $500 | 5000 RPM | 80K TPM |
| Tier 4 | $250 paid and 14+ days since first successful payment | $1000 | 10K RPM | 300K TPM |
| Tier 5 | $1000 paid and 30+ days since first successful payment | $1000 | 10K RPM | 300K TPM |
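The qualification rules in the overview can be read as a simple lookup: take the highest tier whose payment and waiting-period thresholds are both met. The sketch below is only an illustration of that reading. The names `Tier`, `TIERS`, and `usage_tier` are hypothetical, and OpenAI may weigh other factors when it assigns a tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_paid_usd: int   # total amount paid so far
    min_days: int       # days since first successful payment


# Thresholds copied from the overview, highest tier first.
TIERS = [
    Tier("Tier 5", 1000, 30),
    Tier("Tier 4", 250, 14),
    Tier("Tier 3", 100, 7),
    Tier("Tier 2", 50, 7),
    Tier("Tier 1", 5, 0),
]


def usage_tier(paid_usd: float, days_since_first_payment: int) -> str:
    """Return the highest tier whose two thresholds are both met, else "Free"."""
    for tier in TIERS:
        if paid_usd >= tier.min_paid_usd and days_since_first_payment >= tier.min_days:
            return tier.name
    return "Free"


print(usage_tier(60, 3))     # Tier 1: enough paid for Tier 2, but only 3 days elapsed
print(usage_tier(1000, 30))  # Tier 5
```

Checking tiers from highest to lowest means an account that has paid a lot but is still inside a waiting period falls through to the best tier it already qualifies for.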

Rate Limits For Free Trial Users

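Chat

| Model | TPM | RPM | RPD |
|---|---|---|---|
| gpt-3.5-turbo | 40,000 | 3 | 200 |
| gpt-3.5-turbo-0301 | 40,000 | 3 | 200 |
| gpt-3.5-turbo-0613 | 40,000 | 3 | 200 |
| gpt-3.5-turbo-1106 | 40,000 | 3 | 200 |
| gpt-3.5-turbo-16k | 40,000 | 3 | 200 |
| gpt-3.5-turbo-16k-0613 | 40,000 | 3 | 200 |
| gpt-3.5-turbo-instruct | 150,000 | 3 | 200 |
| gpt-3.5-turbo-instruct-0914 | 150,000 | 3 | 200 |

Text

| Model | TPM | RPM | RPD |
|---|---|---|---|
| ada | 150,000 | 3 | 200 |
| ada-code-search-code | 150,000 | 3 | 200 |
| ada-code-search-text | 150,000 | 3 | 200 |
| ada-search-document | 150,000 | 3 | 200 |
| ada-search-query | 150,000 | 3 | 200 |
| ada-similarity | 150,000 | 3 | 200 |
| babbage | 150,000 | 3 | 200 |
| babbage-002 | 150,000 | 3 | 200 |
| babbage-code-search-code | 150,000 | 3 | 200 |
| babbage-code-search-text | 150,000 | 3 | 200 |
| babbage-search-document | 150,000 | 3 | 200 |
| babbage-search-query | 150,000 | 3 | 200 |
| babbage-similarity | 150,000 | 3 | 200 |
| code-davinci-edit-001 | 150,000 | 3 | 200 |
| code-search-ada-code-001 | 150,000 | 3 | 200 |
| code-search-ada-text-001 | 150,000 | 3 | 200 |
| code-search-babbage-code-001 | 150,000 | 3 | 200 |
| code-search-babbage-text-001 | 150,000 | 3 | 200 |
| curie | 150,000 | 3 | 200 |
| curie-instruct-beta | 150,000 | 3 | 200 |
| curie-search-document | 150,000 | 3 | 200 |
| curie-search-query | 150,000 | 3 | 200 |
| curie-similarity | 150,000 | 3 | 200 |
| davinci | 150,000 | 3 | 200 |
| davinci-instruct-beta | 150,000 | 3 | 200 |
| davinci-search-document | 150,000 | 3 | 200 |
| davinci-search-query | 150,000 | 3 | 200 |
| davinci-similarity | 150,000 | 3 | 200 |
| text-ada-001 | 150,000 | 3 | 200 |
| text-babbage-001 | 150,000 | 3 | 200 |
| text-curie-001 | 150,000 | 3 | 200 |
| text-davinci-001 | 150,000 | 3 | 200 |
| text-davinci-002 | 150,000 | 3 | 200 |
| text-davinci-003 | 150,000 | 3 | 200 |
| text-davinci-edit-001 | 150,000 | 3 | 200 |
| text-embedding-ada-002 | 150,000 | 3 | 200 |
| text-search-ada-doc-001 | 150,000 | 3 | 200 |
| text-search-ada-query-001 | 150,000 | 3 | 200 |
| text-search-babbage-doc-001 | 150,000 | 3 | 200 |
| text-search-babbage-query-001 | 150,000 | 3 | 200 |
| text-search-curie-doc-001 | 150,000 | 3 | 200 |
| text-search-curie-query-001 | 150,000 | 3 | 200 |
| text-search-davinci-doc-001 | 150,000 | 3 | 200 |
| text-search-davinci-query-001 | 150,000 | 3 | 200 |
| text-similarity-ada-001 | 150,000 | 3 | 200 |
| text-similarity-babbage-001 | 150,000 | 3 | 200 |
| text-similarity-curie-001 | 150,000 | 3 | 200 |
| text-similarity-davinci-001 | 150,000 | 3 | 200 |
| tts-1 | 150,000 | 3 | 200 |
| tts-1-1106 | 150,000 | 3 | 200 |
| tts-1-hd | 150,000 | 3 | 200 |
| tts-1-hd-1106 | 150,000 | 3 | 200 |

Moderation

| Model | TPM | RPM |
|---|---|---|
| text-moderation-latest | 150,000 | 3 |
| text-moderation-stable | 150,000 | 3 |

Fine-tuning Inference

| Model | TPM | RPM |
|---|---|---|
| babbage-002 | 150,000 | 3 |
| davinci-002 | 150,000 | 3 |
| gpt-3.5-turbo-0613 | 40,000 | 3 |

Fine-tuning Training

| Model | Active / queued jobs | Jobs per day |
|---|---|---|
| babbage-002 | 3 | 48 |
| davinci-002 | 3 | 48 |
| gpt-3.5-turbo-0613 | 3 | 48 |

Image

| Model | Limits |
|---|---|
| DALL·E 2 | 3 RPM, 200 RPD, 5 images per minute |
| DALL·E 3 | 3 RPM, 200 RPD, 1 image per minute |

Audio

| Model | RPM | RPD |
|---|---|---|
| whisper-1 | 3 | 200 |

Other

| Model | TPM | RPM | RPD |
|---|---|---|---|
| Default limits for all other models | 150,000 | 3 | 200 |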

Rate Limits For Pay-as-you-go Users (Tier 1)

Chat

| Model | TPM | RPM |
|---|---|---|
| gpt-3.5-turbo | 90,000 | 3,500 |
| gpt-3.5-turbo-0301 | 90,000 | 3,500 |
| gpt-3.5-turbo-0613 | 90,000 | 3,500 |
| gpt-3.5-turbo-1106 | 180,000 | 3,500 |
| gpt-3.5-turbo-16k | 180,000 | 3,500 |
| gpt-3.5-turbo-16k-0613 | 180,000 | 3,500 |
| gpt-3.5-turbo-instruct | 250,000 | 3,000 |
| gpt-3.5-turbo-instruct-0914 | 250,000 | 3,000 |
| gpt-4 | 10,000 | 500 |
| gpt-4-0314 | 10,000 | 500 |
| gpt-4-0613 | 10,000 | 500 |
| gpt-4-1106-preview | 10,000 | 20 RPM, 100 RPD |
| gpt-4-vision-preview | 10,000 | 20 RPM, 100 RPD |

Text

| Model | TPM | RPM |
|---|---|---|
| ada | 250,000 | 3,000 |
| ada-code-search-code | 250,000 | 3,000 |
| ada-code-search-text | 250,000 | 3,000 |
| ada-search-document | 250,000 | 3,000 |
| ada-search-query | 250,000 | 3,000 |
| ada-similarity | 250,000 | 3,000 |
| babbage | 250,000 | 3,000 |
| babbage-002 | 250,000 | 3,000 |
| babbage-code-search-code | 250,000 | 3,000 |
| babbage-code-search-text | 250,000 | 3,000 |
| babbage-search-document | 250,000 | 3,000 |
| babbage-search-query | 250,000 | 3,000 |
| babbage-similarity | 250,000 | 3,000 |
| code-davinci-edit-001 | 150,000 | 20 |
| code-search-ada-code-001 | 250,000 | 3,000 |
| code-search-ada-text-001 | 250,000 | 3,000 |
| code-search-babbage-code-001 | 250,000 | 3,000 |
| code-search-babbage-text-001 | 250,000 | 3,000 |
| curie | 250,000 | 3,000 |
| curie-instruct-beta | 250,000 | 3,000 |
| curie-search-document | 250,000 | 3,000 |
| curie-search-query | 250,000 | 3,000 |
| curie-similarity | 250,000 | 3,000 |
| davinci | 250,000 | 3,000 |
| davinci-002 | 250,000 | 3,000 |
| davinci-instruct-beta | 250,000 | 3,000 |
| davinci-search-document | 250,000 | 3,000 |
| davinci-search-query | 250,000 | 3,000 |
| davinci-similarity | 250,000 | 3,000 |
| text-ada-001 | 250,000 | 3,000 |
| text-babbage-001 | 250,000 | 3,000 |
| text-curie-001 | 250,000 | 3,000 |
| text-davinci-001 | 250,000 | 3,000 |
| text-davinci-002 | 250,000 | 3,000 |
| text-davinci-003 | 250,000 | 3,000 |
| text-davinci-edit-001 | 150,000 | 20 |
| text-embedding-ada-002 | 1,000,000 | 3,000 |
| text-search-ada-doc-001 | 250,000 | 3,000 |
| text-search-ada-query-001 | 250,000 | 3,000 |
| text-search-babbage-doc-001 | 250,000 | 3,000 |
| text-search-babbage-query-001 | 250,000 | 3,000 |
| text-search-curie-doc-001 | 250,000 | 3,000 |
| text-search-curie-query-001 | 250,000 | 3,000 |
| text-search-davinci-doc-001 | 250,000 | 3,000 |
| text-search-davinci-query-001 | 250,000 | 3,000 |
| text-similarity-ada-001 | 250,000 | 3,000 |
| text-similarity-babbage-001 | 250,000 | 3,000 |
| text-similarity-curie-001 | 250,000 | 3,000 |
| text-similarity-davinci-001 | 250,000 | 3,000 |
| tts-1 | - | 50 |
| tts-1-1106 | 250,000 | 3,000 |
| tts-1-hd | - | 50 |
| tts-1-hd-1106 | 250,000 | 3,000 |

Moderation

| Model | TPM | RPM |
|---|---|---|
| text-moderation-latest | 150,000 | 1,000 |
| text-moderation-stable | 150,000 | 1,000 |

Fine-tuning Inference

| Model | TPM | RPM |
|---|---|---|
| babbage-002 | 250,000 | 3,000 |
| davinci-002 | 250,000 | 3,000 |
| gpt-3.5-turbo-0613 | 90,000 | 3,500 |

Fine-tuning Training

| Model | Active / queued jobs | Jobs per day |
|---|---|---|
| babbage-002 | 3 | 48 |
| davinci-002 | 3 | 48 |
| gpt-3.5-turbo-0613 | 3 | 48 |

Image

| Model | Images per minute |
|---|---|
| DALL·E 2 | 5 |
| DALL·E 3 | 5 |

Audio

| Model | RPM |
|---|---|
| whisper-1 | 50 |

Other

| Model | TPM | RPM |
|---|---|---|
| Default limits for all other models | 250,000 | 3,000 |

What Are The Differences Between Rate Limits And Token Limits

Rate limits restrict how many requests, and how many tokens, you can send within a given period. Token limits restrict the number of tokens (pieces of text, often shorter than a word) that a model accepts in a single request. For example, gpt-4-32k-0613 has a max of 32,768 tokens per request. You can’t increase the token limit; you can only reduce the number of tokens per request.
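A per-request token limit can be checked before a request is sent. The sketch below assumes the limit covers the prompt and the generated completion together, and that you already have token counts from a tokenizer of your choice. The function names are hypothetical.

```python
MAX_TOKENS_PER_REQUEST = 32_768  # gpt-4-32k-0613, as quoted above


def fits_token_limit(prompt_tokens: int, completion_tokens: int,
                     limit: int = MAX_TOKENS_PER_REQUEST) -> bool:
    """True if the prompt plus the requested completion stays within the limit."""
    return prompt_tokens + completion_tokens <= limit


def max_completion_tokens(prompt_tokens: int,
                          limit: int = MAX_TOKENS_PER_REQUEST) -> int:
    """Tokens left for the completion once the prompt is counted (never negative)."""
    return max(0, limit - prompt_tokens)


print(fits_token_limit(30_000, 2_768))  # True: exactly at the limit
print(fits_token_limit(30_000, 2_769))  # False: one token over
print(max_completion_tokens(30_000))    # 2768
```

If the check fails, the only options are a shorter prompt or a smaller completion budget, since the limit itself is fixed per model.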

See Also: What Is The Max Token Limit In OpenAI ChatGPT

What Is TPM

TPM, or Tokens Per Minute, refers to the number of tokens your organization can send to the OpenAI API within a minute. Tokens are chunks of text, such as words, parts of words, or individual characters, that the model processes. The TPM limit ensures that the server can handle the volume of data being processed without being overwhelmed.
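TPM and RPM apply together, so whichever runs out first sets the real ceiling. As a rough worked example, the sketch below estimates how many requests per minute fit under both limits for a given average request size. The function name and the average request size are assumptions for illustration.

```python
def effective_requests_per_minute(rpm: int, tpm: int, tokens_per_request: int) -> int:
    """Requests per minute that fit under both the RPM and the TPM limit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm, tpm // tokens_per_request)


# gpt-4 at Tier 1: 500 RPM and 10,000 TPM.
print(effective_requests_per_minute(500, 10_000, 1_000))  # 10: TPM is the bottleneck
print(effective_requests_per_minute(500, 10_000, 10))     # 500: RPM is the bottleneck
```

With 1,000-token requests, the 10,000 TPM limit is exhausted after 10 requests, long before the 500 RPM limit matters.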

What Is RPM

RPM, or Requests Per Minute, measures how many requests your organization can make to the OpenAI API within a minute. This limit is set to prevent overloading the server and to ensure fair usage among all users. The exact number varies depending on the endpoint used and the type of account you have.
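One way to stay under an RPM limit is to throttle on the client side. The sketch below uses a sliding 60-second window, which is an assumption for illustration and not necessarily how OpenAI measures the limit. The class name `RequestThrottle` is hypothetical, and the clock and sleep functions are injectable so the logic can be tested without waiting.

```python
import time
from collections import deque


class RequestThrottle:
    """Blocks until sending one more request keeps us at or under `rpm`."""

    def __init__(self, rpm: int, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests sent in the last 60 s

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have left the 60-second window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.rpm:
            return 0.0
        return 60.0 - (now - self._sent[0])

    def acquire(self) -> None:
        """Wait if necessary, then record that a request is being sent."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._sent.append(self._clock())
```

Call `acquire()` immediately before each API request. With `RequestThrottle(3)`, the first three calls return at once and the fourth waits until the oldest request is 60 seconds old.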

What Is RPD

RPD, or Requests Per Day, is another rate limit set by OpenAI. It determines the total number of requests your organization can make to the API within a 24-hour period. Most models have no RPD limit listed for Pay-as-you-go users, though the tables above show exceptions, such as the 100 RPD cap on gpt-4-1106-preview and gpt-4-vision-preview.
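To see how a daily cap interacts with a per-minute limit, the arithmetic can be written out directly. This sketch uses the free trial figures from the tables above (3 RPM, 200 RPD). The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def minutes_until_daily_cap(rpm: int, rpd: int) -> float:
    """Minutes of sending at the full per-minute rate before the daily cap is hit."""
    return rpd / rpm


def seconds_between_requests(rpd: int) -> float:
    """Spacing that spreads the daily allowance evenly over 24 hours."""
    return SECONDS_PER_DAY / rpd


print(round(minutes_until_daily_cap(3, 200), 1))  # 66.7
print(seconds_between_requests(200))              # 432.0
```

In other words, a free trial account sending at the full 3 RPM uses up its 200 daily requests in a little over an hour, while one request every 432 seconds lasts the whole day.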

What Happens If The Rate Limit Is Reached

If your organization reaches its rate limit, the OpenAI API will stop fulfilling further requests until enough time has passed. This is to prevent server overload and maintain service quality. If you encounter a rate limit error, it means you’ve exceeded your limit and need to wait before making more requests:

Rate limit reached for default-text-davinci-002 in organization org-{id} on requests per min. Limit: 20.000000 / min. Current: 24.000000 / min.
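A common way to handle this error is to retry with exponential backoff, waiting longer after each failure. The sketch below is a generic pattern, not OpenAI's own code. `RateLimitError` is a hypothetical stand-in for whatever exception your client library raises, and the delay values are assumptions you should tune.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate limit exception raised by your client."""


def call_with_backoff(fn, max_retries: int = 5, base_delay: float = 1.0,
                      sleep=time.sleep):
    """Call fn(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Random jitter (0 to 1 s) keeps parallel clients from retrying in lockstep.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

Wrap the API call in a function or lambda and pass it as `fn`. With the defaults, the waits are roughly 1, 2, 4, 8, and 16 seconds before the error is re-raised.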
