# API Rate Limiting Best Practices for Developers

## Why Rate Limiting Exists
Every API has limits. Whether it's 100 requests per minute or 10,000 per day, rate limiting protects API providers from abuse, ensures fair usage, and keeps infrastructure costs manageable.
As a developer consuming APIs, understanding rate limits isn't optional — it's the difference between a reliable application and one that randomly breaks at scale.
## Common Rate Limiting Strategies

### Fixed Window
The simplest approach. You get N requests per time window (e.g., 100 requests per minute). The counter resets at the start of each window.
- Pro: Easy to understand and implement
- Con: Burst traffic at window boundaries can cause issues
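A fixed-window counter takes only a few lines. Here is a minimal sketch (in-memory, single process; the class name and parameters are illustrative, not from any particular library):

```javascript
// Fixed-window counter: allow `limit` requests per `windowMs` window.
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windowStart = 0;
    this.count = 0;
  }

  // Returns true if a request at time `now` (ms) is allowed.
  allow(now = Date.now()) {
    // Align to the start of the current window.
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    if (windowStart !== this.windowStart) {
      this.windowStart = windowStart; // new window: reset the counter
      this.count = 0;
    }
    if (this.count < this.limit) {
      this.count++;
      return true;
    }
    return false;
  }
}
```

The boundary problem is visible in the code: nothing stops a client from using its full quota at the end of one window and again immediately after the reset.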
### Sliding Window
Instead of resetting at fixed intervals, the window slides with each request. More fair, but harder to predict your remaining quota.
- Pro: Smoother rate enforcement
- Con: Harder to calculate remaining requests
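The "sliding log" variant is the easiest to sketch: keep a timestamp per recent request and drop the ones that have aged out. A minimal in-memory version (class name illustrative):

```javascript
// Sliding-window log: allow `limit` requests in any `windowMs` span.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // sorted, oldest first
  }

  allow(now = Date.now()) {
    // Evict timestamps that have slid out of the window.
    while (this.timestamps.length && this.timestamps[0] <= now - this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(now);
      return true;
    }
    return false;
  }
}
```

The memory cost (one timestamp per request) is why production systems often approximate this with a weighted pair of fixed windows instead.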
### Token Bucket
You start with a bucket of tokens. Each request consumes one token. Tokens refill at a steady rate. Allows short bursts while enforcing an average rate.
- Pro: Handles burst traffic gracefully
- Con: Can be confusing — "I had 100 tokens, now I have 37?"
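The mechanics fit in one small class. A sketch with a continuous refill (names and the injectable clock are illustrative):

```javascript
// Token bucket: `capacity` tokens max, refilled at `refillPerSec`.
class TokenBucket {
  constructor(capacity, refillPerSec) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full: bursts are allowed up front
    this.last = 0;
  }

  allow(now = Date.now()) {
    // Refill based on time elapsed since the last call, capped at capacity.
    const elapsedSec = (now - this.last) / 1000;
    this.last = now;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

The "I had 100 tokens, now I have 37?" confusion comes from the burst allowance: spending faster than the refill rate drains the bucket even though your long-run average is still within limits.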
### Leaky Bucket
Requests queue up and are processed at a constant rate. Excess requests overflow (get rejected). Used by APIs that need strict throughput control.
- Pro: Predictable processing rate
- Con: Adds latency during bursts
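The queue-based form appears later in this article as the request-queue pattern. A compact alternative is the "leaky bucket as a meter" variant, which tracks a water level instead of holding an actual queue. A minimal sketch (in-memory; names illustrative):

```javascript
// Leaky bucket as a meter: the level drains at `leakPerSec`;
// a request is rejected when adding it would overflow `capacity`.
class LeakyBucket {
  constructor(capacity, leakPerSec) {
    this.capacity = capacity;
    this.leakPerSec = leakPerSec;
    this.level = 0;
    this.last = 0;
  }

  allow(now = Date.now()) {
    // Drain the bucket for the time elapsed since the last call.
    const elapsedSec = (now - this.last) / 1000;
    this.last = now;
    this.level = Math.max(0, this.level - elapsedSec * this.leakPerSec);
    if (this.level + 1 <= this.capacity) {
      this.level += 1;
      return true;
    }
    return false;
  }
}
```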
## Reading Rate Limit Headers
Most APIs tell you their limits via HTTP headers. Here are the standard ones:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1709510460
Retry-After: 30
```
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Max requests allowed in the window |
| `X-RateLimit-Remaining` | Requests left in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
| `Retry-After` | Seconds to wait before retrying (on 429) |
Always parse these headers. Don't guess — let the API tell you exactly when you can send the next request.
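Parsing these is a few lines of defensive code. A sketch that works with anything exposing a `get(name)` method, such as the `response.headers` object returned by `fetch` (the function name and returned field names are illustrative):

```javascript
// Read the standard rate-limit headers into a plain object.
// Missing headers come back as null rather than NaN.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const v = headers.get(name);
    return v == null ? null : Number(v);
  };
  const retryAfter = num('Retry-After');
  return {
    limit: num('X-RateLimit-Limit'),          // max requests per window
    remaining: num('X-RateLimit-Remaining'),  // requests left in this window
    resetAt: num('X-RateLimit-Reset'),        // Unix timestamp (seconds)
    retryAfterMs: retryAfter == null ? null : retryAfter * 1000,
  };
}
```

Note that header names vary between providers (some use `RateLimit-*` without the `X-` prefix), so check the documentation of each API you consume.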
## Handling 429 Too Many Requests
When you hit a rate limit, the API returns HTTP 429. Here's how to handle it properly:
### Exponential Backoff with Jitter
The gold standard for retry logic. Wait progressively longer between retries, with random jitter to prevent thundering herd problems.
```javascript
async function fetchWithRetry(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;

    const retryAfter = response.headers.get('Retry-After');
    const baseDelay = retryAfter
      ? parseInt(retryAfter, 10) * 1000
      : Math.pow(2, attempt) * 1000;

    // Add jitter: total delay lands between baseDelay and 2 * baseDelay
    const jitter = Math.random() * baseDelay;
    const delay = baseDelay + jitter;

    console.log(`Rate limited. Retrying in ${Math.round(delay)}ms...`);
    await new Promise(resolve => setTimeout(resolve, delay));
  }
  throw new Error('Max retries exceeded');
}
```
### Key Rules for Retries

- Always respect `Retry-After` — If the API tells you when to retry, listen
- Add jitter — Without it, all your retried requests hit at the same time
- Set a max retry count — Don't retry forever
- Log rate limit events — If you're hitting limits regularly, you have a design problem
## Proactive Strategies
Don't wait for 429 errors. Design your application to stay within limits from the start.
### 1. Cache Aggressively
The fastest API call is the one you don't make. Cache responses at every layer:
```javascript
// Simple in-memory cache with TTL
const cache = new Map();

async function cachedFetch(url, ttlMs = 60000) {
  const cached = cache.get(url);
  if (cached && Date.now() - cached.time < ttlMs) {
    return cached.data;
  }
  const response = await fetch(url);
  const data = await response.json();
  cache.set(url, { data, time: Date.now() });
  return data;
}
```
### 2. Batch Requests
Many APIs offer batch endpoints. Use them instead of making individual calls.
```javascript
// Instead of this (10 API calls):
for (const id of userIds) {
  await fetch(`/api/users/${id}`);
}

// Do this (1 API call):
await fetch(`/api/users?ids=${userIds.join(',')}`);
```
### 3. Use Webhooks
Instead of polling an API every 30 seconds to check for changes, register a webhook and let the API notify you.
- Polling: 2,880 requests/day per resource
- Webhooks: 0 requests — the API pushes updates to you
### 4. Implement a Request Queue
For high-volume applications, queue outgoing API calls and process them at a controlled rate:
```javascript
class RateLimitedQueue {
  constructor(requestsPerSecond) {
    this.interval = 1000 / requestsPerSecond;
    this.queue = [];
    this.processing = false;
  }

  add(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing) return;
    this.processing = true;
    while (this.queue.length > 0) {
      const { fn, resolve, reject } = this.queue.shift();
      try {
        resolve(await fn());
      } catch (err) {
        reject(err);
      }
      await new Promise(r => setTimeout(r, this.interval));
    }
    this.processing = false;
  }
}

// Usage: max 10 requests per second
const queue = new RateLimitedQueue(10);
await queue.add(() => fetch('/api/data'));
```
### 5. Monitor Your Usage
Track your API consumption proactively. Set alerts at 80% of your limit so you can optimize before you hit errors.
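The 80% check itself is trivial once you parse the headers. A sketch (function name, threshold, and alerting hook are all illustrative; wire the `true` case into whatever monitoring you already use):

```javascript
// Returns true when consumed quota crosses `threshold` (0.8 = 80% used).
function shouldAlert(limit, remaining, threshold = 0.8) {
  const usedFraction = (limit - remaining) / limit;
  return usedFraction >= threshold;
}
```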
## Rate Limit Comparison Across Popular APIs
| API | Free Tier Limit | Paid Limit | Reset Window |
|---|---|---|---|
| GitHub | 60/hr (unauth), 5,000/hr (auth) | 15,000/hr | 1 hour |
| OpenAI | Varies by model | Varies by tier | 1 minute |
| Stripe | 100/sec (live), 25/sec (test) | Custom | Per second |
| Twitter/X | 1,500 tweets/month | 10,000/month | Monthly |
| Google Maps | 28,500/day | Pay-per-use | Daily |
## When You're the API Provider
If you're building an API, here's how to implement rate limiting well:
- Return clear headers — Always include `X-RateLimit-*` headers
- Use 429 status codes — Not 403 or 500
- Include `Retry-After` — Tell consumers exactly when to retry
- Offer tiered limits — Different plans should have different limits
- Document your limits — Don't make developers discover them by hitting them
## Conclusion
Rate limiting is a fact of life when working with APIs. The best developers don't fight it — they design around it with caching, batching, queuing, and smart retry logic.
Browse our API directory to compare rate limits across hundreds of APIs and find the ones that fit your usage patterns.