Rate Limiting & Quotas
API Codex implements rate limiting to ensure fair usage and maintain service quality for all users. This guide explains how rate limiting works and how to handle it effectively.
Overview
Rate limiting protects our APIs from abuse and ensures reliable performance for all users. Each subscription tier has different limits based on:
- Requests per second (RPS)
- Requests per month
- Concurrent connections
- Payload size limits
Subscription Tiers
Rate Limits by Plan
| Plan | Requests/Month | Requests/Second | Concurrent Connections | Support |
|---|---|---|---|---|
| Basic (Free) | 100-1,000 | 1 RPS | 2 | Community |
| Pro | 10,000 | 10 RPS | 10 | |
| Ultra | 100,000 | 50 RPS | 50 | Priority |
| Mega | 1,000,000+ | 100+ RPS | Unlimited | Dedicated |
Note: Exact limits vary by specific API. Check each API's pricing page on RapidAPI for details.
Rate Limit Headers
All API responses include headers that inform you about your current rate limit status:
Standard Headers
| Header | Description | Example |
|---|---|---|
| x-ratelimit-requests-limit | Total requests allowed in current window | 1000 |
| x-ratelimit-requests-remaining | Requests remaining in current window | 847 |
| x-ratelimit-requests-reset | Unix timestamp when limit resets | 1640995200 |
Reading Rate Limit Headers
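A minimal sketch of reading these headers from a fetch response (the endpoint URL and API key shown in the usage comment are placeholders, not real values):

```javascript
// Parse the standard rate-limit headers from any object with a .get() method,
// such as the Headers of a fetch Response.
function parseRateLimitHeaders(headers) {
  return {
    limit: Number(headers.get("x-ratelimit-requests-limit")),
    remaining: Number(headers.get("x-ratelimit-requests-remaining")),
    // The header value is a Unix timestamp in seconds; convert it to a Date
    resetAt: new Date(Number(headers.get("x-ratelimit-requests-reset")) * 1000),
  };
}

// Usage (URL and key are placeholders — substitute your own):
// const res = await fetch("https://your-api.p.rapidapi.com/endpoint", {
//   headers: { "x-rapidapi-key": "YOUR_KEY" },
// });
// const { limit, remaining, resetAt } = parseRateLimitHeaders(res.headers);
```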
Handling Rate Limits
429 Too Many Requests
When you exceed your rate limit, you'll receive a 429 status code:
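The exact response body varies by API, but a 429 typically looks something like this (an illustrative example, not a guaranteed format — the remaining count drops to zero and the reset timestamp tells you when to try again):

```http
HTTP/1.1 429 Too Many Requests
x-ratelimit-requests-remaining: 0
x-ratelimit-requests-reset: 1640995200

{
  "message": "Too many requests"
}
```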
Implementing Retry Logic
Exponential Backoff
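One way to implement exponential backoff (the `maxRetries` and `baseDelayMs` defaults here are illustrative, not values mandated by any API):

```javascript
// Retry a request with exponentially growing delays whenever it is rate limited.
async function withBackoff(makeRequest, maxRetries = 5, baseDelayMs = 1000) {
  for (let attempt = 0; ; attempt++) {
    const response = await makeRequest();
    if (response.status !== 429) return response;
    if (attempt >= maxRetries) throw new Error("Rate limited: retries exhausted");
    // Delay doubles each attempt (1s, 2s, 4s, ...) plus random jitter so that
    // many clients do not retry in lockstep
    const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 250;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```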
Request Queue Management
For bulk operations, implement a request queue that:
- Tracks requests per time window
- Waits when limit is reached
- Processes requests in order
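A sketch of such a queue, assuming a simple sliding-window limit (the limit and window values you pass in should match your plan; nothing here is specific to one API):

```javascript
// Allows at most `limit` requests per `windowMs`, running queued tasks in
// FIFO order by chaining them on a single promise.
class RequestQueue {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // start times of recent requests
    this.chain = Promise.resolve(); // serializes tasks in order
  }
  run(task) {
    const result = this.chain.then(async () => {
      const now = Date.now();
      // Forget requests that have fallen out of the window
      this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
      if (this.timestamps.length >= this.limit) {
        // Wait until the oldest in-window request expires
        const waitMs = this.windowMs - (now - this.timestamps[0]);
        await new Promise((resolve) => setTimeout(resolve, waitMs));
      }
      this.timestamps.push(Date.now());
      return task();
    });
    // Keep the chain alive even if a task rejects
    this.chain = result.catch(() => {});
    return result;
  }
}
```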
Rate Limiting Strategies
1. Client-Side Throttling
Track requests in a sliding window and wait if limit is reached before making new requests.
2. Adaptive Rate Limiting
Adjust request rate based on remaining quota:
- Over 80% used: slow down significantly
- Under 20% used: can increase rate
- 0 remaining: wait until reset
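The thresholds above can be turned into a per-request delay. The 80%/20% cutoffs follow the guidance here; the delay values themselves are illustrative and worth tuning:

```javascript
// Map quota usage to a delay (in ms) before the next request.
// `resetAt` is the Unix timestamp (seconds) from x-ratelimit-requests-reset.
function adaptiveDelayMs(remaining, limit, resetAt, now = Date.now()) {
  if (remaining === 0) return Math.max(0, resetAt * 1000 - now); // wait for reset
  const usedFraction = 1 - remaining / limit;
  if (usedFraction > 0.8) return 2000; // heavy use: slow down significantly
  if (usedFraction < 0.2) return 100;  // light use: can increase rate
  return 500;                          // moderate default
}
```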
3. Circuit Breaker Pattern
After multiple rate limit errors:
- CLOSED → Normal operation
- OPEN → Reject requests immediately, wait for timeout
- HALF-OPEN → Test with one request, recover or stay open
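The three states above can be sketched as a small class (the failure threshold and reset timeout are illustrative defaults, not prescribed values):

```javascript
// Circuit breaker for repeated rate-limit (or other) errors.
class CircuitBreaker {
  constructor(failureThreshold = 3, resetTimeoutMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.state = "CLOSED";
    this.failures = 0;
    this.openedAt = 0;
  }
  async call(request) {
    if (this.state === "OPEN") {
      if (Date.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error("Circuit open: request rejected");
      }
      this.state = "HALF-OPEN"; // timeout elapsed: allow one test request
    }
    try {
      const result = await request();
      this.state = "CLOSED"; // success closes the circuit
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === "HALF-OPEN" || this.failures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}
```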
Optimization Techniques
1. Request Batching
Combine multiple operations where possible. Check if the API supports batch endpoints.
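If the API you are calling does offer a batch endpoint, a small helper keeps batch sizes under its limit (the batch endpoint and the size of 100 are hypothetical; check the specific API's docs):

```javascript
// Split a list of items into batches of at most `size` elements.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// e.g. POST each batch of 100 emails in one request instead of 100 requests:
// for (const batch of chunk(emails, 100)) { /* one API call per batch */ }
```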
2. Response Caching
Cache responses with appropriate TTLs to reduce API calls:
- DNS lookups: 24 hours (respect TTL from response)
- Email validation: 1 hour
- Text analysis: 30 minutes
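A minimal in-memory TTL cache covering the pattern above; the TTLs listed are guidelines, so tune them to how fresh each kind of result must be:

```javascript
// In-memory cache where each entry expires after its own TTL.
class TtlCache {
  constructor() {
    this.entries = new Map();
  }
  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlMs });
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() >= entry.expiresAt) {
      this.entries.delete(key); // stale: evict and miss
      return undefined;
    }
    return entry.value;
  }
}

// Usage: check the cache before calling the API, store the response on a miss.
```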
3. Parallel Processing with Limits
Process multiple requests in parallel, but limit concurrency to avoid overwhelming the API. Use Promise.race() to maintain a pool of active requests.
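One way to build such a pool with Promise.race() (the concurrency limit you pick should stay within your plan's concurrent-connection allowance):

```javascript
// Run an async worker over items with at most `concurrency` tasks in flight,
// using Promise.race to wait for a free slot. Results keep input order.
async function mapWithConcurrency(items, worker, concurrency) {
  const results = new Array(items.length);
  const inFlight = new Set();
  for (let i = 0; i < items.length; i++) {
    const p = Promise.resolve(worker(items[i], i)).then((r) => {
      results[i] = r;
      inFlight.delete(p); // free the slot when this task settles
    });
    inFlight.add(p);
    if (inFlight.size >= concurrency) await Promise.race(inFlight);
  }
  await Promise.all(inFlight); // drain the remaining tasks
  return results;
}
```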
Monitoring & Alerts
Set up monitoring to track rate limit usage:
| Alert Level | Condition | Action |
|---|---|---|
| Warning | > 80% usage | Review request patterns |
| Critical | 0 remaining | Reduce request rate, investigate |
Track metrics like average usage, peak usage, and requests per minute to identify patterns and optimize your usage.
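The alert thresholds in the table reduce to a small classifier you can run on every response's headers:

```javascript
// Classify current quota usage into the alert levels from the table above.
function alertLevel(remaining, limit) {
  if (remaining === 0) return "critical"; // reduce request rate, investigate
  if ((limit - remaining) / limit > 0.8) return "warning"; // review patterns
  return "ok";
}
```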
Best Practices
Do's ✅
- Always check rate limit headers in responses
- Implement exponential backoff for retries
- Cache responses when appropriate
- Use request queuing for bulk operations
- Monitor your usage proactively
- Implement circuit breakers for resilience
- Batch requests when possible
Don'ts ❌
- Don't ignore 429 responses - Always handle them
- Don't retry immediately - Use backoff strategies
- Don't hammer the API - Respect rate limits
- Don't hardcode delays - Use adaptive timing
- Don't waste quota - Cache when possible
Upgrading Your Plan
If you consistently hit rate limits, consider upgrading:
- Monitor your usage patterns
- Calculate required capacity
- Visit RapidAPI
- Select appropriate plan
- Upgrade seamlessly without code changes
Next Steps
- Learn about Error Handling for robust applications
- Review Best Practices for production deployments
- Explore Authentication for secure API access
- Browse our API Catalog to start building