
Rate Limiting

Rate limiting is a technique that restricts the number of API requests a client can make within a specified time period, protecting services from abuse, brute-force attacks, and resource exhaustion.

As a fundamental API security control, rate limiting prevents any individual client from overwhelming a service with excessive requests. The service counts requests per client (identified by API key, IP address, user ID, or another identifier) and rejects those that exceed the defined threshold, typically with an HTTP 429 (Too Many Requests) response.
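The track-and-reject behaviour described above can be illustrated with a minimal fixed window counter. This is a sketch only, not a production implementation: the class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are assumptions made for the example, and state is kept in process memory rather than in a shared store.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window.

    Hypothetical example class; names and parameters are illustrative.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._state = {}    # client_id -> (window_index, request_count)

    def check(self, client_id):
        """Return an HTTP-style status: 200 if allowed, 429 if over the limit."""
        window_index = int(self.clock() // self.window_seconds)
        stored_index, count = self._state.get(client_id, (window_index, 0))
        if stored_index != window_index:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            self._state[client_id] = (window_index, count)
            return 429  # Too Many Requests
        self._state[client_id] = (window_index, count + 1)
        return 200
```

The `client_id` can be whichever identifier the service keys on (API key, IP address, or user ID). Each client has its own counter, so one client reaching the limit does not affect the others.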

Common rate limiting algorithms include the fixed window counter, sliding window log, sliding window counter, token bucket, and leaky bucket. Each algorithm offers different trade-offs between accuracy, memory usage, and implementation complexity. The token bucket algorithm is widely used because it allows controlled bursts while maintaining an average rate limit.
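To make the burst behaviour of the token bucket concrete, here is a minimal single-process sketch. The class name `TokenBucket` and the parameters `capacity`, `refill_rate`, and `clock` are assumptions chosen for illustration. A real deployment would typically also need locking and shared storage.

```python
import time


class TokenBucket:
    """Token bucket sketch: bursts up to `capacity`, average rate `refill_rate`.

    Hypothetical example class; names and parameters are illustrative.
    """

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)        # maximum burst size
        self.refill_rate = float(refill_rate)  # tokens added per second
        self.clock = clock                     # injectable for testing
        self.tokens = float(capacity)          # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1.0):
        """Consume `cost` tokens if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=3` and `refill_rate=1`, a client can send three requests back to back, after which it is limited to roughly one request per second until the bucket refills.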

From a security standpoint, rate limiting defends against brute-force credential attacks, credential stuffing, API scraping, and denial-of-service attempts. Limits should be applied at multiple layers: per-user, per-IP, per-endpoint, and globally. Responses should include rate limit headers so that legitimate clients can manage their request patterns: the widely used X-RateLimit-Limit and X-RateLimit-Remaining headers (a common convention rather than a formal standard) and the standard HTTP Retry-After header. Organizations should also consider graduated rate limiting that becomes stricter after repeated violations, and should ensure that limits cannot be bypassed by rotating IP addresses or API keys.
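The headers and the graduated approach mentioned above might be sketched as follows. Both helper functions, `rate_limit_headers` and `graduated_limit`, are hypothetical names introduced for this example, and halving the limit per violation is only one possible policy, assumed here for illustration.

```python
import math


def rate_limit_headers(limit, remaining, retry_after_seconds=None):
    """Build rate limit response headers as a plain dict (hypothetical helper)."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
    }
    if retry_after_seconds is not None:
        # Round up to whole seconds so clients do not retry too early.
        headers["Retry-After"] = str(math.ceil(retry_after_seconds))
    return headers


def graduated_limit(base_limit, violations, floor=1):
    """Halve the allowed limit for each prior violation, never below `floor`.

    Assumed policy for illustration; real systems may use other schedules.
    """
    return max(floor, base_limit // (2 ** violations))
```

A rejected request would then carry a 429 status together with `rate_limit_headers(limit, 0, retry_after_seconds)`, while a client with a history of violations would be evaluated against `graduated_limit(base_limit, violations)` instead of the base limit.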

Related Vulnerabilities

Broken Authentication (high)

Tags: throttling, api, denial-of-service, security
