A rule that limits how many requests a user can make in a given time, like a bouncer who only lets a certain number of people into the club per hour.
A server-side technique that restricts the number of requests a client can make within a time window to prevent abuse, protect resources, and ensure fair usage. Common algorithms include fixed window, sliding window, and token bucket.
Rate limiting algorithms trade off accuracy, memory, and fairness. Fixed window is simple but allows bursts at window boundaries: a client can spend a full quota at the end of one window and another at the start of the next, briefly reaching up to twice the limit. Sliding window log is precise but memory-intensive, since it stores a timestamp per request. Token bucket allows controlled bursts up to the bucket capacity, with a steady refill rate. Distributed rate limiting requires a shared store (for example Redis INCR with a TTL, or Lua scripts for atomicity). Response headers (RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, per the IETF draft) communicate quota state to clients.
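As an illustration of the token bucket behaviour described above, here is a minimal single-process sketch in Python. The class name `TokenBucket`, its parameters (`capacity`, `refill_rate`, `clock`), and the `allow` method are hypothetical names chosen for this example, not taken from any particular library. It is neither thread-safe nor distributed; a shared store would be needed for that.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: holds up to `capacity` tokens,
    refilled continuously at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock            # injectable so the bucket can be tested
        self.tokens = capacity        # start full, so an initial burst is allowed
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=3` and `refill_rate=1.0`, a client could send three requests back to back and would then be held to roughly one request per second. The capacity bounds the burst size, while the refill rate sets the sustained limit.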