Rate Limiting

backend · Phase 5 · also: API rate limit, throttling

Beginner

Plain English

A rule that limits how many requests a user can make in a given period of time, like a bouncer who only lets a certain number of people into the club per hour.

Intermediate

Technical

A server-side technique that restricts the number of requests a client can make within a time window to prevent abuse, protect resources, and ensure fair usage. Common algorithms include fixed window, sliding window, and token bucket.
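As an illustration of the simplest of these algorithms, here is a minimal in-memory sketch of a fixed window counter. The class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are assumptions made for this example. It is single-process and not thread-safe, so it is not a production implementation.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.state = {}     # client_id -> (window_index, request_count)

    def allow(self, client_id):
        # Every timestamp maps to exactly one window index.
        window_index = int(self.clock() // self.window_seconds)
        stored_index, count = self.state.get(client_id, (window_index, 0))
        if stored_index != window_index:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            return False  # over quota; a server would typically answer 429
        self.state[client_id] = (window_index, count + 1)
        return True
```

A caller would create one limiter, for example `FixedWindowLimiter(limit=100, window_seconds=60)`, and call `allow(client_id)` on each incoming request, rejecting the request when it returns `False`.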

Advanced

Spec-level

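Rate limiting algorithms trade off accuracy, memory, and fairness. Fixed window is simple but lets a client send up to twice the limit in a short burst that straddles a window boundary. Sliding window log is precise but memory-intensive, because it stores a timestamp for every request. Token bucket allows controlled bursts up to the bucket capacity while enforcing a steady refill rate. Distributed rate limiting requires a shared store so that all server instances see the same counters (for example Redis INCR with a TTL on the key, or a Lua script that makes the check and the update atomic). Response headers (RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, per the IETF draft) communicate quota state to clients.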
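The token bucket behaviour described above can be sketched as follows. This is a minimal single-process example under stated assumptions: the names `TokenBucket`, `capacity`, `refill_rate`, and `cost` are chosen for illustration, and a distributed deployment would keep the token count and timestamp in a shared store instead of in memory.

```python
import time


class TokenBucket:
    """Permit bursts of up to `capacity` requests, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)   # the bucket starts full
        self.last_refill = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill lazily on each call, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=3` and `refill_rate=1.0`, a client can send three requests back to back, after which it is limited to roughly one request per second until the bucket has had time to refill.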
