recall

← recall

rate limiting pattern

Limit how many requests a client (IP, user, API key) can make per time window. Drops or queues excess. Protects downstream services from greedy or buggy callers.

Limit how many requests a client (IP, user, API key) can make per time window. Drops or queues excess. Protects downstream services from greedy or buggy callers.

symptoms

  • one client overwhelming a service
  • noisy neighbor effects
  • abuse / scraping degrading the service for everyone

causes

  • no admission control at the API boundary
  • per-tier capacity not enforced

fixes

  • token bucket or leaky bucket per identity
  • 429 with Retry-After on rejection
  • tiered limits per plan / per IP / global

you might say

  • rate limit them
  • they're hitting the limit
  • throttled

related

aliases: throttling

topics: api-design, resilience

references: