load shedding pattern

When the system can't handle all incoming load, reject or drop some requests early instead of attempting every request and failing slowly. It is better to serve 80% of users well than 100% of users badly.

symptoms

  • latency climbing while the system tries to serve everyone
  • queues growing faster than they drain
  • thread pools saturated

causes

  • unexpected traffic spike
  • downstream slowness holding requests open longer, which raises concurrent load upstream
  • no admission control at the edge

fixes

  • admission control at the edge (rate limiter, queue length cap)
  • priority lanes — drop low-priority work first
  • circuit breakers downstream
  • explicit error responses (429/503) instead of timeouts
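
A minimal sketch of how the first, second, and fourth fixes can fit together, assuming a simple cap on in-flight requests. The names `LoadShedder`, `try_admit`, `release`, and `handle`, and the limit values, are hypothetical and chosen only for illustration; a real service would more likely use its framework's or proxy's admission controls.

```python
import threading


class LoadShedder:
    """Admission control by in-flight request count, with two priority lanes.

    Hypothetical illustration: low-priority work is rejected at a lower
    threshold than high-priority work, so it is shed first.
    """

    def __init__(self, max_in_flight: int, low_priority_limit: int) -> None:
        if not 0 <= low_priority_limit <= max_in_flight:
            raise ValueError("low_priority_limit must be between 0 and max_in_flight")
        self.max_in_flight = max_in_flight            # hard cap for all work
        self.low_priority_limit = low_priority_limit  # lower cap for low-priority work
        self.in_flight = 0
        self._lock = threading.Lock()

    def try_admit(self, high_priority: bool) -> bool:
        """Reserve a slot and return True, or return False immediately if full."""
        limit = self.max_in_flight if high_priority else self.low_priority_limit
        with self._lock:
            if self.in_flight >= limit:
                return False
            self.in_flight += 1
            return True

    def release(self) -> None:
        """Free a slot once an admitted request has finished."""
        with self._lock:
            self.in_flight = max(0, self.in_flight - 1)


def handle(shedder: LoadShedder, high_priority: bool, work):
    """Run work() if admitted; otherwise reject explicitly instead of queueing.

    Returns a (status, body) pair. 503 signals server overload; 429 would be
    the usual choice when a specific client exceeds its own rate limit.
    """
    if not shedder.try_admit(high_priority):
        return 503, "overloaded, retry later"
    try:
        return 200, work()
    finally:
        shedder.release()
```

With `LoadShedder(max_in_flight=5, low_priority_limit=4)`, low-priority requests start being rejected once four requests are in flight, while one slot stays reserved for high-priority work. Rejection happens before any work starts, so a shed request costs almost nothing and the caller gets a fast, explicit answer rather than a timeout.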

you might say

  • shed load
  • we're shedding traffic
  • drop the low-priority work

related

aliases: drop traffic, throttle

topics: resilience, scaling