load shedding pattern
When the system can't handle all incoming load, drop or reject some requests early instead of attempting every request and failing slowly. Better to serve 80% of users well than 100% of users badly.
symptoms
- latency climbing while the system tries to serve everyone
- queues growing faster than they drain
- thread pools saturated
causes
- unexpected traffic spike
- downstream slowness amplifying upstream load
- no admission control at the edge
fixes
- admission control at the edge (rate limiter, queue length cap)
- priority lanes — drop low-priority work first
- circuit breakers on calls to downstream dependencies
- explicit error responses (429 Too Many Requests, 503 Service Unavailable) instead of letting requests time out
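A minimal sketch of the first, second, and fourth fixes together, assuming a single-process service. The names LoadShedder, try_admit, release, and handle are hypothetical and not taken from any library, and the limits are arbitrary. It caps in-flight work, rejects low-priority requests at a lower limit than high-priority ones, and answers with an immediate 503 instead of queueing.

```python
import threading


class LoadShedder:
    """Admission control: cap in-flight work and shed low-priority work first."""

    def __init__(self, max_in_flight: int, low_priority_limit: int):
        # Illustrative limits; real values would come from load testing.
        self.max_in_flight = max_in_flight
        # Low-priority requests are rejected once this many slots are in use.
        self.low_priority_limit = low_priority_limit
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_admit(self, high_priority: bool) -> bool:
        """Take a slot if one is available for this priority; never block."""
        with self._lock:
            limit = self.max_in_flight if high_priority else self.low_priority_limit
            if self._in_flight >= limit:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        """Give a slot back once the request has finished."""
        with self._lock:
            self._in_flight = max(0, self._in_flight - 1)


def handle(shedder: LoadShedder, high_priority: bool, work):
    """Return (status, body); reject right away with 503 when over capacity."""
    if not shedder.try_admit(high_priority):
        return 503, "overloaded, retry later"
    try:
        return 200, work()
    finally:
        shedder.release()
```

With max_in_flight=3 and low_priority_limit=2, a third concurrent low-priority request is rejected while a high-priority one is still admitted. The rejection costs almost nothing, which is the point: the caller gets a fast, explicit answer rather than a slow timeout.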
you might say
- shed load
- we're shedding traffic
- drop the low-priority work