recall

← recall

admission control pattern

At the entry point of a service, decide whether to accept a request based on current load, identity, or priority — before processing it. Better to reject 5% of requests early than fail 100% of them slowly under overload. The foundation that rate limiting, load shedding, and auth all sit on.

At the entry point of a service, decide whether to accept a request based on current load, identity, or priority — before processing it. Better to reject 5% of requests early than fail 100% of them slowly under overload. The foundation that rate limiting, load shedding, and auth all sit on.

symptoms

  • service crashes under load it could have rejected
  • tail latency on rejects rivals successes
  • cost paid for requests that don't complete

causes

  • no edge layer
  • health based on success only, not capacity

fixes

  • concurrency cap with reject-when-full
  • rate limiting per identity
  • priority lanes (premium / free) at admission

you might say

  • admission control
  • gate at the edge

related

topics: resilience, scaling

references: