admission control pattern
At the entry point of a service, decide whether to accept each request based on current load, caller identity, or priority, before doing any work on it. Rejecting 5% of requests early is better than failing 100% of them slowly under overload. Rate limiting, load shedding, and auth all build on this gate.
symptoms
- service crashes under load it could have rejected
- rejected requests take about as long as successful ones (reject tail latency rivals success tail latency)
- compute and downstream cost spent on requests that never complete
causes
- no admission layer at the edge, so every request reaches the handler
- health checks report success only, not remaining capacity
fixes
- concurrency cap with reject-when-full
- rate limiting per identity
- priority lanes (premium / free) at admission
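The three fixes above can be combined in one gate. What follows is a minimal sketch, not a definitive implementation: `AdmissionController`, `TokenBucket`, `try_admit`, `release`, and every parameter name are hypothetical, and the token bucket is just one common way to do per-identity rate limiting. It assumes a single process with threaded handlers and two lanes (premium and free), with a few concurrency slots held back for premium traffic.

```python
import threading
import time


class TokenBucket:
    """Hypothetical per-identity limiter: `rate` tokens/second, at most `burst` stored."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.updated = now

    def take(self, now):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class AdmissionController:
    """Hypothetical gate: concurrency cap, per-identity rate limit, priority lanes."""

    def __init__(self, max_in_flight, reserved_for_premium, rate, burst,
                 clock=time.monotonic):
        self.max_in_flight = max_in_flight
        self.reserved = reserved_for_premium  # slots free traffic may not use
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.in_flight = 0
        self.buckets = {}  # identity -> TokenBucket (unbounded here; evict in real use)
        self.lock = threading.Lock()

    def try_admit(self, identity, premium=False):
        """Return (admitted, reason). Never blocks: a full gate rejects immediately."""
        with self.lock:
            # Priority lane: free requests see a lower cap than premium ones.
            limit = self.max_in_flight if premium else self.max_in_flight - self.reserved
            if self.in_flight >= limit:
                return False, "over_capacity"
            # Rate limit per identity, checked only once capacity is available
            # so that capacity rejects do not consume the caller's tokens.
            now = self.clock()
            bucket = self.buckets.get(identity)
            if bucket is None:
                bucket = self.buckets[identity] = TokenBucket(self.rate, self.burst, now)
            if not bucket.take(now):
                return False, "rate_limited"
            self.in_flight += 1
            return True, "ok"

    def release(self):
        """Call exactly once for every admitted request, when it finishes."""
        with self.lock:
            self.in_flight -= 1


if __name__ == "__main__":
    gate = AdmissionController(max_in_flight=2, reserved_for_premium=1, rate=1.0, burst=2)
    print(gate.try_admit("free-user"))                  # takes the only free-lane slot
    print(gate.try_admit("other-free-user"))            # free lane is full
    print(gate.try_admit("paying-user", premium=True))  # uses the reserved slot
```

A handler would call `try_admit` before doing any work, return a cheap rejection (for example HTTP 429 or 503) when it fails, and call `release` in a `finally` block when it succeeds. Because the check only takes a lock and touches a counter, a reject costs far less than a processed request.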
you might say
- admission control
- gate at the edge