request amplification pattern
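One incoming request fans out into many internal calls. The user makes 1 request; your service makes 50. Internal load scales with the fan-out factor, so every extra external request adds 50 internal calls. The request is also only as fast as its slowest call, so dependency tail latency dominates p99.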
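The p99 point can be checked with a little arithmetic. This is a minimal sketch, assuming the request waits for every call and that calls are slow independently of each other (real dependencies are often correlated). The function name and the 1% default are illustrative, not from any library.

```python
def p_slow_request(fan_out: int, p_slow_call: float = 0.01) -> float:
    """Chance that at least one of `fan_out` calls is slow.

    Assumes independent calls and a request that waits for all of them.
    p_slow_call=0.01 means "slower than that dependency's own p99".
    """
    if fan_out < 0:
        raise ValueError("fan_out must be non-negative")
    return 1.0 - (1.0 - p_slow_call) ** fan_out


# With 50 calls, roughly 4 in 10 requests hit at least one p99-slow call.
fan_out_50 = p_slow_request(50)
```

Under these assumptions, a dependency's rare slow call becomes a common case for the fanned-out endpoint.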
symptoms
- internal RPS far higher than external RPS
- endpoint p99 worse than the p99 of its slowest dependency
- small traffic spikes cause large internal spikes
- dashboards for downstream dependencies light up before the edge service's
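The first and third symptoms reduce to one ratio. A rough sketch of how it might be computed from two counters over the same time window; the function names and the sample numbers are hypothetical.

```python
def amplification_factor(internal_calls: int, external_requests: int) -> float:
    """Internal calls made per external request over the same window."""
    if external_requests <= 0:
        raise ValueError("need at least one external request")
    return internal_calls / external_requests


def internal_delta(external_delta_rps: float, factor: float) -> float:
    """Extra internal RPS caused by an external spike, at a fixed factor."""
    return external_delta_rps * factor


factor = amplification_factor(internal_calls=5000, external_requests=100)
spike = internal_delta(external_delta_rps=20, factor=factor)
```

With these made-up numbers, 20 extra external requests per second show up as 1000 extra internal calls per second.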
causes
- chatty service decomposition
- N+1 queries
- normalized data requiring multiple lookups per read
- unbatched parallel calls
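The N+1 cause is easiest to see by counting queries. A small sketch using an in-memory SQLite database; the schema, the `CountingConn` wrapper, and the function names are all made up for illustration.

```python
import sqlite3


class CountingConn:
    """Thin wrapper that counts how many statements are executed."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.count = 0

    def execute(self, sql: str, params: tuple = ()):
        self.count += 1
        return self.conn.execute(sql, params)


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'ada'), (2, 'lin');
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
    """)
    return conn


def names_n_plus_one(db: CountingConn) -> dict:
    # 1 query for the list, then 1 more query per row: N+1 in total.
    orders = db.execute("SELECT id, customer_id FROM orders ORDER BY id").fetchall()
    result = {}
    for order_id, customer_id in orders:
        row = db.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        result[order_id] = row[0]
    return result


def names_joined(db: CountingConn) -> dict:
    # Same answer in a single query.
    rows = db.execute(
        "SELECT o.id, c.name FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    ).fetchall()
    return dict(rows)
```

With 3 orders the first version runs 4 statements and the second runs 1. The gap grows linearly with the number of rows.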
fixes
- batch / coalesce calls
- cache fan-out responses
- denormalize at write time
- service-level SLOs that surface amplification
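The first two fixes can be combined in one small helper. This is a sketch only: `BatchingCache` and `fetch_many` are hypothetical names, the backend is assumed to offer a multi-key lookup, and expiry and invalidation are left out.

```python
from typing import Callable, Dict, Iterable, List


class BatchingCache:
    """Dedupes keys, serves hits from memory, fetches misses in one call."""

    def __init__(self, fetch_many: Callable[[List[str]], Dict[str, str]]):
        self._fetch_many = fetch_many  # assumed multi-key backend lookup
        self._cache: Dict[str, str] = {}
        self.backend_calls = 0

    def get_many(self, keys: Iterable[str]) -> Dict[str, str]:
        wanted = list(dict.fromkeys(keys))  # drop duplicates, keep order
        missing = [k for k in wanted if k not in self._cache]
        if missing:
            # One backend call for all misses instead of one call per key.
            self.backend_calls += 1
            self._cache.update(self._fetch_many(missing))
        # Keys the backend did not return are simply absent from the result.
        return {k: self._cache[k] for k in wanted if k in self._cache}
```

A caller asking for `["a", "b", "a"]` triggers one backend call for `["a", "b"]`. A later request for `["a", "c"]` fetches only `"c"`.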
you might say
- this endpoint fans out to like 30 services
- we have a really high amplification factor
- one request lights up the whole graph