gray failure term
partial/intermittent failure that looks different from different observers
The classic pattern: monitoring says everything's fine, but customers see errors. Different observers (LB, app, dependency) disagree about whether a node is healthy. Hardest failure mode to debug.