heartbeat pattern
Each node periodically sends a small message to a coordinator (or to its peers) saying 'still alive.' Missed heartbeats trigger failure detection and, when the silent node was the leader, re-election. Trade-off: too short an interval or timeout and the detector flaps on transient delays; too long and dead nodes go unnoticed.
symptoms
- need to detect node failures
- leadership leases need liveness signal
causes
- no out-of-band liveness mechanism
- TCP keepalive too coarse
fixes
- heartbeat every N ms
- failure threshold = K missed heartbeats
- phi-accrual for adaptive thresholds
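The first two fixes above can be sketched as a fixed-threshold detector. This is a minimal illustration, not a reference implementation: the class name HeartbeatMonitor, its parameters (interval_s, max_missed) and the injectable clock are all hypothetical names chosen for the example, and transport of the heartbeat messages is left out.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Fixed-threshold detector: suspect a node after K missed intervals.

    All names here are illustrative assumptions, not an existing API.
    """

    def __init__(self, interval_s: float, max_missed: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_s = interval_s    # expected heartbeat period (the "N")
        self.max_missed = max_missed    # missed heartbeats tolerated (the "K")
        self.clock = clock              # injectable so tests can fake time
        self.last_seen: Dict[str, float] = {}

    def record_heartbeat(self, node_id: str) -> None:
        # Called whenever a 'still alive' message arrives from node_id.
        self.last_seen[node_id] = self.clock()

    def suspected(self) -> List[str]:
        # A node is suspected once it has been silent for more than K intervals.
        deadline = self.interval_s * self.max_missed
        now = self.clock()
        return sorted(n for n, t in self.last_seen.items() if now - t > deadline)
```

A monotonic clock is used by default so that wall-clock adjustments do not cause false suspicions. Raising max_missed reduces flapping at the cost of slower detection, which is the trade-off described at the top.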
you might say
- heartbeat interval
- phi-accrual
- no heartbeat in N seconds
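
For the phi-accrual term, a rough sketch may help show what "adaptive" means: instead of a fixed timeout, the detector outputs a suspicion level phi that grows the longer a heartbeat is overdue relative to recently observed inter-arrival times. The version below is a simplification that assumes exponentially distributed intervals, so phi = elapsed / (mean * ln 10); the original phi-accrual formulation fits a normal distribution instead. The class name PhiAccrualDetector and the window parameter are hypothetical.

```python
import math
from collections import deque
from typing import Deque, Optional


class PhiAccrualDetector:
    """Simplified phi-accrual sketch (exponential-interval assumption)."""

    def __init__(self, window: int = 100) -> None:
        # Sliding window of recent inter-arrival times, in seconds.
        self.intervals: Deque[float] = deque(maxlen=window)
        self.last_arrival: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        # Record the gap since the previous heartbeat, then update the marker.
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now: float) -> float:
        # With too little history there is no basis for suspicion yet.
        if self.last_arrival is None or not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        # P(next heartbeat arrives later than elapsed) = exp(-elapsed / mean);
        # phi = -log10 of that probability.
        return elapsed / (mean * math.log(10))
```

A caller would compare phi against a chosen threshold; under this model, phi = 1 corresponds to roughly a 10% chance that the heartbeat is merely late, and phi = 2 to roughly 1%. Because the mean tracks observed intervals, the effective timeout stretches on slow networks and tightens on fast ones.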