recall

← recall

graceful shutdown pattern

On SIGTERM, stop accepting new requests, finish in-flight work, close connections cleanly, then exit. Subtle — getting it right requires LB health-check coordination so traffic stops arriving, request-deadline awareness so you don't hold up the deploy, and process-supervisor patience (k8s terminationGracePeriodSeconds, etc.).

On SIGTERM, stop accepting new requests, finish in-flight work, close connections cleanly, then exit. Subtle — getting it right requires LB health-check coordination so traffic stops arriving, request-deadline awareness so you don't hold up the deploy, and process-supervisor patience (k8s terminationGracePeriodSeconds, etc.).

symptoms

  • request errors spike on every deploy
  • connections reset mid-request
  • in-flight work lost on rollout

causes

  • SIGTERM kills the process immediately
  • LB still routes to dying instance
  • no preStop drain

fixes

  • preStop / pre-shutdown hook for drain
  • fail health check first to deregister from LB
  • wait until in-flight requests finish (with cap)
  • tune terminationGracePeriodSeconds

you might say

  • drain
  • graceful shutdown
  • preStop hook

related

topics: operations, deployment

references: