multus-cni
Thick plugin graceful termination
This PR introduces graceful shutdown functionality to the Multus daemon by adding a /readyz endpoint alongside the existing /healthz. The /readyz endpoint starts returning 500 once a SIGTERM is received, indicating the daemon is in shutdown mode. During this time, CNI requests can still be processed for a short window. The daemonset configs have been updated to increase terminationGracePeriodSeconds from 10 to 30 seconds, ensuring we have a bit more time for these clean shutdowns.
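For reviewers who want a picture of the /readyz behavior, here is a minimal sketch of the idea, not the code in this PR. The `readiness` type, its methods, and the simulated signal are hypothetical names chosen for illustration; only the 200-to-500 switch on SIGTERM comes from the description above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// readiness tracks whether the daemon has begun shutting down.
// The type and its methods are hypothetical names for this sketch.
type readiness struct {
	shuttingDown atomic.Bool
}

// statusCode returns the HTTP status that /readyz should report.
func (r *readiness) statusCode() int {
	if r.shuttingDown.Load() {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// ServeHTTP lets readiness be registered directly as the /readyz handler.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(r.statusCode())
}

// markOnSignal blocks until a signal arrives, then flips the flag.
func (r *readiness) markOnSignal(sigs <-chan os.Signal) {
	<-sigs
	r.shuttingDown.Store(true)
}

func main() {
	ready := &readiness{}
	mux := http.NewServeMux()
	mux.Handle("/readyz", ready)

	// Intercept SIGTERM so the process does not exit immediately.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)

	// probe issues an in-process GET /readyz and returns the status code.
	probe := func() int {
		rec := httptest.NewRecorder()
		mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
		return rec.Code
	}

	fmt.Println("before SIGTERM:", probe())
	sigs <- syscall.SIGTERM // simulate the signal instead of killing the process
	ready.markOnSignal(sigs)
	fmt.Println("after SIGTERM:", probe())
}
```

In a real daemon, `markOnSignal` would run in its own goroutine while the server keeps handling requests. The sketch feeds the channel by hand so it can run to completion.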
This addresses a race condition during pod transitions where the readiness check might return true, but a subsequent CNI request could fail if the daemon shuts down too quickly. By introducing the /readyz endpoint and delaying the shutdown, we can handle ongoing CNI requests more gracefully, reducing the risk of disruptions during critical transitions.
Major thanks to @deads2k for the find, identification, fix, and of course, the explanations. Appreciate it.