handleConnectionChange() in dockerd kills containerd 'silently', without notifying the user
The handleConnectionChange() function in dockerd monitors the health of containerd by periodically sending gRPC 'HealthCheckRequest' messages. If containerd does not respond to these messages for a certain amount of time, dockerd kills and restarts containerd. That amount of time is determined by two hard-coded constants: containerdHealthCheckTimeout and maxConnectionRetryCount. A simple experiment (sending a STOP signal to containerd) demonstrates that dockerd kills containerd after only a few seconds of unresponsiveness.
The issue is that handleConnectionChange() kills containerd 'silently': nothing is logged, so the user has no clue as to what happened or why. The monitorConnection() function in upstream moby code includes some useful improvements in this regard:
- It logs an informative message "killing and restarting containerd".
- It tries to obtain a goroutine stack dump of containerd via SIGUSR1.
Related snippet of code from monitorConnection():
    if system.IsProcessAlive(r.daemonPid) {
        r.logger.WithField("pid", r.daemonPid).Info("killing and restarting containerd")
        // Try to get a stack trace
        syscall.Kill(r.daemonPid, syscall.SIGUSR1)
        <-time.After(100 * time.Millisecond)
        system.KillProcess(r.daemonPid)
    }
Please consider a back-port of these improvements.