handleConnectionChange() in dockerd kills containerd 'silently' / without notification to user

Open rh-ulrich-o opened this issue 6 years ago • 0 comments

The handleConnectionChange() function in dockerd monitors the health of containerd by periodically sending gRPC 'HealthCheckRequest' messages. If containerd does not respond to these messages for a certain amount of time, dockerd initiates a restart of containerd. That amount of time is determined by two hard-coded constants: containerdHealthCheckTimeout and maxConnectionRetryCount. A simple experiment (sending a STOP signal to containerd) demonstrates that dockerd kills containerd after only a few seconds of unresponsiveness.

The issue is that handleConnectionChange() kills containerd 'silently', leaving the user with no clue as to what happened. The monitorConnection() function in upstream moby code includes some useful improvements in this regard:

  • It logs an informative message "killing and restarting containerd".
  • It tries to obtain a goroutine stack dump of containerd via SIGUSR1.

Related snippet of code from monitorConnection():

if system.IsProcessAlive(r.daemonPid) {
        r.logger.WithField("pid", r.daemonPid).Info("killing and restarting containerd")
        // Try to get a stack trace
        syscall.Kill(r.daemonPid, syscall.SIGUSR1)
        <-time.After(100 * time.Millisecond)
        system.KillProcess(r.daemonPid)
}

Please consider a back-port of these improvements.

rh-ulrich-o, Jun 25 '18 13:06