libnetwork

docker-proxy cannot exit after being sent SIGINT

Open mingfukuang opened this issue 6 years ago • 7 comments

Signed-off-by: mingfukuang [email protected]

- What I did: When a container that has a docker-proxy is stopped or deleted, stopping the container calls Unmap() to release the port. Unmap() first acquires pm.lock.Lock(), then stops docker-proxy, and then calls pm.lock.Unlock(). In some situations (which I cannot reproduce), stopping docker-proxy hangs at:

goroutine [78795220]
 0  0x000000000065e5c5 in syscall.Syscall6
    at /usr/local/go/src/syscall/asm_linux_amd64.s:45
 1  0x00000000004b914c in os.(*Process).blockUntilWaitable
    at /usr/local/go/src/os/wait_waitid.go:28
 2  0x00000000004b28eb in os.(*Process).wait
    at /usr/local/go/src/os/exec_unix.go:22
 3  0x00000000004b0e6b in os.(*Process).Wait
    at /usr/local/go/src/os/doc.go:49
 4  0x000000000094710d in os/exec.(*Cmd).Wait
    at /usr/local/go/src/os/exec/exec.go:434
 5  0x0000000000daadd6 in github.com/docker/libnetwork/portmapper.(*proxyCommand).Stop
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/portmapper/proxy.go:98
 6  0x0000000000da98a3 in github.com/docker/libnetwork/portmapper.(*PortMapper).Unmap
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/portmapper/mapper.go:185
 7  0x000000000093144f in github.com/docker/libnetwork/drivers/bridge.(*bridgeNetwork).releasePort
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/drivers/bridge/port_mapping.go:133
 8  0x0000000000930e91 in github.com/docker/libnetwork/drivers/bridge.(*bridgeNetwork).releasePortsInternal
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/drivers/bridge/port_mapping.go:113
 9  0x0000000000930d1e in github.com/docker/libnetwork/drivers/bridge.(*bridgeNetwork).releasePorts
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/drivers/bridge/port_mapping.go:105
10  0x0000000000926907 in github.com/docker/libnetwork/drivers/bridge.(*driver).RevokeExternalConnectivity
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/drivers/bridge/bridge.go:1288
11  0x0000000000805bbc in github.com/docker/libnetwork.(*endpoint).sbLeave
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/endpoint.go:688
12  0x00000000008049eb in github.com/docker/libnetwork.(*endpoint).Leave
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/endpoint.go:644
13  0x00000000008250c2 in github.com/docker/libnetwork.(*sandbox).delete
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/sandbox.go:227
14  0x0000000000824d20 in github.com/docker/libnetwork.(*sandbox).Delete
    at /usr/src/debug/docker-engine/vendor/src/github.com/docker/libnetwork/sandbox.go:188
15  0x000000000053d872 in github.com/docker/docker/daemon.(*Daemon).releaseNetwork
    at /usr/src/debug/docker-engine/.gopath/src/github.com/docker/docker/daemon/container_operations.go:808
16  0x000000000058cb6a in github.com/docker/docker/daemon.(*Daemon).Cleanup
    at /usr/src/debug/docker-engine/.gopath/src/github.com/docker/docker/daemon/start.go:197
17  0x000000000057b147 in github.com/docker/docker/daemon.(*Daemon).StateChanged
    at /usr/src/debug/docker-engine/.gopath/src/github.com/docker/docker/daemon/monitor.go:64
18  0x00000000005c5886 in github.com/docker/docker/libcontainerd.(*container).handleEvent.func1
    at /usr/src/debug/docker-engine/.gopath/src/github.com/docker/docker/libcontainerd/container_linux.go:224
19  0x00000000005c5da0 in github.com/docker/docker/libcontainerd.(*queue).append.func1
    at /usr/src/debug/docker-engine/.gopath/src/github.com/docker/docker/libcontainerd/queue_linux.go:26  

I have the whole stack file, but it is too big (89M), so I did not attach it here.

The corresponding code is as follows:

func (p *proxyCommand) Stop() error {
	if p.cmd.Process != nil {
		if err := p.cmd.Process.Signal(os.Interrupt); err != nil {
			return err
		}
		return p.cmd.Wait() // in some situations docker-proxy never exits, and this call hangs
	}
	return nil
}

Once a container cannot stop its docker-proxy, the pm.lock mentioned above is never released, so the container cannot be stopped, and all other operations on that container may hang as well.
Furthermore, if other containers enter the stop or delete path, they also need to acquire the global pm.lock and get stuck. As a result, other operations on those containers, such as docker inspect and docker exec, also hang.

- How I did it: When the situation described above happens, protective measures are added so that the global lock (pm.lock) cannot remain held indefinitely.

mingfukuang, Jul 19 '19 09:07