for-linux
docker ps shows containers which are dead already
- [x] This is a bug report
- [ ] This is a feature request
- [ ] I searched existing issues before opening this one
Expected behavior
docker ps should not show containers which have already been killed.
Actual behavior
docker ps shows containers whose main process (PID) no longer exists. The problem goes away when the docker service is restarted.
Steps to reproduce the behavior
We are still not sure when/how this starts to happen but what we see is as follows:
docker ps -a | grep masked-name-import-at-6b65b6ddbd-9nknw
148002af7455 7081d715e0ad "/usr/local/bin/ph..." 4 hours ago Up 4 hours k8s_masked-name-import_masked-name-import-at-6b65b6ddbd-9nknw_default_7b042ec0-1424-11e9-ae7a-0a2f2f061794_39
As we can see, the docker daemon reports the container as running. When we run docker inspect, we see the following:
# docker inspect 148002af7455
[
{
"Id": "148002af7455e538a4b33fd40445f8579df673063b16111547cc300ad3de1242",
"Created": "2019-01-11T05:51:02.880386898Z",
"Path": "/usr/local/bin/php",
"Args": [
"/server/http/cli/index.php",
"--env=staging",
"--module=experiment",
"--controller=cli",
"--action=experiment-import-consumer"
],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 15552,
"ExitCode": 0,
"Error": "",
"StartedAt": "2019-01-11T05:51:03.218747444Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
The inspect output shows that the container PID is 15552, but this PID no longer exists:
# ps -efa | grep 15552
root 14276 3946 0 10:25 pts/1 00:00:00 grep --color=auto 15552
root@ip-172-23-105-205:/proc# ls -la | grep 15552
root@ip-172-23-105-205:/proc#
So the docker daemon is reporting an incorrect status. Once the daemon is restarted, this behaviour is not seen any more.
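The manual check above (compare State.Pid from docker inspect with what exists under /proc) can be sketched as a small script. This is a minimal sketch, not something from the original report: the names `pid_alive` and `stale_containers` are hypothetical, and it assumes the code runs on the Docker host in the host PID namespace, where every live process has a /proc/<pid> directory.

```python
import json
import os
from typing import Callable, List


def pid_alive(pid: int) -> bool:
    """Return True if /proc/<pid> exists (assumes Linux, host PID namespace)."""
    return pid > 0 and os.path.isdir(f"/proc/{pid}")


def stale_containers(inspect_json: str,
                     alive: Callable[[int], bool] = pid_alive) -> List[str]:
    """Return IDs of containers reported as running whose PID is gone.

    inspect_json is the JSON array printed by `docker inspect <id> ...`.
    The `alive` callback is injectable so the logic can be tested
    without touching the real /proc.
    """
    stale = []
    for entry in json.loads(inspect_json):
        state = entry.get("State", {})
        pid = int(state.get("Pid", 0))
        if state.get("Running") and not alive(pid):
            stale.append(entry["Id"])
    return stale
```

One way to feed it is to save the output of `docker inspect $(docker ps -q)` to a file and pass the file contents to `stale_containers`. A check like this only confirms the symptom; it does not explain why the daemon missed the exit.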
Output of docker version:
Client:
Version: 17.03.2-ce
API version: 1.27
Go version: go1.7.5
Git commit: f5ec1e2
Built: Tue Jun 27 03:35:14 2017
OS/Arch: linux/amd64
Server:
Version: 17.03.2-ce
API version: 1.27 (minimum version 1.12)
Go version: go1.7.5
Git commit: f5ec1e2
Built: Tue Jun 27 03:35:14 2017
OS/Arch: linux/amd64
Experimental: false
Output of docker info:
Containers: 68
Running: 64
Paused: 0
Stopped: 4
Images: 69
Server Version: 17.03.2-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-1065-aws
Operating System: Ubuntu 16.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.67 GiB
Name: ip-172-23-105-205
ID: 4LRV:ZCHB:VDDU:TSGP:2P4Q:M7XE:QUY5:MHOA:NCMO:OIJ4:SPQP:7LQM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Additional environment details (AWS, VirtualBox, physical, etc.)
AWS
OS :
NAME="Ubuntu"
VERSION="16.04.5 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.5 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
An additional bit of information: looking for container ID in dockerd logs we see the following messages:
Jan 11 06:51:04 ip-172-23-105-205 dockerd[22916]: time="2019-01-11T06:51:04.084376905Z" level=error msg="containerd: get exit status" error="containerd: process has not exited" id=148002af7455e538a4b33fd40445f8579df673063b16111547cc300ad3de1242 pid=6e9b64203251666eee89c25d1322c810e483366b4777ed4f23b36a80f57eca42 systemPid=11646
Jan 11 06:51:10 ip-172-23-105-205 dockerd[22916]: time="2019-01-11T06:51:10.499071121Z" level=warning msg="container kill failed because of 'container not found' or 'no such process': Cannot kill container 148002af7455e538a4b33fd40445f8579df673063b16111547cc300ad3de1242: rpc error: code = 2 desc = containerd: container not found"
Jan 11 06:51:40 ip-172-23-105-205 dockerd[22916]: time="2019-01-11T06:51:40.500079952Z" level=warning msg="container kill failed because of 'container not found' or 'no such process': Cannot kill container 148002af7455e538a4b33fd40445f8579df673063b16111547cc300ad3de1242: rpc error: code = 2 desc = containerd: container not found"
Jan 11 09:08:28 ip-172-23-105-205 dockerd[22916]: time="2019-01-11T09:08:28.483353612Z" level=warning msg="container kill failed because of 'container not found' or 'no such process': Cannot kill container 148002af7455e538a4b33fd40445f8579df673063b16111547cc300ad3de1242: rpc error: code = 2 desc = containerd: container not found"
Jan 11 09:08:58 ip-172-23-105-205 dockerd[22916]: time="2019-01-11T09:08:58.484795898Z" level=warning msg="container kill failed because of 'container not found' or 'no such process': Cannot kill container 148002af7455e538a4b33fd40445f8579df673063b16111547cc300ad3de1242: rpc error: code = 2 desc = containerd: container not found"
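To see which containers are affected and how often, the repeated "container kill failed" warnings can be counted per container ID. The following is a rough sketch under assumptions: the function name `count_kill_failures` is hypothetical, the regular expression is based only on the log lines quoted in this thread, and other Docker versions may word the message differently.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the warning text followed, somewhere later on the same line,
# by a full 64-character hexadecimal container ID.
KILL_FAILED = re.compile(r"container kill failed.*?\b([0-9a-f]{64})\b")


def count_kill_failures(lines: Iterable[str]) -> Counter:
    """Count 'container kill failed' log lines per container ID."""
    counts: Counter = Counter()
    for line in lines:
        match = KILL_FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

On a systemd host the input could come from something like `journalctl -u docker`, assuming dockerd logs to the journal under that unit name. A container ID that shows up repeatedly here while docker ps still lists it as Up is a candidate for the stale state described in this issue.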
I have the same issue, how did you resolve it?
@nerdherdx since we were using kops, we changed our ec2 images from ubuntu xenial to ubuntu bionic and our docker version from 17.03 to 18.06
We have experienced the same thing on 17.03.2-ce.
@honnix, can you restart the docker daemon? It seems the daemon is not able to update the state.
@VinayKumarKnol Yeah, that did solve the problem, but it happens quite often. Do we know whether 18.x fixed this?
@honnix We have not faced this problem on docker 18.x
@harshal-shah Good to know. Thanks for the confirmation!
Hi @honnix, we experienced a similar issue, but our docker version is 18.09.3.
Still having the exact described problem on docker 18.09.9-ce on Amazon Linux.
Some more details:
The application crashes due to some errors.
The container stays listed as alive, appearing in docker ps and similar commands.
Running docker kill or similar returns no error, but the container remains in docker ps etc.
Running docker restart has no effect.
The only way I've found to solve it is to restart the host (I guess restarting dockerd might have done the job as well).
This is critical, since we are using docker's restart functionality to ensure availability, and since docker doesn't detect that the service crashed, no restart occurs.
I'm facing this problem again in docker 19.03
Client:
Version: 19.03.6-ce
API version: 1.40
Go version: go1.13.4
Git commit: 369ce74
Built: Fri Mar 6 23:25:53 2020
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 19.03.6-ce
API version: 1.40 (minimum version 1.12)
Go version: go1.13.4
Git commit: 369ce74
Built: Fri Mar 6 23:26:25 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.2
GitCommit: ff48f57fc83a8c44cf4ad5d672424a98ba37ded6
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
I am also facing the same problem
Client:
Debug Mode: false
Server:
Containers: 5
Running: 5
Paused: 0
Stopped: 0
Images: 15
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem: <unknown>
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: nvidia runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.3.0-42-generic
Operating System: Ubuntu 18.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 20
Total Memory: 125.5GiB
Name: 3XS-POC10900X
ID: ZG3T:WERA:JK57:Z5J6:45RP:XYGH:TRKU:BHVJ:UDYZ:BYEC:DYZ2:FQ4R
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
We experience the same problem with Azure, containerd version v1.2.6.
Client:
Version: 3.0.7
API version: 1.40
Go version: go1.12.8
Git commit: 578ab52e
Built: Wed Oct 2 20:59:32 2019
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 3.0.7
API version: 1.40 (minimum version 1.12)
Go version: go1.12.8
Git commit: ed20165
Built: Wed Oct 2 18:42:30 2019
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.2.6
GitCommit: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc:
Version: 1.0.0-rc8
GitCommit: 425e105d5a03fabd737a126ad93d62a9eeede87f
docker-init:
Version: 0.18.0
GitCommit: fec3683
We are seeing it too, on our local CentOS 7 VM.
Client: Docker Engine - Community
Version: 19.03.12
API version: 1.40
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:46:54 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.12
API version: 1.40 (minimum version 1.12)
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:45:28 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
I've also encountered this problem.
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4c2199f77395 dcc6f1e61537 "/home/bin/start" About an hour ago Up About an hour nginx-le
405b41e6833f mysql:5.7 "docker-entrypoint.s…" 20 hours ago Up 20 hours mysql
docker kill 405b41e6833f
Error response from daemon: Cannot kill container: 405b41e6833f: Container 405b41e6833f85964da6e6265c50755f952240edb4c106cf1b9386889b180080 is not running
sudo service docker restart
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
405b41e6833f mysql:5.7 "docker-entrypoint.s…" 20 hours ago Up 3 seconds mysql
docker kill 405b41e6833f
405b41e6833f
docker -v
Docker version 19.03.8, build afacb8b7f0
Running on ubuntu 20.04.
Same problem with Docker version 19.03.12-ce, build 48a66213fe on Archlinux (Kernel 5.8.3).
Same here. Is it a containerd bug? https://github.com/containerd/containerd/issues/4547
With loglevel=debug, these lines show up every minute:
dockerd[656]: time="2020-09-10T22:34:47.190957445+02:00" level=debug msg="Calling POST /v1.40/containers/0e9ed449ca51959615a2d74e9c20951b5f769f783f8ad0257e5fdd2450e438ee/kill?signal=URG"
dockerd[656]: time="2020-09-10T22:34:47.191068491+02:00" level=debug msg="Sending kill signal 23 to container 0e9ed449ca51959615a2d74e9c20951b5f769f783f8ad0257e5fdd2450e438ee"
dockerd[656]: time="2020-09-10T22:34:48.198137278+02:00" level=debug msg="container kill failed because of 'container not found' or 'no such process'" action=kill container=0e9ed449ca51959615a2d74e9c20951b5f769f783f8ad0257e5fdd2450e438ee error="process already finished: not found"
Same problem with Docker version 18.09.2, build 6247962
I restarted dockerd and containerd, but it happens again about 5-6 hours later.
root@support:~# docker inspect swinfostatistics | grep Pid
"Pid": 17271,
"PidMode": "",
"PidsLimit": 0,
root@support:~# ps -ef | grep 17271
root 59643 89870 0 07:52 pts/1 00:00:00 grep --color=auto 17271
root@support:~# docker -v
Docker version 18.09.2, build 6247962
It is on Ubuntu Linux, running in VMware.
Linux support 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Bump
Similar issue to the one reported above. Docker version 19.03.13, build 4484c46d9d
docker rm "$(docker ps --all --quiet)"
results in (occasionally, not sure under what context):
Docker Error: No such container
The expectation is ps -a should show all existing containers
@zoombinis we are seeing the same issue as well with following docker version
docker version
Client:
Version: 19.03.6-ce
API version: 1.40
Go version: go1.13.4
Git commit: 369ce74
Built: Fri May 29 04:01:26 2020
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 19.03.6-ce
API version: 1.40 (minimum version 1.12)
Go version: go1.13.4
Git commit: 369ce74
Built: Fri May 29 04:01:57 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.0
GitCommit: 09814d48d50816305a8e6c1a4ae3e2bcc4ba725a
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.18.0
GitCommit: fec3683
I'm not sure why this issue keeps getting closed; I am still seeing this in docker 20.10.14. I suspect this is a zombie process issue and docker is not properly keeping track of all the containers. I expect you'll see more of this kind of issue if the host is busy enough to reuse PIDs.
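If PID reuse is a concern, as suspected above, checking only that /proc/<pid> exists can give a false "alive" result, because an unrelated process may have taken over the number. One possible guard is to check that the process's cgroup path mentions the container ID. This is a sketch under assumptions: the function names are hypothetical, and it assumes the cgroup path of a container process contains the full container ID (as is typical with both the cgroupfs and systemd cgroup drivers), which should be verified on the host in question.

```python
from pathlib import Path


def cgroup_mentions_container(cgroup_text: str, container_id: str) -> bool:
    """True if any line of a /proc/<pid>/cgroup file contains the container ID."""
    return any(container_id in line for line in cgroup_text.splitlines())


def pid_belongs_to_container(pid: int, container_id: str,
                             proc_root: str = "/proc") -> bool:
    """Check that <proc_root>/<pid>/cgroup exists and references the container.

    Returns False if the process is gone or its cgroup file is unreadable.
    """
    try:
        text = Path(proc_root, str(pid), "cgroup").read_text()
    except OSError:
        return False
    return cgroup_mentions_container(text, container_id)
```

For the container from the original report, this would be called with the PID from State.Pid (15552) and the full ID from docker inspect. A False result for a container that docker ps shows as Up matches the behaviour described in this thread.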
I am also having this problem.
I am experiencing the same issue.
Here is the version of Docker I am currently using:
Client:
Version: 20.10.23
API version: 1.41
Go version: go1.18.9
Git commit: 7155243
Built: Tue Apr 11 22:56:36 2023
OS/Arch: linux/amd64
Context: default
Experimental: true
Server:
Engine:
Version: 20.10.23
API version: 1.41 (minimum version 1.12)
Go version: go1.18.9
Git commit: 6051f14
Built: Tue Apr 11 22:57:17 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.19
GitCommit: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f
runc:
Version: 1.1.7
GitCommit: f19387a6bec4944c770f7668ab51c4348d9c2f38
docker-init:
Version: 0.19.0
GitCommit: de40ad0