cc-oci-runtime icon indicating copy to clipboard operation
cc-oci-runtime copied to clipboard

docker rm -f reports 'Driver devicemapper failed to remove root filesystem'

Open grahamwhaley opened this issue 8 years ago • 26 comments

I got the following error (when trying to run https://github.com/01org/cc-oci-runtime/blob/master/tests/metrics/density/docker_memory_usage.sh.in under https://github.com/01org/ciao/tree/master/testutil/ciao-down (which runs Ubuntu under a VM)): Error response from daemon: Driver devicemapper failed to remove root filesystem 15d6062951f42e2a00206847c3846a7dbec38e31341d5640344a328515647c71: Device is Busy

After some digging I managed to re-create the issue by running:

 sudo docker run --name fred -tid busybox sh
 sudo docker run --name fred -tid busybox sh
 sudo docker rm -f fred john

Interesting points:

  • it appears you have to run docker as root to make this happen
  • you have to try to delete more than one container in a single command (individual docker rm -f calls work fine)
  • if you switch the storage driver on docker to overlay (adding --storage-driver=overlay to the systemd file), then the problem does not manifest
  • I tried this with runc, and it seemed to work

grahamwhaley avatar Apr 05 '17 16:04 grahamwhaley

@devimc fyi, don't know if you want to take this one or not

grahamwhaley avatar Apr 05 '17 16:04 grahamwhaley

Just to add some more info. re-confirmed with a clean ciao-down image, so that is using:

  • Docker version 1.12.1, build 23cf638
  • cc-oci-runtime version: 2.1.2 commit: 1778df4bd7660fdc72f19f2b6ef9b1451ebf12d5

grahamwhaley avatar Apr 05 '17 16:04 grahamwhaley

@grahamwhaley does this issue only occurs in ciao-down? if so could you report it to ciao-down maintainers ?

devimc avatar Apr 27 '17 13:04 devimc

@devimc good question - I dunno - but I should be in a position to try it out I think. I'll give it a try and get back here...

grahamwhaley avatar Apr 27 '17 14:04 grahamwhaley

@devimc And let me set my Clear Linux box up for devicemapper (if I can), as it is using overlay, so of course that worked....

grahamwhaley avatar Apr 27 '17 14:04 grahamwhaley

H @devimc Looks like I reproduced this on Clear Linux 14740 with cc-oci-runtime v2.1.1 by setting the --storage-driver=devicemapper in the systemd docker file (away from the Clear Linux default of overlay).

grahamwhaley avatar Apr 27 '17 15:04 grahamwhaley

@grahamwhaley thanks, I'm going to dig on this issue

devimc avatar Apr 27 '17 16:04 devimc

Query was 'does this happen with runc and devicemapper' - I will check...

grahamwhaley avatar May 03 '17 17:05 grahamwhaley

Update. Tested with Docker 1.2.1, CC 2.1.4 where the problem can be seen with --runtime=cor, and it does not happen with --runtime=runc.

grahamwhaley avatar May 04 '17 19:05 grahamwhaley

@grahamwhaley and @jcvenegas , I also noticed this issue Error response from daemon: Driver devicemapper failed to remove root filesystem ba63c5a01d9aac3e4b9371a81455dd041ae6fb2763f66e7bf230ced30a31ae4e: failed to remove device 85bfa90b9e766e6c8b6f0892fa03a45d4e233c166438718e6df23bbb6d29e0b8: Device is Busy

the configuration that I have is the following cc-oci-runtime version: 2.1.5 spec version: 1.0.0-rc1 commit: 1a255080da490126d21577b0cfbc43b993081a2b Docker version 17.04.0-ce, build 4845c56 Centos 7.0

That happen when I am trying to remove a container that has the status of exited CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES b924d366f2d6 gabyct/nuttcp "bash -c '/root/nu..." 5 minutes ago Exited (1) 4 minutes ago nuttcp-client

GabyCT avatar May 05 '17 20:05 GabyCT

May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.188935544-04:00" level=debug msg="devmapper: DeleteDevice START(hash=a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9 syncDelete=false)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.188954805-04:00" level=debug msg="devmapper: issueDiscard START(device: a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)." May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.188962788-04:00" level=debug msg="devmapper: activateDeviceIfNeeded(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.262399052-04:00" level=debug msg="devmapper: issueDiscard END(device: a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)." May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.262440787-04:00" level=debug msg="devmapper: deactivateDevice START(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.262506815-04:00" level=debug msg="devmapper: removeDevice START(docker-253:0-1837118-a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.283833555-04:00" level=debug msg="devmapper: removeDevice END(docker-253:0-1837118-a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.283857748-04:00" level=debug msg="devmapper: deactivateDevice END(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.288824217-04:00" level=debug msg="devmapper: unregisterDevice(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.290739633-04:00" level=debug msg="devmapper: DeleteDevice END(hash=a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9 syncDelete=false)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.290884836-04:00" level=debug msg="devmapper: DeleteDevice START(hash=a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init syncDelete=false)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.290902072-04:00" level=debug msg="devmapper: issueDiscard START(device: a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)." May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.290913320-04:00" level=debug msg="devmapper: activateDeviceIfNeeded(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.365751216-04:00" level=debug msg="devmapper: issueDiscard END(device: a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)." May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.365789624-04:00" level=debug msg="devmapper: deactivateDevice START(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.365889762-04:00" level=debug msg="devmapper: removeDevice START(docker-253:0-1837118-a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.389828431-04:00" level=debug msg="devmapper: removeDevice END(docker-253:0-1837118-a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.389866955-04:00" level=debug msg="devmapper: deactivateDevice END(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.396662825-04:00" level=debug msg="devmapper: unregisterDevice(a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init)" May 05 08:23:38 localhost.localdomain dockerd[8666]: time="2017-05-05T08:23:38.398666828-04:00" level=debug msg="devmapper: DeleteDevice END(hash=a5529ebd17e292dfb629be5e01c25893e81c7ef35084772cba457a1c0e8062d9-init syncDelete=false)" May 05 08:24:23 localhost.localdomain dockerd[8666]: time="2017-05-05T08:24:23.957736061-04:00" level=debug msg="clean 2 unused exec commands" May 05 08:24:52 localhost.localdomain dockerd[8666]: time="2017-05-05T08:24:52.818157529-04:00" level=debug msg="Calling GET /_ping"

GabyCT avatar May 05 '17 20:05 GabyCT

@jcvenegas even using the overlay option in centos 7, the error continues

GabyCT avatar May 08 '17 15:05 GabyCT

@GabyCT Still same message : "Error response from daemon: Driver devicemapper failed to remove root filesystem ID: Device is Busy" ?

jcvenegas avatar May 08 '17 16:05 jcvenegas

@jcvenegas yes, the same error

GabyCT avatar May 08 '17 16:05 GabyCT

@GabyCT thanks for confirm , would not expect to see device-mapper issues with overlay. I'll take a look

jcvenegas avatar May 08 '17 16:05 jcvenegas

@jcvenegas should we increase to P1 this bug? currently this is affecting pnp network measurements

GabyCT avatar May 08 '17 22:05 GabyCT

by the way, I tested using runc and the problem is not there even using device mapper or overlay

GabyCT avatar May 08 '17 22:05 GabyCT

@GabyCT if this is blocker for the CI yes, or skip the new test that are triggering the issue

jcvenegas avatar May 09 '17 13:05 jcvenegas

@jcvenegas the issue is that is not a particular test, the problem is present when we are running all the tests

GabyCT avatar May 09 '17 14:05 GabyCT

@GabyCT what is the last known version that worked for CentOS ?

jcvenegas avatar May 09 '17 16:05 jcvenegas

I tried to reproduce the issue something:

  • I could not reproduce it using cc-oci-runtime.sh ( probably a race condition ? )
  • I only can reproduce the issue the first time I run docker ( /var/lib/docker clean )

jcvenegas avatar May 09 '17 16:05 jcvenegas

@grahamwhaley @GabyCT I can reproduce the steps from @grahamwhaley description only with docker 1.12 using docker 17.05 and cc-oci-runtiem 2.1.8 I cannot reproduce the issue anymore.

jcvenegas avatar May 17 '17 14:05 jcvenegas

@jcvenegas OK. I normally see this when running the runtime density metrics tests as they 'rm' a list of containers all in one go, and have to be run as root. Sooo, I'm happy to leave this here open for a bit and see if this either does or does not re-occur when we start running those tests on 17.05 and above. Failing that, the real path to close this would be a bisect on docker versions or a trawl of the docker history/changes I suspect - not enticing...

grahamwhaley avatar May 30 '17 17:05 grahamwhaley

Hi @devimc @GabyCT @jcvenegas - is this still a problem we see? did/do we have any more info on why etc.? I only ask as it is still listed over in https://github.com/clearcontainers/tests/pull/381 https://github.com/clearcontainers/tests/pull/381/files#diff-51e815c00956126952247d59c8a047bdR76

for instance...

grahamwhaley avatar Aug 15 '17 19:08 grahamwhaley

@grahamwhaley this is not present in CC 3.0 with the new agent, however, in CC 2.x is still happening

GabyCT avatar Aug 16 '17 21:08 GabyCT

thanks @GabyCT OK, as the tests in https://github.com/clearcontainers/tests/tree/master/metrics should work for both CC3.x and CC2.x, I think we'll leave the warnings in there for the minute, and pointing at the CC2.x repo to indicate they are CC2.x related.

grahamwhaley avatar Aug 17 '17 08:08 grahamwhaley