Add Limitation: overlay graph driver + linux kernel 4.14+

Open mcastelino opened this issue 8 years ago • 27 comments

Description of problem

Using Clear Containers with the overlay graph driver on Linux kernel 4.14 causes multiple issues: file system accesses may hang, and the system may report out of memory.

This seems to be an issue with overlay+9p+kernel 4.14. Until it has been fixed, Clear Containers should not be run on a host with a 4.14+ kernel and the overlay graph driver.
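A host can be checked against this limitation before a container is started. The following is a minimal sketch, not part of the runtime: the function names `kernel_at_least_4_14` and `is_affected` are made up for illustration, and the test only looks at the major and minor version numbers, so it cannot tell whether a given 4.14+ kernel already carries a fix.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of cc-runtime.

# Succeeds if the kernel release string in $1 (e.g. "4.14.0-rc8") is 4.14 or newer.
kernel_at_least_4_14() {
    release=${1%%-*}              # drop a suffix such as "-generic" or "-rc8"
    major=${release%%.*}
    rest=${release#*.}
    minor=${rest%%.*}
    minor=${minor%%[!0-9]*}       # drop trailing non-digits such as "+"
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 14 ]; }
}

# Succeeds if kernel release $1 combined with graph driver $2 matches the limitation.
is_affected() {
    case "$2" in
        overlay|overlay2) kernel_at_least_4_14 "$1" ;;
        *) return 1 ;;
    esac
}
```

On a Docker host this could be called as `is_affected "$(uname -r)" "$(docker info --format '{{.Driver}}')"`.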

mcastelino avatar Nov 17 '17 22:11 mcastelino

This issue is also seen with overlay2 on Clear Linux

gtkramer avatar Nov 22 '17 21:11 gtkramer

@amshinde I am not able to see this issue with plain qemu (2.9 and 2.10 from the qemu GitHub repository) and kernel (4.14, 4.14-rc8), sharing a tmp folder with lots of symlinks (the host "etc" folder copied, with links preserved, onto the shared folder). Before running any ls command, can you issue "tail -f /var/log/{messages,kernel,dmesg,syslog} &" from the root prompt of the guest, and then issue ls -l /root? Hopefully this will capture some kernel logs.

rarindam avatar Nov 22 '17 23:11 rarindam

@rarindam I tried this with qemu 2.7.1 and saw no messages of interest in dmesg. Is the folder you are sharing on an overlay file system? I can try this with qemu 2.9 and see if I can reproduce.

amshinde avatar Nov 22 '17 23:11 amshinde

@rarindam I guess that you're sharing without relying on overlay, right ? The issue only happens if the shared filesystem relies on overlay.

sboeuf avatar Nov 22 '17 23:11 sboeuf

@sboeuf Yes exactly. I am not using docker. This is plain old Ubuntu host + qemu vm + Ubuntu guest with custom built kernel. The shared folder is copy of my host /etc with symlinks preserved.

rarindam avatar Nov 22 '17 23:11 rarindam

@rarindam Well, in that case there is no issue for us either (even with qemu 2.7). Could you try to create an overlay fs, pass it to your VM, and try again? As @amshinde mentioned, I am curious whether this is reproducible with qemu >= 2.9.

sboeuf avatar Nov 22 '17 23:11 sboeuf

Is there any way I can get a bzip2 file of the overlay filesystem that any of you are using? Thanks in advance.

rarindam avatar Nov 22 '17 23:11 rarindam

@rarindam I just sent you a zip file. If you have docker installed, you can get an overlay rootfs using this:

# create the rootfs directory
mkdir rootfs

# export busybox via Docker into the rootfs directory
docker export $(docker create busybox) | tar -C rootfs -xvf -

amshinde avatar Nov 22 '17 23:11 amshinde

Just tried this with qemu 2.9. I see the same behaviour.

amshinde avatar Nov 23 '17 00:11 amshinde

@amshinde thanks for the confirmation

sboeuf avatar Nov 23 '17 00:11 sboeuf

The first commit which shows the issue is https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b5efccbe0a12a9d5c65c1a60b4270837c7fdb900

miguelinux avatar Nov 24 '17 18:11 miguelinux

@rarindam, @miguelinux has found the commit that changes how overlayfs functions which impacts QEMU. I can grant you access to a system that has this issue for debugging.

gtkramer avatar Nov 27 '17 15:11 gtkramer

@gtkramer @rarindam My bad: that one is the last good commit. The first bad one is https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4edb83bb1041e2f946ce36ea93f6bcd06d984bf4

miguelinux avatar Nov 27 '17 18:11 miguelinux

Hi Archana, I unzipped the rootfs and mounted the folder in a virtual machine after booting. I chrooted into the mounted folder and issued ls -l commands on the different folders. I don't see a crash. I am running qemu-2.9.0 (from https://github.com/qemu/qemu.git, not qemu_lite) and kernel 4.14, and I don't see this issue. I am not very familiar with the overlay system, so I cannot pinpoint whether it's due to overlay or not. But it certainly seems like a docker-specific issue. Arindam

rarindam avatar Nov 27 '17 19:11 rarindam

@rarindam I used qemu-2.9.0 under https://github.com/clearcontainers/qemu/tree/qemu-lite-v2.9.0 as well as qemu 2.10 that is shipped with Clear Linux. I was able to see the issue in both.

I realized now that the tarball docker creates with the steps I gave you does not contain an overlay. You will need docker to create the overlay rootfs, as the rootfs is distributed across several layers created by the overlay driver.

So, if you have docker installed, to verify that the issue is seen with qemu, you will need to create a container first:

$ docker run --runtime=runc -itd debian
$ mount | grep overlay  #grep for overlay rootfs created for the above container
overlay on /var/lib/docker/overlay2/c1cae180a5692979ebacd8f9c98f42fa37e5264aac2c2a7d0312cd08d9691cf4/merged type overlay 

Now pass the above path "/var/lib/docker/overlay2/c1cae180a5692979ebacd8f9c98f42fa37e5264aac2c2a7d0312cd08d9691cf4/merged" to qemu using 9pfs:

sudo /usr/bin/qemu-system-x86_64 \
  -machine pc,accel=kvm,kernel_irqchip,nvdimm \
  -m 256,maxmem=512M,slots=2 \
  -smp 2 \
  -nodefaults \
  -rtc base=utc,driftfix=slew \
  -global kvm-pit.lost_tick_policy=discard \
  -kernel /usr/share/clear-containers/vmlinuz.container \
  -append "reboot=k panic=1 rw tsc=reliable no_timer_check noreplace-smp root=/dev/pmem0p1 init=/usr/lib/systemd/systemd initcall_debug rootfstype=ext4 rootflags=dax,data=ordered dhcp rcupdate.rcu_expedited=1 clocksource=kvm-clock console=hvc0 single iommu=false pci=lastbus=0 nivablecore=20G debug" \
  -device virtio-serial-pci,id=virtio-serial0 \
  -chardev stdio,id=charconsole0 \
  -device virtconsole,chardev=charconsole0,id=console0 \
  -nographic \
  -object memory-backend-file,id=mem0,share,mem-path=/usr/share/clear-containers/clear-containers.img,size=235929600 \
  -device nvdimm,memdev=mem0,id=nv0 \
  -no-reboot \
  -device virtio-9p-pci,fsdev=workload9p,mount_tag=rootfs \
  -fsdev local,id=workload9p,path=/var/lib/docker/overlay2/c1cae180a5692979ebacd8f9c98f42fa37e5264aac2c2a7d0312cd08d9691cf4/merged,security_model=none
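
The merged directory and the 9p-related qemu arguments can also be derived with a small script instead of being copied by hand. This is a sketch under assumptions: the function names are hypothetical, `overlay_mounts` relies on the "overlay on PATH type overlay" layout of the mount output shown above (paths containing spaces are not handled), and the fsdev id and mount tag are simply the ones from the qemu command line above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print the mount point of every overlay mount found in `mount` output read from stdin.
overlay_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "overlay" { print $3 }'
}

# Print the qemu arguments that share directory $1 with the guest over 9p.
ninep_args() {
    printf '%s\n' "-device virtio-9p-pci,fsdev=workload9p,mount_tag=rootfs -fsdev local,id=workload9p,path=$1,security_model=none"
}
```

For example, `ninep_args "$(mount | overlay_mounts | head -n 1)"` prints the last two options of the qemu command for the first overlay mount on the host.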

Similarly, you can see the above issue by directly running a Clear Containers container:

$ docker run -it --runtime=cc-runtime debian
#ls /usr

I can provide you a machine with the docker setup; I think that will be the easiest way for you to reproduce the issue.

amshinde avatar Nov 27 '17 20:11 amshinde

@miguelinux Is there a bug open for the issue upstream?

amshinde avatar Nov 27 '17 20:11 amshinde

Hi Archana, Let's have a debug session over Skype.

But before that: CC passes 9p-related parameters to the CC kernel at boot time. Please add debug=0x888 to them. For reference, see https://www.kernel.org/doc/Documentation/filesystems/9p.txt.

Let me know if you are able to do that. Arindam

rarindam avatar Nov 27 '17 20:11 rarindam

@amshinde Have we tried different caching settings?

mcastelino avatar Nov 27 '17 23:11 mcastelino

@mcastelino Just tried with cache=mmap, fscache and loose. It does not help.
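
For reference, the cache modes tried here are selected with the `cache=` option of the 9p mount. A minimal sketch that prints one mount command per mode: the `trans=virtio` transport and the `rootfs` mount tag match the qemu command earlier in the thread, while the mount point `/mnt/rootfs` and the function name are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: prints (does not run) a 9p mount command for each cache mode given.
# /mnt/rootfs is a hypothetical mount point.
ninep_cache_cmds() {
    for mode in "$@"; do
        echo "mount -t 9p -o trans=virtio,cache=$mode rootfs /mnt/rootfs"
    done
}

ninep_cache_cmds mmap fscache loose
```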

amshinde avatar Nov 27 '17 23:11 amshinde

@rarindam That debug option needs to be passed while mounting the 9p share, right? I was able to do that.
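
For 9p, `debug` is a mount option that holds a bitmask. The sketch below decodes a mask into flag names; the bit meanings are quoted from memory of the 9p.txt document linked earlier in the thread and should be checked against the kernel tree in use, and the function name and short flag labels are made up for illustration.

```shell
#!/bin/sh
# Print the 9p debug flags set in the mask $1, one per line.
# Bit meanings follow Documentation/filesystems/9p.txt as recalled; verify before relying on them.
decode_9p_debug() {
    mask=$(( $1 ))
    for entry in \
        0x001:error 0x002:current 0x004:9p-trace 0x008:vfs-trace \
        0x010:marshalling 0x020:rpc 0x040:transport 0x080:allocation \
        0x100:protocol-messages 0x200:fid 0x400:packet 0x800:fscache
    do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $(( mask & $bit )) -ne 0 ]; then
            echo "$name"
        fi
    done
}

decode_9p_debug 0x888
```

Under those assumptions, 0x888 selects the VFS trace, allocation and fscache bits. A mount using it could look like `mount -t 9p -o trans=virtio,debug=0x888 rootfs /mnt/rootfs`, where the mount point is hypothetical.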

amshinde avatar Nov 27 '17 23:11 amshinde

@amshinde no issue open upstream.

Also I did a test with a VM and 9P + OverlayFS with QEMU 2.10.1 and 4.14 and no issues found.

Could it be our QEMU-lite?

miguelinux avatar Nov 27 '17 23:11 miguelinux

@amshinde I modified the configuration.toml file to use QEMU 2.10.1.

The issue is still there :-(

Our QEMU-lite is not the culprit.

miguelinux avatar Nov 28 '17 00:11 miguelinux

@rarindam Tried with 4.14.1 kernel in the guest, still see the issue :(

amshinde avatar Nov 28 '17 01:11 amshinde

Hi Yang, Pulling you in this email thread to keep everybody updated.

With 4.14 as host, 4.14 as guest, and qemu 2.9: following Archana's instructions, I created the overlay, populated the rootfs, and mounted it in the guest. I see the issue.

Enabling debug mode when mounting from the guest shows that any ls command on the mounted FS goes into an infinite loop. However, if I read a specific file directly under the mounted FS, there seems to be no issue.

It also seems that the issue is at the top-level mount folder only. If I issue "ls -l tmp/rootfs/bin" or "ls -l tmp/rootfs/etc", there is no hang.

But if I issue ls -l tmp/rootfs, there is an infinite loop. It seems that either OverlayFS is somehow returning its root-level directory info differently, or 9PFS is assuming something non-standard.
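
The difference between the top-level directory and its subdirectories can be checked without tying up a terminal by putting a time limit on each listing. A minimal sketch, assuming GNU coreutils `timeout` is available in the guest and that the looping ls can be stopped by a signal; `probe_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# List directory $1 with a time limit of $2 seconds (default 10).
# Prints "ok", "hang" (time limit reached) or "error" (ls failed for another reason).
probe_dir() {
    timeout "${2:-10}" ls -l "$1" >/dev/null 2>&1
    case $? in
        0)   echo "ok" ;;
        124) echo "hang" ;;     # exit status timeout uses when the limit is hit
        *)   echo "error" ;;
    esac
}
```

With the paths quoted above, this could be run as `for d in tmp/rootfs/bin tmp/rootfs/etc tmp/rootfs; do echo "$d: $(probe_dir "$d")"; done`.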

Miguel also pointed out that the issue seems to appear after a commit in OverlayFS, between 4.13 and 4.14: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4edb83bb1041e2f946ce36ea93f6bcd06d984bf4

Thanks, Arindam

rarindam avatar Nov 29 '17 01:11 rarindam

@rarindam @amshinde @mcastelino

A patch to fix this issue in 4.14 is at https://marc.info/?l=linux-unionfs&m=151193367801434

This patch fixes the issue with OverlayFS and 9P FS. BTW, this issue was a bug in OverlayFS, not in CC. The report email is at: https://marc.info/?l=linux-unionfs&m=151192916900486&w=2

I already applied this patch to Clear Linux.

miguelinux avatar Nov 29 '17 17:11 miguelinux

Thanks for the fix Miguel. That was blazing fast for a kernel fix.

rarindam avatar Nov 29 '17 20:11 rarindam

@miguelinux Thanks for the fix!

amshinde avatar Nov 29 '17 21:11 amshinde