coreos-assembler

switch from 9p to virtiofs

Open cgwalters opened this issue 5 years ago • 9 comments

qemu5 is in f33, so once we rebase coreos-assembler to that, we can port our usage of 9p to virtiofs, which will solve a bunch of problems:

  • access to symlinks
  • works with RHEL8 (I think, it's planned at least)

cgwalters avatar Oct 28 '20 16:10 cgwalters

We should investigate reusing (ideally) or stealing (less ideal) code from https://github.com/kata-containers/kata-containers/blob/d22c7cf00b30a6791288dbf911627a78872f78ff/src/runtime/virtcontainers/virtiofsd.go

cgwalters avatar Dec 11 '20 21:12 cgwalters

I experimented with this briefly last week (though it wasn't because of this issue). Here's the hack I was using:

diff --git a/src/cmdlib.sh b/src/cmdlib.sh
index 4ad8645ba..1a5eac2ad 100755
--- a/src/cmdlib.sh
+++ b/src/cmdlib.sh
@@ -634,12 +634,19 @@ EOF
     kola_args=(kola qemuexec -m "${COSA_SUPERMIN_MEMORY:-${memory_default}}" --auto-cpus -U --workdir none \
                --console-to-file "${runvm_console}")
 
+    sudo /usr/libexec/virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source="${workdir}" &
+    sleep 4
+    sudo chmod 777 /var/run/vm001-vhost-fs.sock
     base_qemu_args=(-drive 'if=none,id=root,format=raw,snapshot=on,file='"${vmbuilddir}"'/root,index=1' \
+                    -m 4G,maxmem=4G \
                     -device 'virtio-blk,drive=root'
                     -kernel "${vmbuilddir}/kernel" -initrd "${vmbuilddir}/initrd" \
                     -no-reboot -nodefaults \
                     -device virtio-serial \
-                    -virtfs 'local,id=workdir,path='"${workdir}"',security_model=none,mount_tag=workdir' \
+                    -chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
+                    -device vhost-user-fs-pci,chardev=char0,tag=workdir \
+                    -object memory-backend-memfd,id=mem,size=4G,share=on \
+                    -numa node,memdev=mem \
                     -append "root=/dev/vda console=${DEFAULT_TERMINAL} selinux=1 enforcing=0 autorelabel=1" \
                    )
 
diff --git a/src/supermin-init-prelude.sh b/src/supermin-init-prelude.sh
index d8f7b3106..888404f99 100644
--- a/src/supermin-init-prelude.sh
+++ b/src/supermin-init-prelude.sh
@@ -15,7 +15,7 @@ mount -t tmpfs tmpfs /dev/shm
 LANG=C /sbin/load_policy  -i
 
 # load kernel module for 9pnet_virtio for 9pfs mount
-/sbin/modprobe 9pnet_virtio
+/sbin/modprobe virtiofs
 
 # need fuse module for rofiles-fuse/bwrap during post scripts run
 /sbin/modprobe fuse
@@ -33,10 +33,8 @@ fi
 umask 002
 
 # set up workdir
-# For 9p mounts set msize to 100MiB
-# https://github.com/coreos/coreos-assembler/issues/2171
 mkdir -p "${workdir:?}"
-mount -t 9p -o rw,trans=virtio,version=9p2000.L,msize=10485760 workdir "${workdir}"
+mount -t virtiofs workdir "${workdir}"
 # These two invocations pair with virtfs setups for qemu in cmdlib.sh.  Keep them in sync.
 if [ -L "${workdir}"/src/config ]; then
     mkdir -p "$(readlink "${workdir}"/src/config)"

This was mostly lifted from the man page.
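
Two parts of the hack above could be made less fragile. This is a rough sketch only: the helper names `wait_for_socket` and `virtiofs_qemu_args` are hypothetical and do not exist in coreos-assembler. The first polls for the virtiofsd socket instead of relying on a fixed `sleep 4`. The second derives `-m` and the shared memory backend size from a single value, since the vhost-user device needs guest RAM in a shared backend and QEMU expects the NUMA node memory to add up to the RAM size.

```shell
# Hypothetical helpers for illustration; not part of coreos-assembler.

# Poll until a Unix socket exists at $1, giving up after $2 seconds
# (default 10). Returns 0 once the socket is there, 1 on timeout.
wait_for_socket() {
    local sock=$1 timeout=${2:-10} waited=0
    while [ ! -S "${sock}" ]; do
        if [ "${waited}" -ge "${timeout}" ]; then
            echo "timed out waiting for ${sock}" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Print the QEMU options for one virtiofs share, one argument per line.
# $1 = virtiofsd socket path, $2 = mount tag, $3 = guest memory (e.g. 4G).
# The same size is used for -m and for the memfd backend so the two
# cannot drift apart.
virtiofs_qemu_args() {
    local sock=$1 tag=$2 mem=$3
    printf '%s\n' \
        -m "${mem}" \
        -object "memory-backend-memfd,id=mem,size=${mem},share=on" \
        -numa "node,memdev=mem" \
        -chardev "socket,id=char0,path=${sock}" \
        -device "vhost-user-fs-pci,chardev=char0,tag=${tag}"
}
```

In a bash script such as `cmdlib.sh`, the output could be read into an array with `mapfile -t virtiofs_args < <(virtiofs_qemu_args "${sock}" workdir 4G)` and appended to `base_qemu_args`, with `wait_for_socket "${sock}"` called after starting virtiofsd in the background. The tag passed here has to match the one used in the guest's `mount -t virtiofs` call.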

I think the biggest problem I see right now (at least from reading the man page and experience) is that it requires root (sudo). That's the only benefit of staying on 9p that I can see.

I didn't take the experiment further to see if I could get rid of the cache.qcow2 altogether, but I suspect maybe we can.

dustymabe avatar May 11 '22 01:05 dustymabe

> I think the biggest problem I see right now (at least from reading the man page and experience) is that it requires root (sudo).

That may be true for virtiofsd today, but I don't think it's true for virtiofs in general.

cgwalters avatar Jun 02 '22 13:06 cgwalters

> That may be true for virtiofsd today, but I don't think it's true for virtiofs in general.

Is there any other implementation that doesn't have that limitation or maybe an RFE for virtiofsd to support non-root?

dustymabe avatar Jun 02 '22 14:06 dustymabe

https://lore.kernel.org/all/[email protected]/T/

cgwalters avatar Jun 02 '22 14:06 cgwalters

Wow. Thanks for starting that discussion and linking to it here. It sounds like there is a short term path and a longer term path (a bit more work but more complete).

I'm excited for when it lands.

dustymabe avatar Jun 02 '22 15:06 dustymabe

cc https://lore.kernel.org/qemu-devel/[email protected]/T/#u

cgwalters avatar Sep 10 '22 15:09 cgwalters

After asking around I was advised that my initial email should have gone to the virtiofs list. We had a big round of discussion and the result seems to be a commitment from that team to add openat2 support that should work for us.

https://lore.kernel.org/qemu-devel/[email protected]/T/#mf51ede15eb476e37f2aa5352df727fdcd5f702c6

cgwalters avatar Sep 30 '22 12:09 cgwalters

Nice. Thanks for carrying forward this discussion! I'm guessing the already existing unprivileged Rust virtiofsd that they mention in the mailing list thread doesn't work for us (yet) because of user namespaces in our build env?

dustymabe avatar Sep 30 '22 13:09 dustymabe

This is in progress over in https://github.com/coreos/coreos-assembler/pull/3428; it's just blocked on an upstream PR making it into a release.

dustymabe avatar May 09 '23 20:05 dustymabe

Looks like a new release of virtiofsd is in testing: https://bodhi.fedoraproject.org/updates/FEDORA-2023-5167ce8181

dustymabe avatar Jul 23 '23 04:07 dustymabe