Filesystem Sharing

Open tarik02 opened this issue 4 years ago • 52 comments

Hello. I would like to share some ideas and advice about filesystem sharing.

So, since we are using QEMU, let's see which options we have:

  • VirtioFS. Looks very promising and seems to have really good performance, but it works only on Linux hosts. It is heavily optimised for use in virtual machines; it even uses DAX (direct access) for files, so there's no need to copy files over the network: they live in RAM shared between the VM and the host.
  • VirtFS (9P). I've tried it, but it's incredibly slow. Really. Just running git status in a shared directory with a medium-sized project takes at least half a minute. I would rather place the files in the VM, access them via some remote file access protocol, and use vscode with remote access (sadly, that is proprietary). I think it is so slow because it is synchronous: whenever you read a file, do a stat call, etc., you have to wait for that operation to finish.
  • Just sync filesystem state between VM and host.
  • Write a custom FUSE driver with an asynchronous protocol and multithreading support. Theoretically this could be more performant than 9P, but I'm not sure.
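
To make the latency argument in the list above concrete, here is a back-of-the-envelope sketch in Go. The operation count and the per-operation round-trip cost are made-up assumptions, chosen only so that the synchronous case lands near the "half a minute" mentioned for git status; they are not measurements of 9P or of any real workload.

```go
package main

import (
	"fmt"
	"time"
)

// estimate returns a rough wall-clock time for ops file operations when at
// most inFlight requests can be outstanding at once and each request costs
// one round trip of perOp. inFlight == 1 models a fully synchronous client.
func estimate(ops int, perOp time.Duration, inFlight int) time.Duration {
	if inFlight < 1 {
		inFlight = 1
	}
	batches := (ops + inFlight - 1) / inFlight // ceiling division
	return time.Duration(batches) * perOp
}

func main() {
	const ops = 100_000                  // hypothetical number of stat/read calls
	const perOp = 300 * time.Microsecond // hypothetical round-trip cost per call

	fmt.Println("synchronous: ", estimate(ops, perOp, 1))
	fmt.Println("16 in flight:", estimate(ops, perOp, 16))
}
```

The model ignores bandwidth, caching, and server-side contention, so it only shows why overlapping requests could help, not how much it would help in practice.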

tarik02 avatar May 19 '21 07:05 tarik02

My current plan:

  • Step 1: Use 9p over virtio serial: https://github.com/AkihiroSuda/lima/issues/1

  • Step 2: Implement vsock support in QEMU for Darwin, and switch away from virtio serial to vsock

  • Step 3: Implement virtiofs support in QEMU for Darwin

AkihiroSuda avatar May 19 '21 09:05 AkihiroSuda

In parallel, we can consider supporting mutagen integration as well. IIUC it is used by Docker for Mac as well.

https://mutagen.io/

AkihiroSuda avatar May 19 '21 09:05 AkihiroSuda

Step 3: Implement virtiofs support in QEMU for Darwin

Do you think this is possible? It requires being able to share mmap'ped memory pages between host and guest. If this is possible, it would be cool.

tarik02 avatar May 19 '21 11:05 tarik02

Just so you know: There is a patch for Darwin 9p support developed within the Nixpkgs project. I have adapted it to current QEMU from a patchset originally by Keno Fischer.

The people over at Nixpkgs would certainly be interested in any effort to accelerate file sharing using VirtioFS, which is why I’m now subscribed here.

mroi avatar May 21 '21 08:05 mroi

Thanks @mroi !

  • The patch seems to have been proposed upstream in 2018, but what's the current status of upstreaming?
  • Is there / will there be a binary with the patch that can be installed without nix? For both Intel and ARM. (Would it be possible to cross-compile for the ARM target on an Intel host?)

AkihiroSuda avatar May 21 '21 08:05 AkihiroSuda

@mroi Aside from 9p, do you know whether somebody is working on supporting vsock? I would like to use it to reimplement the event notification system that is currently implemented by running ssh -L /run/user/<UID>/lima-guestagent.sock:ga.sock.

AkihiroSuda avatar May 21 '21 08:05 AkihiroSuda

I am preparing to propose the revised patch upstream. Have not had sufficient time to push this forward, but I hope to do this next week. It would certainly be good to have this upstream.

Regarding binaries: Nix has a binary cache, but the packages do not run standalone (they depend on other Nix packages). The patch however should apply to the vanilla QEMU sources, so you should be able to just recompile QEMU 6 with it.

I’m not aware of vsock developments. Unfortunately I’m not familiar with QEMU development internals at all. I just happened to have a need for 9p and worked on this patch.

mroi avatar May 21 '21 08:05 mroi

Thanks!

AkihiroSuda avatar May 21 '21 08:05 AkihiroSuda

The patch seems to have issues 😞 https://github.com/NixOS/nixpkgs/pull/122420#issuecomment-846365328

AkihiroSuda avatar May 22 '21 07:05 AkihiroSuda

Until we can get virtio-9p-pci for macOS hosts in the QEMU upstream, I was planning to use 9P over virtserial, but it doesn't seem as easy as expected 😞

diff --git a/pkg/qemu/qemu.go b/pkg/qemu/qemu.go
index d9b0778..19f0f75 100644
--- a/pkg/qemu/qemu.go
+++ b/pkg/qemu/qemu.go
@@ -154,7 +154,7 @@ func Cmdline(cfg Config) (string, []string, error) {
        // Parallel
        args = append(args, "-parallel", "none")
 
-       // Serial
+       // Legacy Serial
        serialSock := filepath.Join(cfg.InstanceDir, "serial.sock")
        if err := os.RemoveAll(serialSock); err != nil {
                return "", nil, err
@@ -167,7 +167,17 @@ func Cmdline(cfg Config) (string, []string, error) {
        args = append(args, "-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait,logfile=%s", serialChardev, serialSock, serialLog))
        args = append(args, "-serial", "chardev:"+serialChardev)
 
-       // We also want to enable vsock and virtfs here, but QEMU does not support vsock and virtfs for macOS hosts
+       // vport for 9p
+       vportSock := filepath.Join(cfg.InstanceDir, "vport.sock")
+       if err := os.RemoveAll(vportSock); err != nil {
+               return "", nil, err
+       }
+       const vportChardev = "char-vport"
+       args = append(args, "-device", "virtio-serial")
+       args = append(args, "-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait", vportChardev, vportSock))
+       args = append(args, "-device", fmt.Sprintf("virtserialport,chardev=%s,name=lima", vportChardev))
+
+       // TODO: use virtio-9p-pci when QEMU supports it for macOS hosts
 
        // QEMU process
        args = append(args, "-name", "lima-"+cfg.Name)

package main

import (
        "fmt"
        "os"
        "syscall"
)

func main() {
        if err := xmain(); err != nil {
                fmt.Fprintln(os.Stderr, err)
                os.Exit(1)
        }
}

func xmain() error {
        devPath := "/dev/vport5p1"
        mntPath := "/mnt/foo"

        devFd, err := syscall.Open(devPath, syscall.O_RDWR|syscall.O_NONBLOCK, 0600)
        if err != nil {
                return err
        }
        return syscall.Mount(
                "",
                mntPath,
                "9p",
                0,
                fmt.Sprintf("trans=fd,rfdno=%d,wfdno=%d", devFd, devFd),
        )
}

# go run main.go
invalid argument
exit status 1

# dmesg
...
[27353.823239] kernel write not supported for file /vport5p1 (pid: 2243 comm: kworker/3:1)

The error happens because the kernel refuses in-kernel writes to the vport device: the check linked here fails when a file's operations lack write_iter or still wire up the legacy write: https://github.com/torvalds/linux/blob/v5.11/fs/read_write.c#L540-L545

I guess I have to inject some pipe as a "shim" fd, or use ssh instead of virtserial.
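
A minimal sketch of what such a pipe "shim" could look like, written in Go to match the test program above. It relays bytes between the device and two pipes, so that the kernel 9p client would only ever be handed pipe file descriptors. Everything here is an assumption for illustration: the names shim, newShim, and loopback are hypothetical, the demo uses an in-process loopback instead of a real /dev/vport device, and it has not been verified that mounting with these descriptors actually works.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// shim owns the two pipe ends that would be passed to the kernel 9p client.
type shim struct {
	rfd *os.File // the kernel would read 9p replies from this end (rfdno)
	wfd *os.File // the kernel would write 9p requests to this end (wfdno)
}

// newShim relays bytes between dev (for example an opened vport device) and
// two pipes, so that the kernel never writes to the device directly.
func newShim(dev io.ReadWriter) (*shim, error) {
	replyR, replyW, err := os.Pipe() // device -> kernel
	if err != nil {
		return nil, err
	}
	reqR, reqW, err := os.Pipe() // kernel -> device
	if err != nil {
		replyR.Close()
		replyW.Close()
		return nil, err
	}
	go func() { _, _ = io.Copy(replyW, dev); replyW.Close() }()
	go func() { _, _ = io.Copy(dev, reqR); reqR.Close() }()
	return &shim{rfd: replyR, wfd: reqW}, nil
}

// mountOpts builds the same option string as the experiment above, but with
// the pipe descriptors instead of the device descriptor.
func (s *shim) mountOpts() string {
	return fmt.Sprintf("trans=fd,rfdno=%d,wfdno=%d", s.rfd.Fd(), s.wfd.Fd())
}

// loopback is a stand-in device for the demo: whatever is written comes back.
type loopback struct{ r, w *os.File }

func (l loopback) Read(p []byte) (int, error)  { return l.r.Read(p) }
func (l loopback) Write(p []byte) (int, error) { return l.w.Write(p) }

// newLoopback creates the stand-in device from a single pipe.
func newLoopback() (loopback, error) {
	r, w, err := os.Pipe()
	return loopback{r: r, w: w}, err
}

func main() {
	dev, err := newLoopback()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	s, err := newShim(dev)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(s.mountOpts())

	// Pretend to be the kernel: send a request and read the echoed bytes.
	if _, err := s.wfd.Write([]byte("hello")); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	buf := make([]byte, 5)
	if _, err := io.ReadFull(s.rfd, buf); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("relayed: %s\n", buf)
}
```

In a real attempt, dev would be the opened vport file and mountOpts() would be passed to syscall.Mount as in the program above. The extra copy through user space adds overhead, which is one reason using ssh instead may still be the simpler route.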

AkihiroSuda avatar Jun 10 '21 09:06 AkihiroSuda

virtserial seems slower than ssh :(

virtserial

qemu flags: -device virtio-serial -chardev socket,id=foo,path=foo.sock,server,nowait -device virtserialport,chardev=foo,name=foo

[host]$ socat file:urandom-1G unix-connect:foo.sock 
[guest]$ time sha256sum /dev/virtio-ports/foo
real    0m10.022s

ssh

[host] $ cat urandom-1G | lima time sha256sum -
real    0m7.673s
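
For a rough sense of scale, the two timings above can be turned into average throughput. This is only a sketch: it assumes urandom-1G is exactly 1 GiB, and both numbers include the time spent in sha256sum itself, so they understate the raw transport speed.

```go
package main

import "fmt"

// mibPerSec converts a payload size in MiB and an elapsed time in seconds
// into an average throughput.
func mibPerSec(sizeMiB, seconds float64) float64 {
	return sizeMiB / seconds
}

func main() {
	const sizeMiB = 1024.0 // assumption: urandom-1G is exactly 1 GiB

	virtserial := mibPerSec(sizeMiB, 10.022) // "real" time of the virtserial run
	ssh := mibPerSec(sizeMiB, 7.673)         // "real" time of the ssh run

	fmt.Printf("virtserial: %.1f MiB/s\n", virtserial)
	fmt.Printf("ssh:        %.1f MiB/s\n", ssh)
	fmt.Printf("ssh/virtserial: %.2fx\n", ssh/virtserial)
}
```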

AkihiroSuda avatar Jun 23 '21 12:06 AkihiroSuda

virtserial seems slower than ssh

Given that we seem to be stuck with ssh, at least for now, did you check if selecting a different cipher makes any difference for throughput? It seems like we are using the default:

debug1: kex: server->client cipher: [email protected] MAC: <implicit> compression: none
debug1: kex: client->server cipher: [email protected] MAC: <implicit> compression: none

I remember reading that aes128-ctr was twice as fast as [email protected] when running on a CPU with AES instructions, but not sure how much that affects throughput.

I'm somewhat confused though, as even when I add -c aes128-ctr to the ssh command, it still ends up using [email protected], so I don't know how to configure it.
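
One possible explanation, which is an assumption and not something confirmed in this thread, is that the command reused an already established multiplexed connection, in which case the cipher of the existing master connection stays in effect. A sketch of an argument builder that asks for a fresh connection when benchmarking ciphers; the helper name sshArgs, the port, and the host are placeholders:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sshArgs builds an ssh argument list that requests a specific cipher and
// opts out of connection multiplexing, so that the cipher is negotiated on
// a fresh connection instead of being inherited from a running master.
func sshArgs(cipher string, port int, host string, remoteCmd ...string) []string {
	args := []string{
		"-o", "ControlMaster=no",
		"-o", "ControlPath=none",
		"-c", cipher,
		"-p", strconv.Itoa(port),
		host,
	}
	return append(args, remoteCmd...)
}

func main() {
	// Placeholder port and host; adjust to the instance being tested.
	for _, c := range []string{"aes128-ctr", "aes128-gcm@openssh.com"} {
		fmt.Println("ssh " + strings.Join(sshArgs(c, 60022, "127.0.0.1", "sha256sum", "-"), " "))
	}
}
```

Running ssh -Q cipher lists the ciphers the local client supports, and the debug1: kex lines in ssh -v output show which one was actually negotiated.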

jandubois avatar Jun 23 '21 16:06 jandubois

Given that we seem to be stuck with ssh, at least for now, did you check if selecting a different cipher makes any difference for throughput?

Haven't looked into it. Help wanted :pray:

AkihiroSuda avatar Jun 28 '21 09:06 AkihiroSuda

Btw QEMU Samba seems roughly two times faster than sshfs

[root@lima-archlinux ~]# time sha256sum /mnt/smb/ubuntu-21.04-desktop-amd64.iso 
fa95fb748b34d470a7cfa5e3c1c8fa1163e2dc340cd5a60f7ece9dc963ecdf88  /mnt/smb/ubuntu-21.04-desktop-amd64.iso

real    0m15.578s
user    0m7.054s
sys     0m6.259s

[root@lima-archlinux ~]# time sha256sum /tmp/lima/ubuntu-21.04-desktop-amd64.iso 
fa95fb748b34d470a7cfa5e3c1c8fa1163e2dc340cd5a60f7ece9dc963ecdf88  /tmp/lima/ubuntu-21.04-desktop-amd64.iso

real    0m29.707s
user    0m9.244s
sys     0m6.370s

diff --git a/pkg/qemu/qemu.go b/pkg/qemu/qemu.go
index 796c6ee..1524a24 100644
--- a/pkg/qemu/qemu.go
+++ b/pkg/qemu/qemu.go
@@ -193,7 +193,7 @@ func Cmdline(cfg Config) (string, []string, error) {
        // CIDR is intentionally hardcoded to 192.168.5.0/24, as each of QEMU has its own independent slirp network.
        // TODO: enable bridge (with sudo?)
        args = append(args, "-net", "nic,model=virtio")
-       args = append(args, "-net", fmt.Sprintf("user,net=192.168.5.0/24,hostfwd=tcp:127.0.0.1:%d-:22", y.SSH.LocalPort))
+       args = append(args, "-net", fmt.Sprintf("user,net=192.168.5.0/24,hostfwd=tcp:127.0.0.1:%d-:22,smb=/tmp/lima", y.SSH.LocalPort))
 
        // virtio-rng-pci acceralates starting up the OS, according to https://wiki.gentoo.org/wiki/QEMU/Options
        args = append(args, "-device", "virtio-rng-pci")

Needs https://github.com/Homebrew/homebrew-core/pull/80171 to be merged (which may take time)

AkihiroSuda avatar Jun 28 '21 09:06 AkihiroSuda

Samba is now available on homebrew (https://github.com/Homebrew/homebrew-core/blob/master/Formula/samba.rb), so I'm planning to replace sshfs with samba soon.

Samba will be executed with slirp_add_exec: https://github.com/qemu/qemu/blob/4cc10cae64c51e17844dc4358481c393d7bf1ed4/net/slirp.c#L964

So Samba will not listen on any actual TCP port.
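
A small sketch of how the smb= option from the diff above could be made optional, plus the kind of command a guest might use to mount the share. The function names are hypothetical. The server address 192.168.5.4 and the share name qemu are assumptions based on QEMU's documented defaults for user-mode networking (fourth address of the guest network, share called "qemu"); they are not taken from this thread.

```go
package main

import "fmt"

// userNetArg reproduces the "-net user,..." value from the diff above and
// appends the smb= option only when smbDir is non-empty.
func userNetArg(sshLocalPort int, smbDir string) string {
	arg := fmt.Sprintf("user,net=192.168.5.0/24,hostfwd=tcp:127.0.0.1:%d-:22", sshLocalPort)
	if smbDir != "" {
		arg += ",smb=" + smbDir
	}
	return arg
}

// guestMountCmd returns a command the guest might run to mount the share.
// Assumptions: the built-in SMB server answers on 192.168.5.4 and exports
// the directory under the share name "qemu".
func guestMountCmd(mountPoint string) string {
	return fmt.Sprintf("mount -t cifs //192.168.5.4/qemu %s -o guest", mountPoint)
}

func main() {
	fmt.Println("-net", userNetArg(60022, "/tmp/lima"))
	fmt.Println(guestMountCmd("/mnt/smb"))
}
```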

cc @jandubois

AkihiroSuda avatar Jul 19 '21 08:07 AkihiroSuda

@AkihiroSuda That is awesome to hear. From the benchmarks I have seen so far, it is supposedly 2x faster with most workloads.

What would it take for us to support inotify events?

markomitranic avatar Jul 23 '21 22:07 markomitranic

It would be interesting to see some benchmarks between sshfs and virtio-9p-pci and nfs and cifs/smb...

The virtfs authors seemed convinced: https://www.kernel.org/doc/ols/2010/ols2010-pages-109-120.pdf

afbjorklund avatar Sep 11 '21 16:09 afbjorklund

On Linux QEMU using virtfs is a lot faster than sshfs and even 9p. If anything is slowed down on macOS due to kernel lockdown/security don't take it as given for Linux users!

Please always keep Linux users in mind, since lima itself is becoming more and more a tool for cross-platform development (macOS, Linux, WSL/Windows). Especially k8s/k3s is a really nice scenario! Don't base your decisions on macOS only! Thank you!

Docker for Desktop on Windows uses 9p (they switched away from SMB which they used before).

Docs for linux: https://www.linux-kvm.org/page/9p_virtio

Also check out how wimpy uses hardware-acceleration/virt*-kernel modules in Linux w/ qemu v6.x (yup 6.0+!): https://github.com/wimpysworld/quickemu/blob/master/quickemu

markus-geiger avatar Nov 11 '21 10:11 markus-geiger

It's not Docker per se, but WSL2 that uses 9P for any WSL VM: https://devblogs.microsoft.com/commandline/a-deep-dive-into-how-wsl-allows-windows-to-access-linux-files/

I've been testing out WSL2 integration for other team members, and the filesystem performance is still not to a point where it's usable for large projects. For example, PHPStorm hangs and crashes trying to index a basic Drupal 9 codebase over the \\wsl$ share: https://github.com/drud/ddev/issues/3366

To date, mutagen (which is built into ddev) is giving our team the best experience. That work came about because Docker Desktop dropped their alpha mutagen support, and I think it would be really nice if support for it was at the lima (or colima) layer instead of at the next layer up.

I'm not sure it's worth the development effort to only get a relatively small performance improvement. If performance is any more than 20-50% slower than native IO, it's still too slow for most of our uses.

deviantintegral avatar Nov 11 '21 12:11 deviantintegral

What would be needed for integrating mutagen ? It seems like it should work out-of-the-box, over the ssh connection ?

For docker-machine, the support was more "hands-on". The user had to run the scp and rsync and sshfs themselves...

afbjorklund avatar Nov 14 '21 09:11 afbjorklund

Mutagen works, if you know your ssh and mutagen stuff.

I added an entry to ~/.ssh/config to disable host key check and specify the port, to make it easier to interact with rsync/mutagen:

HOST lima
Hostname localhost
Port 60022
StrictHostKeyChecking no
UserKnownHostsFile /dev/null

Then you can use a command like this to create a sync of the current working directory:

lima mkdir -p $PWD # create the needed folders
lima chmod 777 $PWD $(dirname $PWD) # mutagen will write a folder in the parent directory, so you need write access there too.
mutagen create \
  --sync-mode two-way-resolved \
  --stage-mode=neighboring \
  --ignore-vcs \
  --symlink-mode posix-raw \
  --default-file-mode 0666 \
  --default-directory-mode 0777 \
  --name "$(basename $PWD)" \
  $PWD lima:$PWD
mutagen monitor "$(basename $PWD)" # to watch your sync progress

Then stuff like lima nerdctl run -v $PWD:$PWD -w $PWD ubuntu ls will just run as expected. Just make sure your paths in the vm and on your machine are identical.

Nemo64 avatar Nov 15 '21 09:11 Nemo64

This could be a separate "mount" type, perhaps ? Similar to the current shorthand for setting up sshocker (reverse sshfs)

Didn't understand the part about the paths needing to be the same on both sides, but will have to try it myself. (EDIT: it doesn't)

afbjorklund avatar Nov 15 '21 15:11 afbjorklund

It could but... I feel like it won't be intuitive or stable. There are issues with very large numbers of files (multiple projects), write privileges, the number of inotify watches on Linux, conflict resolution, etc. I can manually enable syncs as needed, but I'm not too sure it will be a great fit as a solution in the config file.

But maybe i'm also not the average user ;)

Docker for Mac had mutagen once as well: https://github.com/docker/for-mac/issues/1592#issuecomment-678397258 Their implementation was pretty good; it was integrated into the UI and showed sync progress etc. But you were on your own if it failed, and it used a lot of resources when you synced larger directories.

Nemo64 avatar Nov 15 '21 16:11 Nemo64

At least you can use limactl show-ssh --format config default to set up the config, for host "lima-default" (etc).

To me it actually makes more sense to set up such syncs as needed, than to export your entire home directory...


Unfortunately it seems like mutagen doesn't have a ssh config file (yet?), so it needs to be in the user ssh_config.

  • Issue https://github.com/mutagen-io/mutagen/issues/301

afbjorklund avatar Nov 15 '21 17:11 afbjorklund

Simple enough, though.

  1. Set up an ssh "host": limactl show-ssh --format config default >>~/.ssh/config
$ ssh lima-default uname
Linux
$ ssh lima-default grep PRETTY /etc/os-release
PRETTY_NAME="Ubuntu 21.10"
  2. Set up an ssh "sync": mutagen sync create /tmp/test lima-default:/tmp/test
$ mutagen sync list
--------------------------------------------------------------------------------
Identifier: sync_WxCidKXaUaHGgkXXkRWZp5UBANGKpZlJ8JmJ0VgKepQ
Labels: None
Alpha:
	URL: /tmp/test
	Connection state: Connected
Beta:
	URL: lima-default:/tmp/test
	Connection state: Connected
Status: Watching for changes
--------------------------------------------------------------------------------

Works as advertised.


So if it was included, the YAML file would look something like:

mounts:
  - location: "~"
    writable: false
  - location: "/tmp/lima"
    writable: true
mutagen:
  - location: "/tmp/test"
    mode: "two-way-safe"

anders@lima-default:/tmp$ findmnt -T lima
TARGET    SOURCE     FSTYPE     OPTIONS
/tmp/lima :/tmp/lima fuse.sshfs rw,nosuid,nodev,relatime,user_id=1000,group_id=1000,allow_other
anders@lima-default:/tmp$ findmnt -T test
TARGET SOURCE    FSTYPE OPTIONS
/      /dev/vda1 ext4   rw,relatime,discard,errors=remount-ro

With some arbitrary mapping of the mutagen YAML configuration:

https://mutagen.io/documentation/introduction/configuration#configuration-files

$ mutagen sync list --long
--------------------------------------------------------------------------------
Identifier: sync_WxCidKXaUaHGgkXXkRWZp5UBANGKpZlJ8JmJ0VgKepQ
Labels: None
Configuration:
	Synchronization mode: Default (Two Way Safe)
	Maximum allowed entry count: Default (2⁶⁴−1)
	Maximum staging file size: Default (18 EB)
	Symbolic link mode: Default (Portable)
	Ignore VCS mode: Default (Propagate)
	Ignores: None
Alpha configuration:
	URL: /tmp/test
	Watch mode: Default (Portable)
	Watch polling interval: Default (10 seconds)
	Probe mode: Default (Probe)
	Scan mode: Default (Accelerated)
	Stage mode: Default (Mutagen Data Directory)
	File mode: Default (0600)
	Directory mode: Default (0700)
	Default file/directory owner: Default
	Default file/directory group: Default
Beta configuration:
	URL: lima-default:/tmp/test
	Watch mode: Default (Portable)
	Watch polling interval: Default (10 seconds)
	Probe mode: Default (Probe)
	Scan mode: Default (Accelerated)
	Stage mode: Default (Mutagen Data Directory)
	File mode: Default (0600)
	Directory mode: Default (0700)
	Default file/directory owner: Default
	Default file/directory group: Default
Alpha:
	Connection state: Connected
Beta:
	Connection state: Connected
Status: Watching for changes
--------------------------------------------------------------------------------
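
As a sketch of the mapping, one entry of the proposed mutagen: section could be translated into a mutagen sync create invocation like this. The MutagenSync type and the syncCreateArgs helper are hypothetical names; the lima-<instance> host alias and the --sync-mode flag are the ones used earlier in this thread, and using the same path on both sides is an assumption made for simplicity.

```go
package main

import (
	"fmt"
	"strings"
)

// MutagenSync mirrors one entry of the hypothetical "mutagen:" list.
type MutagenSync struct {
	Location string // path used on both the host and the guest
	Mode     string // e.g. "two-way-safe"; empty leaves mutagen's default
}

// syncCreateArgs returns the arguments for "mutagen sync create" for the
// given Lima instance, using the "lima-<instance>" ssh host alias.
func syncCreateArgs(instance string, s MutagenSync) []string {
	args := []string{"sync", "create"}
	if s.Mode != "" {
		args = append(args, "--sync-mode", s.Mode)
	}
	return append(args, s.Location, "lima-"+instance+":"+s.Location)
}

func main() {
	s := MutagenSync{Location: "/tmp/test", Mode: "two-way-safe"}
	fmt.Println("mutagen " + strings.Join(syncCreateArgs("default", s), " "))
}
```

This still relies on the lima-default entry being present in the user's ssh config, as noted above.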

afbjorklund avatar Nov 15 '21 20:11 afbjorklund

My two cents as the developer of Mutagen:

Mutagen works well if you have access to its full configuration and you can tweak it to your needs (e.g. via manual usage or Mutagen Compose) or if you're using it as part of some higher-level tooling that configures it in a way where it works for 99% of users automatically (e.g. the way ddev does).

It doesn't work well as a general purpose caching mechanism, and that was what really hampered the Docker for Mac integration (i.e. the simple on/off switch design and inability to see what's going on internally).

The inotify issue is something I'm working on addressing via use of fanotify and its enhancements in Linux 5.1+, but it's probably going to be restricted to very controlled environments (e.g. the sidecar container used by Mutagen Compose) because it requires CAP_SYS_ADMIN and CAP_DAC_READ_SEARCH and has somewhat odd path handling behavior with bind mounts.

Conflicts and handling of problematic content have been significantly improved in Mutagen v0.12, so hopefully that's less of an issue.

Anyway, I won't interrupt the discussion further, but feel free to reach out via email ([email protected]) or Slack if you have any questions.

xenoscopic avatar Nov 15 '21 23:11 xenoscopic

My idea was to leave it as opt-in, just wanted it easier to do... I don't think it should be bundled, more treated as a tool (like rsync)

rsync -av /tmp/test/ lima-default:/tmp/test/
rsync -av lima-default:/tmp/test/ /tmp/test/

scp /tmp/test/foo lima-default:/tmp/test/foo
scp lima-default:/tmp/test/bar /tmp/test/bar

  1. Just sync filesystem state between VM and host.

The two biggest lima improvements are probably 1) acknowledging that the VM even exists 2) adding it as a ssh "host"

That opens up for a lot of different use cases, where the "built-in" mounts fall short. As all of them will do, eventually.

afbjorklund avatar Nov 16 '21 06:11 afbjorklund

  • Is there / will there be a binary with the patch, that can be installed without nix? For both Intel and ARM. (Would it be possible to crosscompile ARM target on an Intel host?)

@AkihiroSuda should we consider using go:embed to distribute static qemu-system-ARCH binaries? Of course this would require a Workflow to maintain builds, but it would allow us to

  • Remove the dependency on having qemu installed
  • Apply the 9p support patch to Darwin builds

trevor403 avatar Jan 13 '22 04:01 trevor403

I don't think we should use go:embed, but we can consider putting patched QEMU binaries under /usr/local/share/lima, if it is acceptable for Homebrew/MacPorts/nix.

AkihiroSuda avatar Jan 13 '22 05:01 AkihiroSuda

What is the difference between virtio-fs, virtio-9p, and virtio-blk? And which is simply called virtio?

CodeWithShreyans avatar Jan 13 '22 05:01 CodeWithShreyans