lima
Filesystem Sharing
Hello. I would like to share some ideas and advice about filesystem sharing.
Since we are using QEMU, let's see which options we have:
- VirtioFS. Looks very promising and seems to have really good performance, but it works only on Linux hosts. It is heavily optimized for use in virtual machines: it even uses DAX (direct access) for files, so there is no need to copy files over the network; they simply live in RAM shared between the VM and the host.
- VirtFS (9P). I've tried to use it, but it's incredibly slow. Really: just running git status in a shared directory with a medium-sized project takes at least half a minute. I would rather place the files in the VM and access them via some remote file access protocol, and use VSCode with remote access (sad, since that is proprietary). I think it is so slow because it is synchronous: whenever you read a file, do a stat call, etc., you have to wait for that operation to finish.
- Just sync filesystem state between the VM and the host.
- Write a custom FUSE driver with an asynchronous protocol and multithreading support. Theoretically this could be more performant than 9P, but I'm not sure.
My current plan:
- Step 1: Use 9p over virtio serial: https://github.com/AkihiroSuda/lima/issues/1
- Step 2: Implement vsock support in QEMU for Darwin, and switch away from virtio serial to vsock
- Step 3: Implement virtiofs support in QEMU for Darwin
In parallel, we can consider supporting mutagen integration as well. IIUC it is used by Docker for Mac as well.
https://mutagen.io/
Step 3: Implement virtiofs support in QEMU for Darwin
Do you think this is possible? It requires being able to share mmap'ed memory pages between the host and the guest. If this is possible, it would be cool.
Just so you know: There is a patch for Darwin 9p support developed within the Nixpkgs project. I have adapted it to current QEMU from a patchset originally by Keno Fischer.
The people over at Nixpkgs would certainly be interested in any effort to accelerate file sharing using VirtioFS, which is why I’m now subscribed here.
Thanks @mroi !
- The patch seems to have been proposed upstream in 2018; what's the current status of upstreaming?
- Is there / will there be a binary with the patch that can be installed without Nix? For both Intel and ARM. (Would it be possible to cross-compile the ARM target on an Intel host?)
@mroi Aside from 9p, do you know whether somebody is working on supporting vsock?
For reimplementing the event notification system that is currently implemented by running ssh -L /run/user/<UID>/lima-guestagent.sock:ga.sock.
I am preparing to propose the revised patch upstream. Have not had sufficient time to push this forward, but I hope to do this next week. It would certainly be good to have this upstream.
Regarding binaries: Nix has a binary cache, but the packages do not run standalone (they depend on other Nix packages). The patch however should apply to the vanilla QEMU sources, so you should be able to just recompile QEMU 6 with it.
I’m not aware of vsock developments. Unfortunately I’m not familiar with QEMU development internals at all. I just happened to have a need for 9p and worked on this patch.
Thanks!
The patch seems to have issues 😞 https://github.com/NixOS/nixpkgs/pull/122420#issuecomment-846365328
Until we can get virtio-9p-pci for macOS hosts into the QEMU upstream, I was planning to use 9P over virtserial, but it doesn't seem as easy as expected 😞
diff --git a/pkg/qemu/qemu.go b/pkg/qemu/qemu.go
index d9b0778..19f0f75 100644
--- a/pkg/qemu/qemu.go
+++ b/pkg/qemu/qemu.go
@@ -154,7 +154,7 @@ func Cmdline(cfg Config) (string, []string, error) {
// Parallel
args = append(args, "-parallel", "none")
- // Serial
+ // Legacy Serial
serialSock := filepath.Join(cfg.InstanceDir, "serial.sock")
if err := os.RemoveAll(serialSock); err != nil {
return "", nil, err
@@ -167,7 +167,17 @@ func Cmdline(cfg Config) (string, []string, error) {
args = append(args, "-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait,logfile=%s", serialChardev, serialSock, serialLog))
args = append(args, "-serial", "chardev:"+serialChardev)
- // We also want to enable vsock and virtfs here, but QEMU does not support vsock and virtfs for macOS hosts
+ // vport for 9p
+ vportSock := filepath.Join(cfg.InstanceDir, "vport.sock")
+ if err := os.RemoveAll(vportSock); err != nil {
+ return "", nil, err
+ }
+ const vportChardev = "char-vport"
+ args = append(args, "-device", "virtio-serial")
+ args = append(args, "-chardev", fmt.Sprintf("socket,id=%s,path=%s,server,nowait", vportChardev, vportSock))
+ args = append(args, "-device", fmt.Sprintf("virtserialport,chardev=%s,name=lima", vportChardev))
+
+ // TODO: use virtio-9p-pci when QEMU supports it for macOS hosts
// QEMU process
args = append(args, "-name", "lima-"+cfg.Name)
package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	if err := xmain(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

func xmain() error {
	// Open the virtio serial port and mount a 9p filesystem over it,
	// using the same fd for both the read and the write side.
	devPath := "/dev/vport5p1"
	mntPath := "/mnt/foo"
	devFd, err := syscall.Open(devPath, syscall.O_RDWR|syscall.O_NONBLOCK, 0600)
	if err != nil {
		return err
	}
	return syscall.Mount(
		"", // no source device; the transport is the fd itself
		mntPath,
		"9p",
		0,
		fmt.Sprintf("trans=fd,rfdno=%d,wfdno=%d", devFd, devFd),
	)
}
# go run main.go
invalid argument
exit status 1
# dmesg
...
[27353.823239] kernel write not supported for file /vport5p1 (pid: 2243 comm: kworker/3:1)
The error is happening because the vport device somehow wires up both write_iter and write: https://github.com/torvalds/linux/blob/v5.11/fs/read_write.c#L540-L545
I guess I have to inject some pipe as a "shim" fd, or use ssh instead of virtserial.
virtserial seems slower than ssh :(
virtserial
qemu flags: -device virtio-serial -chardev socket,id=foo,path=foo.sock,server,nowait -device virtserialport,chardev=foo,name=foo
[host]$ socat file:urandom-1G unix-connect:foo.sock
[guest]$ time sha256sum /dev/virtio-ports/foo
real 0m10.022s
ssh
[host]$ cat urandom-1G | lima time sha256sum -
real 0m7.673s
virtserial seems slower than ssh
Given that we seem to be stuck with ssh, at least for now, did you check if selecting a different cipher makes any difference for throughput? It seems like we are using the default:
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
I remember reading that aes128-ctr was twice as fast as chacha20-poly1305@openssh.com when running on a CPU with AES instructions, but I'm not sure how much that affects throughput.
I'm somewhat confused though: even when I add -c aes128-ctr to the ssh command, it still ends up using chacha20-poly1305@openssh.com, so I don't know how to configure it.
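One thing that might be worth trying (untested, and the host alias here is illustrative): set the cipher per host in ~/.ssh/config, in case lima's own ssh arguments take precedence over -c while the per-host user config is still honored:

```
Host lima-default
  Ciphers aes128-ctr
```

If it takes effect, the debug1: kex lines should report aes128-ctr instead of the default.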
Given that we seem to be stuck with ssh, at least for now, did you check if selecting a different cipher makes any difference for throughput?
Haven't looked into it yet. Help wanted :pray:
Btw, QEMU's built-in Samba sharing seems roughly two times faster than sshfs:
[root@lima-archlinux ~]# time sha256sum /mnt/smb/ubuntu-21.04-desktop-amd64.iso
fa95fb748b34d470a7cfa5e3c1c8fa1163e2dc340cd5a60f7ece9dc963ecdf88 /mnt/smb/ubuntu-21.04-desktop-amd64.iso
real 0m15.578s
user 0m7.054s
sys 0m6.259s
[root@lima-archlinux ~]# time sha256sum /tmp/lima/ubuntu-21.04-desktop-amd64.iso
fa95fb748b34d470a7cfa5e3c1c8fa1163e2dc340cd5a60f7ece9dc963ecdf88 /tmp/lima/ubuntu-21.04-desktop-amd64.iso
real 0m29.707s
user 0m9.244s
sys 0m6.370s
diff --git a/pkg/qemu/qemu.go b/pkg/qemu/qemu.go
index 796c6ee..1524a24 100644
--- a/pkg/qemu/qemu.go
+++ b/pkg/qemu/qemu.go
@@ -193,7 +193,7 @@ func Cmdline(cfg Config) (string, []string, error) {
// CIDR is intentionally hardcoded to 192.168.5.0/24, as each of QEMU has its own independent slirp network.
// TODO: enable bridge (with sudo?)
args = append(args, "-net", "nic,model=virtio")
- args = append(args, "-net", fmt.Sprintf("user,net=192.168.5.0/24,hostfwd=tcp:127.0.0.1:%d-:22", y.SSH.LocalPort))
+ args = append(args, "-net", fmt.Sprintf("user,net=192.168.5.0/24,hostfwd=tcp:127.0.0.1:%d-:22,smb=/tmp/lima", y.SSH.LocalPort))
// virtio-rng-pci acceralates starting up the OS, according to https://wiki.gentoo.org/wiki/QEMU/Options
args = append(args, "-device", "virtio-rng-pci")
Needs https://github.com/Homebrew/homebrew-core/pull/80171 to be merged (which may take time)
Samba is now available on homebrew (https://github.com/Homebrew/homebrew-core/blob/master/Formula/samba.rb), so I'm planning to replace sshfs with samba soon.
Samba will be executed with slirp_add_exec: https://github.com/qemu/qemu/blob/4cc10cae64c51e17844dc4358481c393d7bf1ed4/net/slirp.c#L964
So Samba will not listen on any actual TCP port.
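For reference, mounting such a share from the guest would look something like this (a sketch: QEMU's slirp places its built-in SMB server at the .4 address of the slirp subnet and exports the directory under the share name qemu, so with net=192.168.5.0/24 that should be 192.168.5.4):

```
sudo mount -t cifs //192.168.5.4/qemu /mnt/smb -o guest
```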
cc @jandubois
@AkihiroSuda That is awesome to hear. From the benchmarks I have seen so far, it is supposedly 2x faster for most workloads.
What would it take for us to support inotify events?
It would be interesting to see some benchmarks between sshfs and virtio-9p-pci and nfs and cifs/smb...
The virtfs authors seemed convinced: https://www.kernel.org/doc/ols/2010/ols2010-pages-109-120.pdf
On Linux, QEMU using virtfs is a lot faster than sshfs and even 9p. If anything is slowed down on macOS due to kernel lockdown/security, don't take it as a given for Linux users!
Please always keep Linux users in mind, since lima itself is becoming more and more a tool for cross-platform development (macOS, Linux, WSL/Windows). Especially k8s/k3s is a really nice scenario! Don't make your decisions based on macOS only! Thank you!
Docker Desktop on Windows uses 9p (they switched away from the SMB sharing they used before).
Docs for Linux: https://www.linux-kvm.org/page/9p_virtio
Also check out how wimpy uses hardware acceleration and the virt* kernel modules on Linux with QEMU 6.x (yup, 6.0+!): https://github.com/wimpysworld/quickemu/blob/master/quickemu
It's not Docker per se, but WSL2 that uses 9P for any WSL VM: https://devblogs.microsoft.com/commandline/a-deep-dive-into-how-wsl-allows-windows-to-access-linux-files/
I've been testing out WSL2 integration for other team members, and the filesystem performance is still not at a point where it's usable for large projects. For example, PHPStorm hangs and crashes trying to index a basic Drupal 9 codebase over the \\wsl$ share: https://github.com/drud/ddev/issues/3366
To date, mutagen (which is built into ddev) is giving our team the best experience. That work came about because Docker Desktop dropped their alpha mutagen support, and I think it would be really nice if support for it was at the lima (or colima) layer instead of at the next layer up.
I'm not sure it's worth the development effort to only get a relatively small performance improvement. If performance is any more than 20-50% slower than native IO, it's still too slow for most of our uses.
What would be needed to integrate mutagen? It seems like it should work out of the box, over the ssh connection?
For docker-machine, the support was more "hands-on": the user had to run scp and rsync and sshfs themselves...
Mutagen works, if you know your ssh and mutagen stuff.
I added an entry to ~/.ssh/config to disable host key check and specify the port, to make it easier to interact with rsync/mutagen:
Host lima
  Hostname localhost
  Port 60022
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
Then you can use a command like this to create a sync of the current working directory:
lima mkdir -p $PWD # create the needed folders
lima chmod 777 $PWD $(dirname $PWD) # mutagen will write a folder in the parent directory, so you need write access there too.
mutagen create \
--sync-mode two-way-resolved \
--stage-mode=neighboring \
--ignore-vcs \
--symlink-mode posix-raw \
--default-file-mode 0666 \
--default-directory-mode 0777 \
--name "$(basename $PWD)" \
$PWD lima:$PWD
mutagen monitor "$(basename $PWD)" # to watch your sync progress
Then stuff like lima nerdctl run -v $PWD:$PWD -w $PWD ubuntu ls will just run as expected. Just make sure your paths in the vm and on your machine are identical.
This could be a separate "mount" type, perhaps ? Similar to the current shorthand for setting up sshocker (reverse sshfs)
I didn't understand the part about the paths needing to be the same on both sides, but I will have to try it myself. (EDIT: it doesn't)
It could, but... I feel like it won't be intuitive or stable. There are issues with very large numbers of files (multiple projects), write privileges, the number of inotify watches on Linux, conflict resolution, etc. I can manually enable syncs as needed, but I'm not too sure it would be a great fit as a solution in the config file.
But maybe I'm also not the average user ;)
Docker for Mac had mutagen once as well: https://github.com/docker/for-mac/issues/1592#issuecomment-678397258 Their implementation was pretty good; it was integrated into the UI and showed sync progress etc. But you were on your own if it failed, and it used a lot of resources when syncing larger directories.
At least you can use limactl show-ssh --format config default to set up the config, for host "lima-default" (etc).
To me it actually makes more sense to set up such syncs as needed, than to export your entire home directory...
Unfortunately it seems like mutagen doesn't support a separate ssh config file (yet?), so the entry needs to go in the user ssh_config.
- Issue https://github.com/mutagen-io/mutagen/issues/301
Simple enough, though.
- Set up a ssh "host"
limactl show-ssh --format config default >>~/.ssh/config
$ ssh lima-default uname
Linux
$ ssh lima-default grep PRETTY /etc/os-release
PRETTY_NAME="Ubuntu 21.10"
- Set up a mutagen "sync"
mutagen sync create /tmp/test lima-default:/tmp/test
$ mutagen sync list
--------------------------------------------------------------------------------
Identifier: sync_WxCidKXaUaHGgkXXkRWZp5UBANGKpZlJ8JmJ0VgKepQ
Labels: None
Alpha:
URL: /tmp/test
Connection state: Connected
Beta:
URL: lima-default:/tmp/test
Connection state: Connected
Status: Watching for changes
--------------------------------------------------------------------------------
Works as advertised.
So if it was included, the YAML file would look something like:
mounts:
- location: "~"
writable: false
- location: "/tmp/lima"
writable: true
mutagen:
- location: "/tmp/test"
mode: "two-way-safe"
anders@lima-default:/tmp$ findmnt -T lima
TARGET SOURCE FSTYPE OPTIONS
/tmp/lima :/tmp/lima fuse.sshfs rw,nosuid,nodev,relatime,user_id=1000,group_id=1000,allow_other
anders@lima-default:/tmp$ findmnt -T test
TARGET SOURCE FSTYPE OPTIONS
/ /dev/vda1 ext4 rw,relatime,discard,errors=remount-ro
With some arbitrary mapping of the mutagen YAML configuration:
https://mutagen.io/documentation/introduction/configuration#configuration-files
$ mutagen sync list --long
--------------------------------------------------------------------------------
Identifier: sync_WxCidKXaUaHGgkXXkRWZp5UBANGKpZlJ8JmJ0VgKepQ
Labels: None
Configuration:
Synchronization mode: Default (Two Way Safe)
Maximum allowed entry count: Default (2⁶⁴−1)
Maximum staging file size: Default (18 EB)
Symbolic link mode: Default (Portable)
Ignore VCS mode: Default (Propagate)
Ignores: None
Alpha configuration:
URL: /tmp/test
Watch mode: Default (Portable)
Watch polling interval: Default (10 seconds)
Probe mode: Default (Probe)
Scan mode: Default (Accelerated)
Stage mode: Default (Mutagen Data Directory)
File mode: Default (0600)
Directory mode: Default (0700)
Default file/directory owner: Default
Default file/directory group: Default
Beta configuration:
URL: lima-default:/tmp/test
Watch mode: Default (Portable)
Watch polling interval: Default (10 seconds)
Probe mode: Default (Probe)
Scan mode: Default (Accelerated)
Stage mode: Default (Mutagen Data Directory)
File mode: Default (0600)
Directory mode: Default (0700)
Default file/directory owner: Default
Default file/directory group: Default
Alpha:
Connection state: Connected
Beta:
Connection state: Connected
Status: Watching for changes
--------------------------------------------------------------------------------
My two cents as the developer of Mutagen:
Mutagen works well if you have access to its full configuration and you can tweak it to your needs (e.g. via manual usage or Mutagen Compose) or if you're using it as part of some higher-level tooling that configures it in a way where it works for 99% of users automatically (e.g. the way ddev does).
It doesn't work well as a general purpose caching mechanism, and that was what really hampered the Docker for Mac integration (i.e. the simple on/off switch design and inability to see what's going on internally).
The inotify issue is something I'm working on addressing via use of fanotify and its enhancements in Linux 5.1+, but it's probably going to be restricted to very controlled environments (e.g. the sidecar container used by Mutagen Compose) because it requires CAP_SYS_ADMIN and CAP_DAC_READ_SEARCH and has somewhat odd path handling behavior with bind mounts.
Conflicts and handling of problematic content have been significantly improved in Mutagen v0.12, so hopefully that's less of an issue.
Anyway, I won't interrupt the discussion further, but feel free to reach out via email ([email protected]) or Slack if you have any questions.
My idea was to leave it as opt-in, just wanted it easier to do... I don't think it should be bundled, more treated as a tool (like rsync)
rsync -av /tmp/test/ lima-default:/tmp/test/
rsync -av lima-default:/tmp/test/ /tmp/test/
scp /tmp/test/foo lima-default:/tmp/test/foo
scp lima-default:/tmp/test/bar /tmp/test/bar
- Just sync filesystem state between VM and host.
The two biggest lima improvements are probably 1) acknowledging that the VM even exists, and 2) adding it as an ssh "host".
That opens up for a lot of different use cases, where the "built-in" mounts fall short. As all of them will do, eventually.
- Is there / will there be a binary with the patch that can be installed without Nix? For both Intel and ARM. (Would it be possible to cross-compile the ARM target on an Intel host?)
@AkihiroSuda should we consider using go:embed to distribute static qemu-system-ARCH binaries?
Of course this would require a Workflow to maintain builds, but it would allow us to
- Remove the dependency on having qemu installed
- Apply the 9p support patch to Darwin builds
I don't think we should use go:embed, but we can consider putting patched QEMU binaries under /usr/local/share/lima, if it is acceptable for Homebrew/MacPorts/nix.
What is the difference between virtio-fs, virtio-9p, and virtio-blk? And which is simply called virtio?