(WIP) ZFS pool driver draft
This PR aims to introduce two new qvm-pool pool modules with this interface:
[user@dom0 ~]$ qvm-pool --help-drivers
DRIVER OPTIONS
zfs_encrypted zpool_name, ask_password_domain, unload_timeout
zfs_zvol block_devices
- `zfs_zvol`, which implements VM storage as ZFS `zvol` volumes. In the current implementation it also handles `zpool` creation on each `block_devices` parameter, since that was the easiest way to get started from a clean slate when testing. If there are valid use-case arguments for splitting that off into a third module, that could be done (i.e. if anyone thinks it would be useful to implement a `zvol` pool on top of an existing dataset).
- `zfs_encrypted`, which largely inherits from `zfs_zvol`, but works on top of encrypted datasets inside `zpool_name` that act as "encryption key groups". The keys are optionally unloaded after a pool inactivity timeout (`unload_timeout`), and are loaded by passing a raw file descriptor, connected to the QRexec service `qubes.AskPassword` in a target domain, to an invocation of `zfs load-key {dataset}`.
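Roughly, the unlock flow of `zfs_encrypted` corresponds to the commands below. This is a hand-written approximation rather than the driver's literal code (the driver hands the qrexec file descriptor straight to `zfs load-key` instead of going through a shell variable), and the `keyformat`/`keylocation` choices are my assumptions; the dataset names match the listing further down.

```sh
# create the "key group" dataset that all of the pool's datasets/zvols live under
zfs create -o encryption=aes-256-gcm -o keyformat=passphrase \
    -o keylocation=prompt -o mountpoint=none qbs/encryption/qbsdev

# on first use: ask qubes.AskPassword in ask_password_domain for the passphrase
# and feed it to the ZFS key loading machinery
printf '%s' "$passphrase" | zfs load-key qbs/encryption/qbsdev

# after unload_timeout seconds of pool inactivity, drop the key again
zfs unload-key qbs/encryption/qbsdev
```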
When used, this results in datasets and zvols in this form:
$ zfs list -o name,type,encryption
qbs/encryption/qbsdev filesystem aes-256-gcm
qbs/encryption/qbsdev/import filesystem aes-256-gcm
qbs/encryption/qbsdev/import/zfsdev filesystem aes-256-gcm
qbs/encryption/qbsdev/vm filesystem aes-256-gcm
qbs/encryption/qbsdev/vm/zfsdev filesystem aes-256-gcm
qbs/encryption/qbsdev/vm/zfsdev/private volume aes-256-gcm
qbs/encryption/qbsdev/vm/zfsdev/volatile volume aes-256-gcm
qbs/vm filesystem none
qbs/vm/plainbrowser filesystem none
qbs/vm/plainbrowser/private volume none
qbs/vm/plainbrowser/volatile volume none
Adding those was a matter of:
$ qvm-pool -a qbs zfs_zvol -o 'block_devices=/dev/loopMYFILE'
$ qvm-pool -a qbsdev zfs_encrypted -o 'zpool_name=qbs'
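For reference, the full set of steps on my test machine looks roughly like this; the image path, loop device and VM/template names are purely illustrative:

```sh
# back the pool with a sparse file exposed as a loop device
sudo truncate -s 100G /var/lib/qubes/zfs-test.img
sudo losetup -f --show /var/lib/qubes/zfs-test.img    # prints e.g. /dev/loop0

qvm-pool -a qbs zfs_zvol -o 'block_devices=/dev/loop0'
qvm-pool -a qbsdev zfs_encrypted -o 'zpool_name=qbs'

# place a new VM's volumes in the encrypted pool
qvm-create -P qbsdev --class AppVM --template fedora-30 --label red zfsdev
```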
For the purpose of key entry I am currently using this implementation of /etc/qubes-rpc/qubes.AskPassword in the ask_password_domain, which should probably be reworked (remember to chmod +x):
#!/bin/sh
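# qubes.AskPassword: read the request subject (the dataset name) from the caller,
# prompt the user with zenity, and print the entered passphrase on stdout.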
read -r -N 100 -s -t 5 subject
# TODO why does this not work, does it need to strip newline?
# might be fixed now, according to @marmarek's suggestion in comment below:
# [ "${subject}" = "${subject//[^-/_a-z:A-Z0-9]}" ] || exit 1
export DISPLAY="${DISPLAY:-:0}"
zenity --forms \
--text "${QREXEC_REMOTE_DOMAIN}:qubes.AskPassword" \
--add-password "Password for ${subject//[^a-zA-Z]}" 2>>/tmp/askpw.log
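To test the service locally inside the ask_password_domain, without going through qrexec, something like this should pop up the dialog and echo the passphrase back (QREXEC_REMOTE_DOMAIN is normally set by qrexec, here we just fake it):

```sh
printf '%s' qbs/encryption/qbsdev | \
    QREXEC_REMOTE_DOMAIN=dom0 /etc/qubes-rpc/qubes.AskPassword
```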
This requires having ZFS installed from zfs-on-linux with the pyzfs3 module built.
To get that going (in addition to their instructions) I needed:
sudo qubes-dom0-update kernel-devel git sysstat bc python3-devel
./autogen.sh && \
./configure --with-config=srpm --enable-pyzfs && \
make -j3 pkg-utils rpm-dkms && \
sudo dnf install ./libuutil1-0.8.0-1.qbs4.0.x86_64.rpm \
./libzfs2-0.8.0-1.qbs4.0.x86_64.rpm \
./libnvpair1-0.8.0-1.qbs4.0.x86_64.rpm \
./python3-pyzfs-0.8.0-1.qbs4.0.noarch.rpm \
./zfs-0.8.0-1.qbs4.0.x86_64.rpm
# not sure if we need these:
sudo dnf install ./libzpool2-0.8.0-1.qbs4.0.x86_64.rpm \
./libzfs2-devel-0.8.0-1.qbs4.0.x86_64.rpm
# install DKMS kernel module, this will take a while since it needs
# to compile a module for each dom0 kernel currently installed.
# may want to bring down the number of kernels:
# rpm -qa kernel
# sudo dnf remove kernel-4.14.74-1.pvops.qubes.x86_64
# pay attention that we want the "noarch" and not the "src" rpm:
sudo dnf install ./zfs-dkms-0.8.0-1.qbs4.0.noarch.rpm
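Afterwards I'd sanity-check that the module loads and that pyzfs is importable; the `libzfs_core` module name and `lzc_exists` function are from the pyzfs documentation, if I remember correctly:

```sh
sudo modprobe zfs
sudo zpool status        # "no pools available" is fine on a fresh install
python3 -c 'import libzfs_core; print(libzfs_core.lzc_exists)'
```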
This is merely a draft, as evidenced by the many TODO comments, but it does "work", and I thought it might be useful to post here for review, in case someone wants to collaborate on this with me.
NOTABLY:
- all the volume `_import` functions are probably horribly broken
- the size reporting does not seem 100% accurate / implemented for all cases
- currently no snapshotting takes place, even though there is some scaffolding in place for that
- the "default" options for zpools/zvols/datasets are not uniformly applied
- the suggestions for updates to the documentation are probably not 100% correct either
ping:
- @Rudd-O who also seems to dabble in Qubes + ZFS
- @tykling who provided invaluable help deciphering ZFS options, and might have further suggestions
- @marmarek who provided a ton of help and suggestions re: the inner workings of Qubes and `qubes-core-admin`
# TODO why does this not work, does it need to strip newline?
# [ "${subject}" = "${subject}//[^-/_a-z:A-Z0-9]}" ] || exit 1
Extra } in the middle.
Thank you for the review, I will try to get around to addressing this at some point and keep this updated.
This PR is relevant to the discussion at https://github.com/QubesOS/qubes-issues/issues/1293 (I forgot to mention that in the brief).
I'm extremely interested in this. Do you need funding to get this to pass tests and work? How much would it take to pique your interest?
@Rudd-O I am working on this on my own time (and am using this patch set as-is on my laptop), but if you are able to eventually allocate funds to the Qubes project towards @marmarek's time spent reviewing and answering questions about the Qubes APIs, that would make me feel less guilty for taking up his time.
We also hit some sore spots re: the qvm-pool command-line interface that it would be great to have someone clean up.
@marmarek would you like to discuss funding for this specific bug?
(Lack of this is blocking me getting Qubes Network Server ported to Qubes 4.1 for a variety of reasons that would be super hard to explain shortly.)
I'm fine with reviewing it whenever you let me know it's ready for the next round of review.
If you'd like to help financially, contribute to our OpenCollective, so we can fund more people to offload other tasks from me.
Review comment here. The structure of the datasets is not quite what it should be. According to the inter-distro working group on getting ZFS on root file systems, the structure should be more or less like this:
rpool/ROOT/qubes     <- the OS goes here, following the pattern
                        root pool name, ROOT, OS name (ubuntu, fedora)
rpool/BOOT/qubes     <- if you'll use /boot in ZFS, not sure its status
rpool/USERDATA/VMs   <- this can be named whatever you want, doesn't even
                        have to be on rpool, but a good default would
                        obviously be called VMs or vms or vm
rpool/USERDATA/VMs/plainbrowser
rpool/USERDATA/VMs/plainbrowser/private
rpool/USERDATA/VMs/plainbrowser/volatile
The spec appears to be what will be followed by Fedora as well.
Ref: https://docs.google.com/document/d/1m9VWOjfmdbujV4AQWzWruiXYgb0W4Hu0D8uQC_BWKjk/edit
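Creating that container hierarchy by hand is just a couple of unmounted datasets, e.g. (a sketch, property choices up to you):

```sh
zfs create -o mountpoint=none rpool/USERDATA
zfs create -o mountpoint=none rpool/USERDATA/VMs
```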
@marmarek 1) how do I contribute to the OpenCollective? 2) how do I earmark contributions for this project?
@Rudd-O to clarify, which datasets do you propose prefixing with /USERDATA/ ? The encrypted ones, when adding to an existing pool?
EDIT: Did you link to the right document? This one seems to be an unpublished draft detailing zfs-on-root for Ubuntu systems (neither of which applies here), and there's no mention of an inter-distro working group?
EDIT 2: I didn't have a use-case for it, but adding an unencrypted wrapper in addition to the encrypted one, and allowing both to take arguments like prefix=/USERDATA,zpool=rpool should be easy enough if that would work for you?
Not sure I understand. Why do we need unencrypted anything?
@marmarek 1) how do I contribute to the OpenCollective?
https://opencollective.com/qubes-os
2) how do I earmark contributions for this project?
You can add a comment when you donate, but read also this: https://www.qubes-os.org/donate/#do-donors-get-personalized-support-or-special-feature-requests
@Rudd-O I'm assuming that people are using full-disk encryption, so additional encryption keys to enter for e.g. your VM templates or your jazz music collection might be overkill UX-wise.
That was my assumption too — FDE or no game. I know people want to have filesystem-level encryption, but that discloses too much — even the names of the datasets are not okay for adversaries to know.
https://www.qubes-os.org/donate/#do-donors-get-personalized-support-or-special-feature-requests
Huge bummer. I would have loved to earmark the donation to this specific feature. Lemme know when the policy changes so I can negotiate what the developers need to get this through.
@DemiMarie Thank you for your review. I've since split the code out into its own package as advised by @marmarek, so this particular PR is abandoned. I've used this module as the primary pool driver on my personal laptop and have been quite happy with it.
You can find the revised code at: https://github.com/cfcs/qubes-storage-zfs
Twice I had to boot from a recovery disk because upstream API changes broke qubesd to the point where nothing worked. Copying changed code from a development vm to dom0 without mounting the file system turns out to not be easy when qubesd is broken. I've also had to manually edit the kernel header files because they either didn't compile (???) or were incompatible with the ZFS DKMS modules (that's to be expected I guess). So not quite ready for the primetime yet.
@DemiMarie Thank you for your review. I've since split the code out into its own package as advised by @marmarek, so this particular PR is abandoned. I've used this module as the primary pool driver on my personal laptop and have been quite happy with it.
You’re welcome! Even if you don’t intend to get this merged, it might be useful to have an open PR for code review purposes.
You can find the revised code at: https://github.com/cfcs/qubes-storage-zfs
Would you be interested in submitting it as a contrib package? I understand that it isn’t ready for prime time yet, but if/when that changes, this would allow users to obtain it via qubes-dom0-update. This would most likely require adding the DKMS package first, if I understand correctly.
Twice I had to boot from a recovery disk because upstream API changes broke `qubesd` to the point where nothing worked. Copying changed code from a development vm to dom0 without mounting the file system turns out to not be easy when `qubesd` is broken. I've also had to manually edit the kernel header files because they either didn't compile (???) or were incompatible with the ZFS DKMS modules (that's to be expected I guess). So not quite ready for the primetime yet.
I have actually had to manually edit stuff in dom0 before (mostly qubes.xml). So I understand! I wonder if the compilation errors are due to dom0 having an old version of GCC.
One other question: what are the advantages of this over the BTRFS pool? RAID-Z is the only one that comes to my mind, but there could easily be others I am missing.
@DemiMarie Thank you for your review. I've since split the code out into its own package as advised by @marmarek, so this particular PR is abandoned. I've used this module as the primary pool driver on my personal laptop and have been quite happy with it.
You’re welcome! Even if you don’t intend to get this merged, it might be useful to have an open PR for code review purposes.
You can find the revised code at: https://github.com/cfcs/qubes-storage-zfs
Would you be interested in submitting it as a contrib package? I understand that it isn’t ready for prime time yet, but if/when that changes, this would allow users to obtain it via `qubes-dom0-update`. This would most likely require adding the DKMS package first, if I understand correctly.
That is the hope, that it can end up as a contrib package.
Like I said I've been using it on my personal Qubes installations since I posted this, and it's worked pretty well for me, but there are still some gotchas that need to be hashed out, and I'd like to have it better documented as well.
I'm kind of worried about releasing it as a contrib package and then having people unable to boot their VMs due to some kind of DKMS issue. Right now, if something goes wrong, all of qubesd stops working, which means file copy to/from dom0 and copy-paste stop working too, so it can be a grueling experience to recover from that situation, even if there's an upstream fix. I'd like to find a solution to that particular problem before encouraging other people to use it for anything but test systems.
Twice I had to boot from a recovery disk because upstream API changes broke `qubesd` to the point where nothing worked. Copying changed code from a development vm to dom0 without mounting the file system turns out to not be easy when `qubesd` is broken. I've also had to manually edit the kernel header files because they either didn't compile (???) or were incompatible with the ZFS DKMS modules (that's to be expected I guess). So not quite ready for the primetime yet.

I have actually had to manually edit stuff in dom0 before (mostly `qubes.xml`). So I understand! I wonder if the compilation errors are due to dom0 having an old version of GCC.
Not sure, but compiler and header files being out of sync does seem like a probable reason. It's been a while since I had anything as drastic as that happen though.
One other question: what are the advantages of this over the BTRFS pool? RAID-Z is the only one that comes to my mind, but there could easily be others I am missing.
Main feature over BTRFS is encryption, so your incremental backups are encrypted at rest and their integrity can be validated on the backup system without loading the key. Main drawback compared to BTRFS is that ZFS lacks a COW file cloning system call on Linux (BTRFS has that). Not really of big consequence for this particular codebase, since most of the magic happens with zvols, which do support COW cloning, but that's my impression of the overall state -- otherwise they're fairly similar. The ability to import ZFS data on FreeBSD systems is also an appealing factor.
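To make the backup point concrete, with the datasets from the first comment (the snapshot names, backup host and receiving pool are made up for the example):

```sh
# incremental, still-encrypted stream; the receiving side can scrub/verify it
# without ever loading the key
zfs snapshot qbs/encryption/qbsdev/vm/zfsdev/private@backup1
zfs send --raw -i @backup0 qbs/encryption/qbsdev/vm/zfsdev/private@backup1 | \
    ssh backuphost zfs receive backuppool/zfsdev-private

# and the COW cloning we get on zvols regardless of the missing file-level reflink
zfs clone qbs/vm/plainbrowser/private@clean qbs/vm/newvm/private
```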
@cfcs : was the effort abandoned? Have you seen #7009 bounty offer for exactly this work/from @Rudd-O ?
zpool live deduplication is a big advantage over btrfs from what I've been reading. Please feel free to shed some light in the ongoing discussion of pool deduplication, here: https://forum.qubes-os.org/t/pool-level-deduplication/12654
https://github.com/cfcs/qubes-storage-zfs
If this were released as a contrib package with the to-do items checked off such that I can trust my own systems with it, I would honor my promise of the bounty I offered before in #7009.
@tlaurion (EDIT)
@cfcs : was the effort abandoned? Have you seen #7009 bounty offer for exactly this work/from @Rudd-O ?
Ah, no, I had not seen that.
In the sense that I use it every day on my own machines, the project is not abandoned, but it hasn't seen meaningful progress for a while now. This PR is effectively abandoned, since I deemed the approach @DemiMarie suggested of a plugin/contrib package better than merging it into qubes-core-admin (see @Rudd-O's link to my package); but the contrib package does work, and I update it when I run into things that annoy me.
Judging by the helpful issue posts on that repo, @DemiMarie has also taken it for a spin, but I don't know of any other users / testers.
There is also a fork by @ayakael here https://github.com/ayakael/qubes-storage-zfs that seems to do something with qubes-builder. I don't think they ever sent any pull requests, but I've picked up some of the patches and it's probably worth having a look at their changes if you're going to dig into this project.
It feels to me like there's some way to go before this would be something I'd recommend to people who are not comfortable fixing DKMS build errors; I've personally had a number of issues with the kernel headers/kernel being out of sync with the fc32 DKMS packages for ZFS. Basically you need to plan 1-2 days of downtime whenever you type qubes-dom0-update, and I don't have a good solution to that problem.
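The closest thing to a workaround I can think of is holding kernel packages back until the matching DKMS build is known to work, e.g. via a dnf exclude in dom0. This is an untested sketch with assumed package names (check `rpm -qa 'kernel*'`), and I haven't verified how it interacts with qubes-dom0-update's UpdateVM download path:

```
# /etc/dnf/dnf.conf in dom0
[main]
exclude=kernel kernel-devel kernel-latest kernel-latest-devel
```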
To sum up there are some things to deal with before I think it makes sense to offer this to non-expert users:
- Figuring out a stable update story that doesn't cause massive pain on a daily basis (hard).
- Fixing TODOs documented in the README (probably not so hard).
- Reading up on the Qubes 4.0 -> 4.x changelogs and figuring out how the APIs are intended to be used. At the moment there's a number of things that can go wrong, and when they do go wrong, you need to manually stop&start `qubesd`. Potentially this can be solved by restructuring the code to do the "dangerous" stuff in places where `qubesd` presently expects exceptions; potentially this will require adding some `try..except` to `qubes-core-admin`. I felt figuring this out on my own without pestering the upstream Qubes devs was a hard task.
- Finding a solution to some of the other problems, like the one where the COW updates to the template's storage (that are thrown away on next reboot anyway) are written to the template's storage instead of the ZFS storage. This is probably also hard without spending time with upstream devs, or a lot of time reading through the code and documentation.
I'd be happy to review and merge patches to my repo, and I'd also be fine with someone forking it into their own contrib package, copying what they like and deleting the rest. As I've explained before, packaging is not really my strong side, so I doubt I will be doing any of that on my own in the near future. I basically did a lot of the easy lifting on this project to achieve what I wanted, and now I have a system that works for me, but I left the heavy lifting of polishing it up and making it mainstream-deployable as an exercise for the reader. I would be delighted to find such a reader here. :-)
Judging by the helpful issue posts on that repo, @DemiMarie has also taken it for a spin, but I don't know of any other users / testers.
I actually haven’t. All of my suggestions have come from manual review of the code.
- Figuring out a stable update story that doesn't cause massive pain on a daily basis (hard).
The best approach I can think of is to ship a DKMS package that accompanies this package. Fedora 32 is EOL so I am not surprised you are having problems with its DKMS package.
- Reading up on the Qubes 4.0 -> 4.x changelogs and figuring out how the APIs are intended to be used. At the moment there's a number of things that can go wrong, and when they do go wrong, you need to manually stop&start `qubesd`. Potentially this can be solved by restructuring the code to do the "dangerous" stuff in places where `qubesd` presently expects exceptions; potentially this will require adding some `try..except` to `qubes-core-admin`. I felt figuring this out on my own without pestering the upstream Qubes devs was a hard task.
This is something that definitely needs to be worked on upstream. Writing a third-party storage driver should not be so difficult. I myself had a LOT of problems adding ephemeral volume encryption and ensuring that storage is cleaned up at system boot, so better documentation/cleaner APIs/etc would be amazing.
Feel free to pester me with any questions you have, so that I can improve the documentation.
- Finding a solution to some of the other problems, like the one where the COW updates to the template's storage (that are thrown away on next reboot anyway) are written to the template's storage instead of the ZFS storage. This is probably also hard without spending time with upstream devs, or a lot of time reading through the code and documentation.
This is also a problem with all of the other storage drivers, sadly.
@tlaurion I'm in about the same place as @cfcs as I use this driver every day, but haven't yet taken the time to clean it up. I did make a spec file, which you'll find in my fork. I did enough work to integrate building the zfs drivers within my qubes-builder setup, and I also build an updated version of zfs and zfs-dkms that hooks into my build pipeline. While I'm a package maintainer on other platforms, my knowledge of RPM isn't quite where it needs to be for me to be comfortable pushing anything upstream. I would like to find time to clean this up, though.
I would like to collaborate with y'all to get this in a good state, so don't hesitate to push tasks my way.
By the way, I push my built packages to the following repo: https://repo.gpg.nz/qubes/r4.1. I can't make any guarantees, though.
@DemiMarie
Judging by the helpful issue posts on that repo, @DemiMarie has also taken it for a spin, but I don't know of any other users / testers.
I actually haven’t. All of my suggestions have come from manual review of the code.
Ah, okay! Well, thanks for doing that!
- Figuring out a stable update story that doesn't cause massive pain on a daily basis (hard).
The best approach I can think of is to ship a DKMS package that accompanies this package. Fedora 32 is EOL so I am not surprised you are having problems with its DKMS package.
Yes, iirc I'm using fc34 or fc35, but the real issue is that kernel and kernel-sources and the DKMS modules are not updated in lock-step, and it would probably go a long way to have binary ZFS module packages instead of relying on compiling C code in each user's dom0.
- Reading up on the Qubes 4.0 -> 4.x changelogs and figuring out how the APIs are intended to be used. At the moment there's a number of things that can go wrong, and when they do go wrong, you need to manually stop&start `qubesd`. Potentially this can be solved by restructuring the code to do the "dangerous" stuff in places where `qubesd` presently expects exceptions; potentially this will require adding some `try..except` to `qubes-core-admin`. I felt figuring this out on my own without pestering the upstream Qubes devs was a hard task.

This is something that definitely needs to be worked on upstream. Writing a third-party storage driver should not be so difficult. I myself had a LOT of problems adding ephemeral volume encryption and ensuring that storage is cleaned up at system boot, so better documentation/cleaner APIs/etc would be amazing.
Feel free to pester me with any questions you have, so that I can improve the documentation.
Thank you! I will take you up on that offer next time I nerdsnipe myself into working on this project.
- Finding a solution to some of the other problems, like the one where the COW updates to the template's storage (that are thrown away on next reboot anyway) are written to the template's storage instead of the ZFS storage. This is probably also hard without spending time with upstream devs, or a lot of time reading through the code and documentation.
This is also a problem with all of the other storage drivers, sadly.
That is great though, in a way: it means a solution can hopefully be found independently of the people interested in ZFS integration?
@ayakael
I'm in about the same place as @cfcs as I use this driver every day, but haven't yet taken the time to clean it up. I did make a spec file, which you'll find in my fork. I did enough work to integrate building the zfs drivers within my `qubes-builder` setup, and I also build an updated version of `zfs` and `zfs-dkms` that hooks into my build pipeline. While I'm a package maintainer on other platforms, my knowledge of RPM isn't quite where it needs to be for me to be comfortable pushing anything upstream. I would like to find time to clean this up, though.
Ah, that is cool. I think the best solution in terms of absolutely avoiding DKMS build errors would be to ship a custom dom0 kernel package that includes a ZFS module (potentially as a loadable module so it's not running if not needed). Is that what you do with your qubes-builder setup?
Thanks for chiming in, I was happy to learn that I have a user! :-)
@cfcs Indeed, the most foolproof solution is including zfs as a kernel module. That was my initial approach, but it was a lot of work maintaining a custom kernel, thus I moved to DKMS a while ago. My current approach is actually hooking backported versions of zfs and zfs-dkms into qubes-builder, which has avoided any DKMS issues. Thus, I do not rely on the out-of-date packages from Fedora's fc32 repo, but rather my own repo where the built zfs / zfs-dkms spec files (which were taken from up-to-date Fedora repos) are hosted.
Honestly I don't mind if this project doesn't deliver ZFS as part of the deliverables. I compile ZFS on my dom0s every time there is an update to either the kernel or ZFS (I run ZFS directly from master).
@cfcs: I would actually like to see this merged into Qubes core at some point, assuming @marmarek is okay with it. Even if we have problems shipping ZFS, I suspect that it should be possible to perform testing in CI.
@DemiMarie seconded. If CI needs to run integration tests, the zfs and zpool commands can totally be mocked with actual output from a system that had ZFS installed.
Honestly I don't mind if this project doesn't deliver ZFS as part of the deliverables. I compile ZFS on my dom0s every time there is an update to either the kernel or ZFS (I run ZFS directly from `master`).
It wouldn't be that much work for Qubes to have a backported zfs and zfs-dkms package available for the current dom0 repo. I suspect most of the errors @cfcs encounters are due to having to adapt the old zfs-dkms package from Fedora's EOL repo to current kernel versions shipped by Qubes.
@ayakael what they're saying here is that they don't want to use the zfs-dkms package due to compilation in dom0 being necessary. I use the zfs-dkms package I roll from pristine upstream sources and that works very well for me.