zfs initramfs, native encryption: add support for user-supplied key retrieval mechanism


Describe the feature you would like to see added to OpenZFS

I have an out of band, noninteractive mechanism for supplying a zfs native encryption key to zfs load-key, with keylocation set to prompt. I would like the decrypt_fs() function in /contrib/initramfs/scripts/zfs to support calling it instead of actually prompting.

I'm willing to write the code if you'll consider a PR. (It would be very little code; see below.)

How will this feature improve OpenZFS?

In addition to covering cases like #6556, it would really allow the user to supply the key using completely arbitrary mechanisms ZFS can remain blissfully oblivious to. Some examples:

* get it from DNS (presumably from a split-horizon domain that will only resolve when queried from the right client);
* get it by using a special public key to ssh into a server, where sshd forces execution of a script that outputs the key;
* wait for some other program, such as an interactive dropbear session, to write the key into a file;
* derive it from the TOC of a specific audio CD, using some key derivation function ("my server boots to Vivaldi");
* get it from a program that derives it from voice input to a microphone, or someone performing a specific jig on a dance game controller, or something equally esoteric.

Additional context

I think a good way to implement this would be to add a boolean-valued user property, say, com.sun:zfs-initramfs:user-decrypt-function. If this is set to true, then instead of prompting for the key, try to call decrypt_fs_user first and only fall back to prompting if that failed to load the key.

Also, in addition to /etc/zfs/zfs-functions, we'd source /etc/zfs/zfs-user-functions if it exists (which could then define decrypt_fs_user); and we'd also install that file in the initramfs, of course.
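
A rough sketch of how this flow might look in shell, assuming the hook is keyed on the user property suggested above. This is not the actual contrib/initramfs/scripts/zfs code: load_key_with_user_hook, prompt_for_key and the ZFS_USER_FUNCTIONS variable are hypothetical names, and treating a non-zero return from decrypt_fs_user as "failed to load the key" is an assumption.

```shell
# Hypothetical sketch only; names other than decrypt_fs_user and the
# property are illustrative and do not come from the real initramfs script.
ZFS_USER_FUNCTIONS="${ZFS_USER_FUNCTIONS:-/etc/zfs/zfs-user-functions}"

# Source the optional user file; it may define decrypt_fs_user().
if [ -r "$ZFS_USER_FUNCTIONS" ]; then
    . "$ZFS_USER_FUNCTIONS"
fi

# Stand-in for the existing interactive prompt logic.
prompt_for_key() {
    zfs load-key "$1"
}

# Try the user hook first when the dataset opts in via the user property,
# then fall back to prompting if the hook is missing or reports failure.
load_key_with_user_hook() {
    encryptionroot="$1"
    optin="$(zfs get -H -o value com.sun:zfs-initramfs:user-decrypt-function "$encryptionroot")"
    if [ "$optin" = "true" ] && command -v decrypt_fs_user >/dev/null 2>&1; then
        decrypt_fs_user "$encryptionroot" && return 0
    fi
    prompt_for_key "$encryptionroot"
}
```

With this shape, a dataset that does not set the property never reaches the user function at all, which is the cautious behaviour described above.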

Aug 09 '22 19:08 akorn

As another OpenZFS user who already uses a custom key retrieval mechanism, having this officially supported would be nice.

I have a suggestion for an alternate way to implement this (that probably requires fewer changes to the code):

On Fedora, ZFS initramfs key loading is done with the zfs-load-key.sh script, which is part of the ZFS dracut module. I'd guess that for other distros that don't use dracut to generate their initramfs there is an equivalent initramfs hook for ZFS key loading somewhere.

You can modify this script to implement key loading however you'd like, without the need to add a new dataset property. Personally, I modified mine to attempt to unseal a password-based key from the system's TPM and try to decrypt with it, and if that fails then fall back to asking for a user-input password. The annoying part is that any time a new zfs-dracut version is installed this file gets overwritten, so you have to manually redo the modifications to it.

My suggestion would be to have a custom key loading script in the same directory called zfs-load-key-custom.sh (that isn't overwritten on zfs-dracut updates), and have zfs-load-key.sh check for it: if it exists, run it to try to get the key, and if it fails, proceed as it normally would. The change in the zfs-load-key.sh script would be to have the _load_key_cb function start something like this:

_load_key_cb() {
    dataset="$1"

    ENCRYPTIONROOT="$(zfs get -Ho value encryptionroot "${dataset}")"
    [ "${ENCRYPTIONROOT}" = "-" ] && return 0

    [ "$(zfs get -Ho value keystatus "${ENCRYPTIONROOT}")" = "unavailable" ] || return 0

    # if custom key loading script exists then run it
    if [ -f "${moddir}/zfs-load-key-custom.sh" ]; then
        "${moddir}/zfs-load-key-custom.sh"  "${ENCRYPTIONROOT}"

        # return 0 if it was successful, otherwise continue
        [ "$(zfs get -Ho value keystatus "${ENCRYPTIONROOT}")" = "unavailable" ] || return 0
    fi

    <...>

Aug 10 '22 15:08 jkool702

Actually the property is not necessary -- we can call the user function for all encrypted datasets and it can just punt on the ones it doesn't know how to open. I added the property out of an abundance of caution, to make absolutely sure the user wanted their function called, but the code change is even smaller if we call the function unconditionally.
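
To illustrate what "punting" could look like when the function is called for every encrypted dataset, here is a minimal sketch. The pool name rpool, the RPOOL_KEYFILE variable and the fetch_key_for_rpool helper are all made up for the example; it also assumes the caller treats a non-zero return as "not handled, fall back to the normal path".

```shell
# Hypothetical helper: stands in for whatever out-of-band mechanism
# actually produces the key (here it just reads an assumed file).
fetch_key_for_rpool() {
    cat "${RPOOL_KEYFILE:-/run/rpool.key}"
}

# Called with $1 = encryption root. Handles only datasets it knows about
# and returns non-zero ("punts") for everything else.
decrypt_fs_user() {
    case "$1" in
        rpool|rpool/*)
            # The key is fed to zfs load-key on stdin; the pipeline's
            # exit status is that of zfs load-key.
            fetch_key_for_rpool | zfs load-key "$1"
            ;;
        *)
            return 1
            ;;
    esac
}
```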

I like the approach of sourcing the function (as opposed to running a script) better because it's more universal: the sourced function can have access to all the state and functions of the initramfs/scripts/zfs file, and it can still call an external script if it wants to.

I also agree it would be better to unify the initramfs and dracut approaches, but since I know very little about dracut, I can't volunteer to do it.

Aug 10 '22 16:08 akorn

I like the approach of sourcing the function (as opposed to running a script) better because it's more universal: the sourced function can have access to all the state and functions of the initramfs/scripts/zfs file, and it can still call an external script if it wants to.

I see the appeal in this, though you have to remember this code will be distributed to people who won't be intimately familiar with how to use it.

I'd argue that you don't really need that extra flexibility for loading ZFS keys - you only really need the name of the dataset encryption root you are loading the key for (which I admittedly had originally forgotten to pass to the zfs-load-key-custom.sh script in my previous comment - it's now been edited in).
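
As a sketch of how little such a script needs beyond its first argument, the example below borrows the ssh forced-command idea from the original request. Everything specific in it is an assumption for illustration: the KEY_HOST and KEY_IDENTITY variables, the default path and host name, and the presence of networking and an ssh client in the initramfs. It relies on zfs load-key accepting the key on standard input, as the issue describes for keylocation=prompt.

```shell
# Hypothetical body for zfs-load-key-custom.sh; all defaults are made up.
load_key_custom() {
    encryptionroot="$1"
    # -n: do not read local stdin; -T: no pty; BatchMode: never prompt.
    # The server side is assumed to force a command that prints the key.
    ssh -n -T -o BatchMode=yes \
        -i "${KEY_IDENTITY:-/etc/zfs/unlock_id}" \
        "${KEY_HOST:-keys@keyserver.example}" \
        | zfs load-key "$encryptionroot"
}

# A real script would end by calling:  load_key_custom "$1"
```

If the ssh step fails, zfs load-key receives no usable key, keystatus stays unavailable, and the calling code can carry on with its normal prompt.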

I'd also tend to argue that this flexibility comes at the cost of making it easier for someone who doesn't know exactly what they are doing to screw things up. In particular:

1. Using the correct function/variable names.

How do you inform users what they must name their custom key loading function? E.g., initramfs/scripts/zfs sources /etc/zfs/zfs-user-functions and then calls zfs_load_key_custom... how do users know that they must use a function of the name zfs_load_key_custom() {...} in /etc/zfs/zfs-user-functions?

On a related note, how do they know what variable names to use within said function without going through the initramfs/scripts/zfs file in detail? E.g., how do they know that the dataset that is having its key loaded is in the variable ${ENCRYPTIONROOT}?

You can't include a generic template of this file in zfs packages, as it would overwrite this file when the user updated zfs and they would lose their custom ZFS user functions. You could include a separate file at, say, /etc/zfs/zfs-user-function-templates strictly for templates that say what function names and variables to use, but this is much less straightforward for end users than "write a script to load their key, the dataset name is in "$1"".

2. Not overwriting functions / variables that must stay unmodified to successfully load zfs in the initramfs

Sort of the opposite of point 1 - users not only need to know what function names to use, they need to know which they can't use. Say that in /etc/zfs/zfs-user-functions someone (outside of a function) has BOOTFS=zfs. When the file is sourced, this would (I think) overwrite the BOOTFS variable that has the root ZFS dataset name, which would in turn screw up everything related to mounting a ZFS root filesystem from the initramfs.
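
The difference between sourcing a file and running it as a script can be shown in a few lines of plain shell. This is a standalone demonstration, not initramfs code; the dataset name rpool/ROOT/debian and the temporary file are placeholders.

```shell
# Pretend this is the value the initramfs script has already worked out.
BOOTFS="rpool/ROOT/debian"

# A user file that assigns BOOTFS at top level (outside any function).
userfile="$(mktemp)"
echo 'BOOTFS=zfs' > "$userfile"

# Running the file as a separate script cannot touch the caller's variables.
sh "$userfile"
after_exec="$BOOTFS"

# Sourcing it runs the assignment in the current shell and clobbers BOOTFS.
. "$userfile"
after_source="$BOOTFS"

rm -f "$userfile"
echo "after exec: $after_exec, after source: $after_source"
```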

If I was writing something for just me to use I'd probably go the "source a file" way too, since I know what I should/shouldn't do with it. But if I'm writing code that others will use (likely with little-to-no time spent figuring out how to properly use it), I try not to give people easy access to unknowingly screw up something that could make their system unable to boot in the name of having more flexibility and access that they really don't need in 99.9% of situations.

I also agree it would be better to unify the initramfs and dracut approaches,

To this end, I'd agree that /etc/zfs/<...> is a better location to store the custom key loading script than in the dracut zfs module directory. Perhaps something like /etc/zfs/conf.d/zfs-load-key-custom.sh?

Aug 11 '22 13:08 jkool702

I like the approach of sourcing the function (as opposed to running a script) better because it's more universal: the sourced function can have access to all the state and functions of the initramfs/scripts/zfs file, and it can still call an external script if it wants to.

I see the appeal in this, though you have to remember this code will be distributed to people who won't be intimately familiar with how to use it.

Nothing like reading the ~10 lines I added to the README. :) (See my PR#13762)

I'd argue that you don't really need that extra flexibility for loading ZFS keys - you only really need the name of the dataset encryption root you are loading the key for (which I admittedly had originally forgotten to pass to the zfs-load-key-custom.sh script in my previous comment - it's now been edited in).

Not now, maybe, but I can see this becoming a sort of template for user-defined hooks that could be called in various places from initramfs/scripts/zfs, if a need arises for such.

I'd also tend to argue that this flexibility comes at the cost of making it easier for someone who doesn't know exactly what they are doing to screw things up. In particular:
1. Using the correct function/variable names.
How do you inform users what they must name their custom key loading function? E.g., initramfs/scripts/zfs sources /etc/zfs/zfs-user-functions and then calls zfs_load_key_custom... how do users know that they must use a function of the name zfs_load_key_custom() {...} in /etc/zfs/zfs-user-functions?

I inform them via the README file -- the same place where they'd learn what to call their script. Only instead of a script, it's a shell function.

On a related note, how do they know what variable names to use within said function without going through the initramfs/scripts/zfs file in detail? E.g., how do they know that the dataset that is having its key loaded is in the variable ${ENCRYPTIONROOT}?

The README tells them. My current version has the following line:

Create /etc/zfs/zfs-user-functions, and define the decrypt_fs_user() shell function in it. It will be called with $1 set to the value of the encryptionroot attribute and $2 set to the value of the keylocation attribute for the fs to be unlocked.
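
A function following that calling convention might look roughly like the sketch below, which uses the "wait for another program to write the key into a file" idea from the original request. The KEYFILE and KEY_WAIT_SECONDS variables and the /run/zfs-key default are invented for the example, and returning non-zero to mean "fall back to prompting" is an assumption about how the caller behaves.

```shell
# Hypothetical /etc/zfs/zfs-user-functions content.
# $1 = encryptionroot, $2 = keylocation, per the README line quoted above.
decrypt_fs_user() {
    encryptionroot="$1"
    keylocation="$2"

    # Only handle datasets that would otherwise be prompted for.
    [ "$keylocation" = "prompt" ] || return 1

    keyfile="${KEYFILE:-/run/zfs-key}"
    max_wait="${KEY_WAIT_SECONDS:-30}"
    tries=0

    # Poll roughly once per second until the key file is non-empty,
    # giving up after about max_wait seconds.
    while [ ! -s "$keyfile" ]; do
        tries=$((tries + 1))
        if [ "$tries" -gt "$max_wait" ]; then
            return 1
        fi
        sleep 1
    done

    # Feed the key to zfs load-key on stdin instead of a terminal prompt.
    zfs load-key "$encryptionroot" < "$keyfile"
}
```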

You can't include a generic template of this file in zfs packages, as it would overwrite this file when the user updated zfs and they would lose their custom ZFS user functions. You could include a separate file at, say, /etc/zfs/zfs-user-function-templates strictly for

I don't think there is a need to include such a template. In all likelihood only a very small number of people will want to use this mechanism, and if they're determined enough to write their own code for it, they can be expected to read ~10 lines of README that explain where to put it and how it will be called.

templates that say what function names and variables to use, but this is much less straightforward for end users than "write a script to load their key, the dataset name is in "$1"".

I essentially have the same: only instead of "write a script and call it foo.sh", I have "write a function and save it in foo.sh".

2. Not overwriting functions / variables that must stay unmodified to successfully load zfs in the initramfs
Sort of the opposite of point 1 - users not only need to know what function names to use, they need to know which they can't use.

I am a strong believer in allowing people to shoot themselves in the foot if that is what they really want to do. I deliberately made it so zfs-user-functions is sourced after the distributed zfs-functions so that people who know what they are doing can override distribution functions completely. Of course, if they do, they're on their own.

Say that in /etc/zfs/zfs-user-functions someone (outside of a function) has BOOTFS=zfs. When the file is sourced, this would (I think) overwrite the BOOTFS variable that has the root ZFS dataset name, which would in turn screw up everything related to mounting a ZFS root filesystem from the initramfs.

Yes. If people break their systems by doing things they don't understand, they get to count and keep the pieces. :)

I wouldn't be averse to adding a cautionary note advising people to write a function that starts an external script and doesn't set any variables unless they know what they are doing; but I wouldn't want to wrest a powerful tool from the hands of savvy users just to protect the careless or ignorant.

But, again, we're talking about a niche mechanism here that only savvy users would likely be interested in exploiting at all.

If I was writing something for just me to use I'd probably go the "source a file" way too, since I know what I should/shouldn't do with it. But if I'm writing code that others will use (likely with little-to-no time spent figuring out how to properly use it), I try not to give people easy access to unknowingly screw up something that could make their system unable to boot in the name of having more flexibility and access that they really don't need in 99.9% of situations.

I think we'll have to agree to disagree and just see what the project maintainers think of my pull request.

I also agree it would be better to unify the initramfs and dracut approaches,

To this end, I'd agree that /etc/zfs/<...> is a better location to store the custom key loading script than in the dracut zfs module directory. Perhaps something like /etc/zfs/conf.d/zfs-load-key-custom.sh?

Well, it's not exactly configuration, so I'm not sure about conf.d, but I don't have a strong opinion.

Aug 11 '22 14:08 akorn

I mean either way would work; it's a subtle difference. Which is better is mostly a matter of personal opinion.

they can be expected to read ~10 lines of README that explain where to put it and how it will be called.

On one hand 10 lines isn't much. On the other, if someone relatively new to ZFS (and still learning about how it works) is trying to set this up, I imagine those 10 lines will get lost among the thousands of lines of information they've recently read regarding ZFS.

Personally, I'm just not a fan of having another set of rules to remember / have to look up if I change things. Passing the target name to a script is a pretty standard way to modify standard behavior (for example, the kernel post-install scripts at /etc/kernel/postinst.d do this).

I am a strong believer in allowing people to shoot themselves in the foot if that is what they really want to do.

As am I. The whole "the computer will do what you tell it to do, even if it isn't what you wanted it to do" idea is great, but "giving someone the control that would allow them to screw up their system" and "making it easy for someone to unintentionally use that power to screw up their system" are two different things.

To return to your analogy: if someone wants to shoot themselves in the foot, let them, but don't start handing out guns and taping targets to people's feet and saying how they will win a prize if they get a bullseye.

I deliberately made it so zfs-user-functions is sourced after the distributed zfs-functions so that people who know what they are doing can override distribution functions completely. Of course, if they do, they're on their own.

But, again, we're talking about a niche mechanism here that only savvy users would likely be interested in exploiting at all.

These two ideas mixed together is what is tripping me up here.

I totally see the appeal in a generalized framework to allow users to override core zfs functions. I could see this as being really useful. But this sort of thing reaches waayyyyy deeper into the zfs code than key loading, and as such needs to be considered and tested in relation to a much broader part of the ZFS code. Giving users the ability to generally and easily override important core ZFS functions for the sake of a very niche set of users to customize a specific task that really doesn't require this level of control to accomplish, without considering/testing how it affects the vast majority of users outside of that niche, is....well....reckless.

A non-computing analogy: you see someone struggling to screw in a screw with their bare hands and you decide to help them by giving them a tool to use.

* If you hand them a screwdriver (with the proper tip to match the screw head) they will likely find it helpful.

* If you hand them a 100-tool swiss army knife with a page of usage instructions and a verbal warning that they really need to read the full instructions before use or they might hurt themselves, and let them know that a couple of the 100 tools in there should really be avoided since, if used, they might set their house on fire...well, they might not find that to be all that helpful.

Aug 12 '22 03:08 jkool702

they can be expected to read ~10 lines of README that explain where to put it and how it will be called.

On one hand 10 lines isn't much. On the other, if someone relatively new to ZFS (and still learning about how it works) is trying to set this up, I imagine those 10 lines will get lost among the thousands of lines of information they've recently read regarding ZFS.

Come now. How realistic is the assumption that someone relatively new to zfs and still learning about how it works would go around experimenting with expert level features like implementing their own key loading mechanism?

(Also, what you really need to be knowledgeable about to use this correctly isn't ZFS -- it's shell scripting.)

Personally, I'm just not a fan of having another set of rules to remember / have to look up if I change things. Passing the target name to a script is a pretty standard way to modify standard behavior (for example, the kernel post-install scripts at /etc/kernel/postinst.d do this).

Modifying a file that gets sourced by a shell script is also fairly standard (e.g. /etc/default). And, again, this isn't something you'd be needing to look up "all the time" -- it's something you'd use rarely, if ever, and you'd need to look up the name of the special script and its calling convention just the same. I really don't see a difference in kind.

I am a strong believer in allowing people to shoot themselves in the foot if that is what they really want to do.

To return to your analogy: if someone wants to shoot themselves in the foot, let them, but don't start handing out guns and taping targets to people's feet and saying how they will win a prize if they get a bullseye.

I don't think I'm doing that (please point me to the bit that corresponds to "saying how they will win a prize if they get a bullseye"). However, by allowing them to override functions in the zfs initramfs ecosystem, I'm allowing them to shoot themselves in the foot if they go out of their way to do it. If I make it so they can't override those functions, I'm playing nanny by trying to prevent them from shooting themselves in the foot, which is not something I want to do.

I deliberately made it so zfs-user-functions is sourced after the distributed zfs-functions so that people who know what they are doing can override distribution functions completely. Of course, if they do, they're on their own.

But, again, we're talking about a niche mechanism here that only savvy users would likely be interested in exploiting at all.

These two ideas mixed together is what is tripping me up here.

How? We're talking about "expert users" who want to exploit niche functionality. I'm trying to give them more power and flexibility to do so. I think this is the ideal combination -- it would be any other combination ("expert user" vs. "mainstream nanny software" or "newbie" vs. "mainstream software with dangerously sharp edges") that should be tripping you up.

I totally see the appeal in a generalized framework to allow users to override core zfs functions. I could see this as being really useful.

To be clear: these are not "core zfs functions". They're functions of the zfs-related initramfs logic, about as far removed from "core zfs" as you can be and still be related to zfs at all.

But this sort of thing reaches waayyyyy deeper into the zfs code than key loading, and as such needs to be considered and tested in relation to a much broader part of the ZFS code.

I think you're either confused as to the nature of my suggested change or grossly overstating both its magnitude and complexity (have you even looked at the PR?). For one thing, it doesn't reach into "the zfs code" at all, let alone deeply. I have no idea what "broader part of the ZFS code" you even mean, or how you would propose to perform this testing.

If a savvy user wants to, they can already redefine functions in contrib/initramfs/scripts/zfs -- this is what I'm currently doing, by monkey patching the file using sed at initramfs time. My proposed change just introduces a mechanism to do this more cleanly.

We're talking about open source software. Anyone can change it. Pretending that allowing people to do this is somehow new and reaches "deep into the code" is absurd.

Giving users the ability to generally and easily override important core ZFS functions for the sake of a very niche set of users to customize a specific task that really doesn't require this level of control to accomplish, without considering/testing how it affects the vast majority of users outside of that niche, is....well....reckless.

Those "important core functions" are a few POSIX shell functions related to importing and unlocking a zfs pool at boot time. We must have a very different idea of what "important core functions" are (I'd say that in the case of ZFS those are all in C and not in the contrib/ directory -- how can contrib/ be "core"?).

As for "easily", well, it's still not something that you'd do completely by accident. Nobody other than the small number of users interested in the niche functionality would ever get near it.

A non-computing analogy: you see someone struggling to screw in a screw with their bare hands and you decide to help them by giving them a tool to use.

* If you hand them a screwdriver (with the proper tip to match the screw head) they will likely find it helpful.

* If you hand them a 100-tool swiss army knife with a page of usage instructions and a verbal warning that they really need to read the full instructions before use or they might hurt themselves, and let them know that a couple of the 100 tools in there should really be avoided since, if used, they might set their house on fire...well, they might not find that to be all that helpful.

I don't think this is a very useful analogy. What we rather have here is a case of an expert engine mechanic using limited tools to accomplish something complex in a roundabout way, and I'm instead giving them the specialized tool that gets the job done in the most efficient way, with essentially no learning curve (the "page of usage instructions" are 3 bullet points, and the "verbal warning" to avoid hurting themselves with the tool isn't really needed because they're an expert engine mechanic who is familiar with this class of tool from years of experience).

Aug 12 '22 07:08 akorn

After encountering the same use case yesterday evening, I hacked together a solution and started looking for a more accepted method. I was glad to see the issue already opened, though it appears to have stagnated. I have submitted a pull request for a solution I think is somewhat simpler: there's no org.openzfs:zfs-initramfs:user-decrypt-function variable to set, just a zfs-load-key-user file that either does or does not exist.

I also do not attempt to create the zfs-user-functions file; while I understand the purpose, I think there is simplicity and value in keeping a key-loading function separate. For one, it enables other packages to programmatically set the custom key loading function without the risk of overwriting other user-defined functions.

Apr 01 '23 15:04 niwamo
