On-the-fly encryption of ZFS send streams for unencrypted datasets
Describe the feature you would like to see added to OpenZFS
Sending an unencrypted dataset to an encrypted dataset currently requires the target to have its encryption key loaded. The downside is that while the key is loaded, the data can potentially be compromised. Although this time window can be kept small (load key, receive stream, unload key) there is still a vulnerability.
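For reference, the load, receive, unload sequence mentioned above can be sketched with existing commands. This is only a minimal sketch: it assumes the target sits under an encryption root on the receiving pool, and the function name, the dataset names in the comments, and the `ZFS` override variable are placeholders made up for illustration.

```shell
#!/bin/sh
# Hypothetical override so the sketch can be dry-run against a stub.
ZFS="${ZFS:-zfs}"

# receive_with_short_key_window SNAPSHOT ENCROOT TARGET
#   SNAPSHOT: unencrypted source snapshot, e.g. tank/data@snap1
#   ENCROOT:  encryption root on the receiving pool, e.g. backup/enc
#   TARGET:   dataset to receive into, e.g. backup/enc/data
receive_with_short_key_window() {
    snapshot=$1
    encroot=$2
    target=$3

    # The key must be loaded so the receiver can encrypt incoming blocks.
    "$ZFS" load-key "$encroot" || return 1

    # -u leaves the received dataset unmounted, so the key can be
    # unloaded again right after the stream ends.
    "$ZFS" send "$snapshot" | "$ZFS" receive -u "$target"
    status=$?

    # Unload the key even if the receive failed, to close the window.
    "$ZFS" unload-key "$encroot" || return 1
    return "$status"
}
```

The key is still in memory for the full duration of the receive, which is exactly the window this request is about.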
To prevent this, the source must be encrypted and sent raw to the target. In that case, the data remains encrypted throughout the entire process.
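For comparison, the raw path looks roughly like this when the source is already encrypted. Again a sketch under assumptions: the function name and the `ZFS` override variable are placeholders, and both ends are shown on one host for brevity.

```shell
#!/bin/sh
# Hypothetical override so the sketch can be dry-run against a stub.
ZFS="${ZFS:-zfs}"

# send_raw SNAPSHOT TARGET
#   SNAPSHOT: snapshot of an encrypted dataset, e.g. tank/secure@snap1
#   TARGET:   dataset to receive into, e.g. backup/secure
send_raw() {
    # -w (--raw) sends blocks as they are stored on disk, so the stream
    # stays encrypted and the receiver does not need the key loaded.
    "$ZFS" send -w "$1" | "$ZFS" receive -u "$2"
}
```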
Would it be possible to encrypt the dataset on the fly on the sender side using the target's key, and still send it in raw form to the target?
If data can be encrypted on the fly on the receiver side, it seems reasonable that the same should be possible on the sender side.
How will this feature improve OpenZFS?
It would eliminate the vulnerability of having data exposed while the key is loaded when receiving an unencrypted dataset.
@scineram As a first responder, you gave a thumbs-down. Can you elaborate why you think this would be a bad idea?
I proposed an alternative idea in #17939
To perform the encryption on the sending side (essentially generating a zfs send -w from an unencrypted or differently encrypted source), we need some way to pass zfs send the details of the encryption it needs to match, i.e. the encryption key (not just a passphrase) and algorithm, plus any extra details ZFS uses (initialization vector? I'm not sure).
There is kind of a precedent for this in the form of resuming a send (zfs send -t <receive_resume_token>) which first requires us to obtain a token from the receiving side (zfs get receive_resume_token <target>). Of course obtaining encryption state via zfs get wouldn't be safe, but the principle is the same.
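The resume-token flow referred to here looks roughly like the following. It is a minimal sketch with both ends on one host; the function name and the `ZFS` override variable are placeholders, not part of OpenZFS.

```shell
#!/bin/sh
# Hypothetical override so the sketch can be dry-run against a stub.
ZFS="${ZFS:-zfs}"

# resume_send TARGET
#   TARGET: partially received dataset, e.g. backup/data
resume_send() {
    target=$1

    # -H drops the header line, -o value prints only the property value.
    token=$("$ZFS" get -H -o value receive_resume_token "$target") || return 1

    # An unset property is reported as "-".
    if [ -z "$token" ] || [ "$token" = "-" ]; then
        echo "no resumable receive on $target" >&2
        return 1
    fi

    # The sender rebuilds the stream from the token alone; -s on the
    # receiver keeps partial state if the transfer is interrupted again.
    "$ZFS" send -t "$token" | "$ZFS" receive -s "$target"
}
```

The point of the analogy is the direction of flow: the receiver produces an opaque value, and the sender consumes it to shape the stream.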
For example, you might aim to run a command that would look something like:
zfs receive -E prompt <target> | zfs send -E prompt <source> | zfs receive <target>
In this case the -E option of zfs receive accepts a method for obtaining a key (same as for -o keylocation=<method>) but instead of loading the key fully it only uses it to generate and return an "encryption state" that can be passed to zfs send.
For this example, we're then passing that state directly into zfs send which also has an -E option telling it to expect an encryption state, and how. In this case we've specified prompt here as well so it expects to receive it via a prompt (or via stdin in this example).
With the encryption state received, zfs send can now produce an encrypted stream similar to zfs send -w except using the encryption format expected by the receiver, rather than whatever the local format is (unencrypted, or differently encrypted).
The big drawback as I see it is that the encryption state is far more dangerous to pass around than a passphrase, as a passphrase can be changed if compromised, but the encryption state needs to include the actual key used to encrypt data on disk, so if it were compromised it would allow data to be decrypted at rest on the receiver.
Feels like maybe there should be an extra security step, such as zfs receive -E requiring an SSL certificate (to asymmetrically encrypt the state) and zfs send -E likewise requiring the corresponding SSL key to decrypt it internally. But is that overkill when a user should be doing all of this via SSL anyway? It would however mean that the state could be handled more safely when not being passed in directly.
To perform the encryption on the sending side (essentially generating a zfs send -w from an unencrypted or differently encrypted source), we need some way to pass zfs send the details of the encryption it needs to match, i.e. the encryption key (not just a passphrase) and algorithm, plus any extra details ZFS uses (initialization vector? I'm not sure).
True, if they don't match the target the operation should simply fail. If just the properties mismatch it can also give an error or ask you to forcefully overwrite them. The same thing currently happens with the feature flags. If the receiving end doesn't support the right flags, it also simply fails.
In this case the -E option of zfs receive accepts a method for obtaining a key (same as for -o keylocation=<method>) but instead of loading the key fully it only uses it to generate and return an "encryption state" that can be passed to zfs send.
Also a great idea. With this method you explicitly let the target and source negotiate whether the conditions are right for the receive.
The big drawback as I see it is that the encryption state is far more dangerous to pass around than a passphrase, as a passphrase can be changed if compromised, but the encryption state needs to include the actual key used to encrypt data on disk, so if it were compromised it would allow data to be decrypted at rest on the receiver.
But is that overkill when a user should be doing all of this via SSL anyway?
If your exchange tunnel (e.g. SSH) is safe, this risk is indeed acceptable IMHO. And of course, ZFS could warn about this.