zfs
zfs copied to clipboard
Chacha20-Poly1305 encryption
I’ve added a new encryption option to OpenZFS, chacha20-poly1305, which implements the Chacha20-Poly1305 AEAD described in RFC 8439. I’m interested in feedback, code review, testing and benchmarking from anyone interested in such things!
Important note: I have no particular skill in math, cryptography or secure software. I’m as confident as a reasonably good programmer can be that this is right, but I wouldn’t go trusting your secrets to this just yet!
Closes #8679.
Rationale
AES (and particularly the default AES-GCM mode used in OpenZFS) is known to be very slow on systems without hardware assistance. There are many such machines out there could make good use of OpenZFS, especially low-cost machines and small boards that would otherwise make very nice storage machines. The Raspberry Pi series of machines are a good example.
The best option for these systems is an encryption option that performs well in software. Chacha20-Poly1305 is the current “standard” option for this in many contexts, and is a good choice for OpenZFS.
More information about the rationale can be found in Google’s work on Adiantum (though their solution is necessarily different because unlike OpenZFS it operates on whole disk blocks).
Implementation
Core algorithms
I’ve taken the “core” Chacha20 and Poly1305 implementations from Loup Vaillant’s Monocypher. They are small and easy to read and understand, and the author has written extensively about its development and in particular the mistakes made along the way, which (at least to me) inspires confidence in the implementation.
I considered a number of other implementations, including those in Sodium and OpenSSL as well as a number of other smaller implementations. Despite the relative simplicity of the algorithms, there is lot of variability in the overall quality and usability of different implementations, so I opted for the most straightforward ones I could find.
ICP/KCF module
I’ve written a new KCF-style module for the ICP, that implements the AEAD (the “construction” that binds the cipher (ChaCha20) and authenticator (Poly1305) together) and the glue to make it available to OpenZFS proper.
Illumos does not currently support these algorithms, so it wasn’t possible to just take one from there. I’ve implemented a module, however it only implements what OpenZFS needs and is not sufficiently general-purpose to contribute back to Illumos. I have no plans to do this.
The rest of the module is as simple as I could figure out how to make, especially in some of the buffer management which is quite complex in the AES modes.
Tests
I’ve added a test program that runs the test vectors in RFC 8439. For Chacha20 and Poly1305 this is mostly a basic sanity check to assist in the future with possible alternate implementations (eg hardware-accelerated versions), as there are many implementations out there that have surprising quirks even when used correctly (most commonly endianness problems).
There’s also a full module test, checking that the AEAD is constructed correctly (again using the test vectors) and also makng sure the ICP module interface works.
The tests are only there to confirm basic operation; they have nothing to say about security properties or correctness.
Performance
I have only done very light performance tests to gain a rough idea of how much “better” or “worse” this is than aes-256-gcm on a handful of machines I have here. I do not know how to drive fio very well and I did not make much effort to control variables like other programs running on the machine, so I will not defend these numbers very hard. I’ve run them enough times to be reasonably confident they’re not wildly off the mark.
Method and fio output: https://gist.github.com/robn/4b40251ba84728cf8e669649ce77ad56
- Broadcom BCM2711 (Raspberry Pi 4B): ~2.4x faster (no AES hardware accel)
- Celeron J3455 (Fitlet2): ~1.3x faster (partial AES hardware accel)
- Core i7-8565U (Thinkpad X1 Gen7): ~1.2x slower (full AES hardware accel)
(ChaPoly in software coming so close to AES in hardware makes me think there’s probably still optimisation opportunities in the AES code, but I have not really considered this even a little bit).
Other notes
- ~This is only for Linux right now. It should be straightforward to implement for FreeBSD too, which has the necessary crypto primitives available in the kernel. I will try to add that soon.~ Now with FreeBSD support.
- ~Related to that, I have not fully updated the tests yet because I don’t want to them to break any automatic PR test stuff that might run on FreeBSD.~ Done.
- XChacha20 (the “better” Chacha variant) isn’t an option because OpenZFS cannot provide a large enough nonce to initialise it. In practice it wouldn’t matter because OpenZFS already takes steps to mitigate nonce reuse.
TODO
After discussion in monthly call:
- [ ] figure out if anyone else wants this feature
- [x] document when to choose chapoly vs AES
- [x] test chapoly dataset on older ZFS (failure modes)
- [x] get
encryptionproperty - [x] attempt to mount
- [ ] (other dataset ops?)
- [x] get
- [x] test send/recv raw send
- [x] both sides support
- [x] source supports, receiver doesn't
- [x] source doesn't support (?), receiver does
- [ ] neither support (?)
- [x] add feature flag
Future work
Just stuff I thought about along the way that I’m not committing to but might like to look at in the future.
- I want to at least try accelerated versions (read: assembly implementations of Chacha20 and/or Poly1305) of this in the future, to see if there are significant gains to be had. That will likely make a runtime selection mechanism necessary (like
icp_aes_implandicp_gcm_impl). - The entire ICP/KCF system has a lot of unused stuff and overhead that I’d like to see radically slimmed down. I think its probably possible lift
zio_crypt.cup to be common to all ports, with a much smaller layer that calls down to OS-provided cryptographic functions or uses the bundled ones where the system doesn’t have something available. That’s not just for Linux, but would help in (for example) Illumos, which has AES in the kernel but not ChaPoly. I’d probably wait to see what the Windows and Mac ports look like when they’re finally merged before considering this seriously.
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Performance enhancement (non-breaking change which improves efficiency)
- [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
- [x] Documentation (a change to man pages or other documentation)
Checklist:
- [x] My code follows the OpenZFS code style requirements.
- [x] I have updated the documentation accordingly.
- [x] I have read the contributing document.
- [x] I have added tests to cover my changes.
- [x] I have run the ZFS Test Suite with this change applied.
- [x] All commit messages are properly formatted and contain
Signed-off-by.
Last push has support for FreeBSD, which is very straightforward because all the crypto code is already available in the kernel.
Importantly, I've done a interop test: a pool with a chacha20-poly1305 dataset created on Linux can be imported and read and written to on FreeBSD, and back again. I expected that, but its nice to see it in action.
I appreciate very much you made it part of ICP, it means nearly no work needs to be done for me.
So, what's the next step for this? If its just waiting until volunteers have time to review, no problem, I get it! If there's something I could do or need to do though, I'd like to get on to that.
(Should I be dropping in on the monthly call to talk about it?)
So, what's the next step for this? If its just waiting until volunteers have time to review, no problem, I get it! If there's something I could do or need to do though, I'd like to get on to that.
Yes. There are a number of PRs waiting review and reviewers have limited bandwidth to look at them. Larger PRs like this take substantially longer to see review.
(Should I be dropping in on the monthly call to talk about it?)
That might help attract reviewers.
Hey there,
I must admit I'm a little bit confused here. Linux has already support for Chacha20Poly1305 under lib/ and relevant arch/, why do we need another implementation?
@evrim there's two ways to look at it. One is that Linux's crypto code is only available to GPL kernel modules, which OpenZFS is not. The other is that by doing it in the ICP, other ICP-using ports of OpenZFS (Windows and macOS) can use it.
@robn Some (#11357) of us are building ZFS and setting it as "GPL" module when we build, how can we then use the existing Linux built-in GPL implementations of chacha20 etc? thx
@jittygitty there's no code in ZFS right now that can call into the kernel crypto API, regardless of how you build it - you will always be using the ICP for crypto even for AES modes. You'd need a variant of zio_crypt for that, which is well outside the scope of this PR.
We talked about this at the 31 Jan call. Feedback was generally favourable, thanks all!
I was waiting to check my notes against the recording but that hasn't surfaced yet and I want to get moving on this a bit this week. So here's what I wrote down. If it turns out I got something wrong, I will of course adjust it.
- We agreed that it was fine to use Monocypher, under a more general principle that we don't really have the resources or skills to review crypto implementations ourselves, and trusting external implementations that appear to know what they're doing is a reasonable approach for us. The existence of a proof was welcomed (LoupVaillant/Monocypher#246).
- There should be wording in the manpage to give guidance about using AES vs chapoly that mostly boils down to "AES if you have hardware acceleration, chapoly otherwise".
- More testing is required. I didn't write down specifics, but I've been thinking about it, and this is at least around upgrade/downgrade from/to ZFS versions with/without chapoly support, and how send/receive copes with same. Possibly this will require a new feature (like we do with checksum & compression algorithms)
Finally, and most importantly, it was agreed that this is obviously a nice feature, but if no one else wants it then perhaps it we shouldn't be taking on the maintenance burden. I will soon go and try to get a rough show of hands to see if anyone would else would use this. Probably, a carefully-worded query to the mailing list, discussion forum and reddit, and go from there.
I'll return this to draft status for the moment, and add some additional todo items to the top post.
We talked about this at the 31 Jan call. Feedback was generally favourable, thanks all!
I was waiting to check my notes against the recording but that hasn't surfaced yet and I want to get moving on this a bit this week. So here's what I wrote down. If it turns out I got something wrong, I will of course adjust it.
- We agreed that it was fine to use Monocypher, under a more general principle that we don't really have the resources or skills to review crypto implementations ourselves, and trusting external implementations that appear to know what they're doing is a reasonable approach for us. The existence of a proof was welcomed (Add machine-checked proofs for limb overflows LoupVaillant/Monocypher#246).
- There should be wording in the manpage to give guidance about using AES vs chapoly that mostly boils down to "AES if you have hardware acceleration, chapoly otherwise".
- More testing is required. I didn't write down specifics, but I've been thinking about it, and this is at least around upgrade/downgrade from/to ZFS versions with/without chapoly support, and how send/receive copes with same. Possibly this will require a new feature (like we do with checksum & compression algorithms)
Finally, and most importantly, it was agreed that this is obviously a nice feature, but if no one else wants it then perhaps it we shouldn't be taking on the maintenance burden. I will soon go and try to get a rough show of hands to see if anyone would else would use this. Probably, a carefully-worded query to the mailing list, discussion forum and reddit, and go from there.
I'll return this to draft status for the moment, and add some additional todo items to the top post.
I find this feature usefull and would help with this PR if you want.
There is also a rfc7905 which update TLS / DTLS with ChaCha20-Poly1305 - so why should OpenZFS not go with the time and have this option.
I did all the testing I wanted to do with a view to adding a feature flag.
I created a new pool on Linux with master 620a977f2 + this PR, exported it, then rebuilt on master alone and imported the pool.
All the encryption properties are returned except for encryption itself:
root@fitlet:~# zfs get encryption,encryptionroot,keyformat,keylocation,keystatus sandisk
NAME PROPERTY VALUE SOURCE
sandisk encryption - -
sandisk encryptionroot sandisk -
sandisk keyformat passphrase -
sandisk keylocation file:///tmp/zfskey local
sandisk keystatus unavailable -
This appears to be because the underlying value is out bounds for this enum, so the default (ZIO_CRYPT_OFF) is returned.
Attempting to load the key fails, in a slightly weird way:
root@fitlet:~# zfs load-key sandisk
Key load error: Incorrect key provided for 'sandisk'.
and of course it can't be mounted:
root@fitlet:~# zfs mount sandisk
cannot mount 'sandisk': encryption key not loaded
I say it fails in a slightly weird way, because I think its just lucky this time.
If we're compiled with --enable-debug, it fails a lot earlier:
root@fitlet:~# zfs load-key sandisk
Message from syslogd@fitlet at Mar 4 20:25:08 ...
kernel:[1281777.454629] VERIFY3(crypt < ZIO_CRYPT_FUNCTIONS) failed (9 < 9)
Message from syslogd@fitlet at Mar 4 20:25:08 ...
kernel:[1281777.454642] PANIC at zio_crypt.c:568:zio_crypt_key_unwrap()
When asserts are disabled, it seems like its only doing ok because the crypto table entry does actually exist for crypt == 9, its the NULL entry at the end. Linux zio_crypt_key_unwrap sets up to decrypt the key and even tries, fails and gets an ECKSUM, which is then swallowed and EACCES is returned. On FreeBSD I think it will fall out a little earlier, at the top of zio_do_crypt_uio_opencrypto, and returns ENOTSUP, which again gets swallowed and replaced for EACCES.
I do think that behaviour is lucky. For the next new cipher suite crypt == 10, it'd be running off the end of the crypto table. Even now, on Linux, its trying to decrypt using only a partial state, so it could easily crash in other circumstances.
So probably this alone is enough to suggest a feature flag is warranted.
I did further send/receive testing as well.
When sender and receiver both support chapoly, it works exactly as it does for AES - raw send doesn't require the dataset loaded on the sender, not-raw does. All fine.
When there's no chapoly support, raw send still works. A stream is produced, and if that's later received to a chapoly-supporting pool, it goes in just fine, key can be loaded and it can't be mounted.
Importing a chapoly stream into a pool with no support fails:
root@fitlet:~# zfs recv -v sandisk/recv@snap < snap-from-chapoly.zfs
receiving full stream of sandisk@snap into sandisk/recv@snap
cannot receive new filesystem stream: invalid backup stream
Its not totally clear to me without looking inside that that's indicating. Presumably, invalid property of some sort. I'll check more closely tomorrow.
But again, likely the a feature flag is going to benefit.
I find this feature usefull and would help with this PR if you want.
@mcmilk sorry I missed this! Thanks! At this point I don't think there's much else to do other than as much code review as we can. The main thing I want to feel good about is that I'm not misusing the crypto stuff in some important way. General bugs would be bad but survivable, but the crypto being no good is worse than not having it at all. I think its probably fine, but if you've got good ideas that'd be great.
Ehh, I've got my brain in backwards this evening. A feature in the dataset isn't going to make an old implementation magically able to cope with it.
That said, I'll see if I can do something now to help for the next new crypto algorithm.
@robn - I personally find the choosen Monocypher variants of the code not so good. The public domain variant of e.g. tinyssh is a lot easier to read and also C99... but I didn't dig very deep in currently - so this opinion may change ;) But: we should take care, that assembly of maybe openssl or others could easily adopted as well - maybe with speed testings as done for fletcher, sha2 and other algos.
The SHA2 and BLAKE3 algos have a lot variants currently and make module loading a bit slow - because of their testings while loading. Maybe this should be done later, when the module is loaded, as some first job in background...
It would be nice, when other new algos also use this kind of interface, so that all this crypto / hashing stuff has some clean interaface within zfs.
A really cool thing would be, when we define some interface for interacting with crypto stuff provided by the operating systems. Of cause, linux won't provide us much ... but FreeBSD, MacOS and Windows. @behlendorf - should/could this be done with some interface within SPL?
These are my thoughts about the current state of Chacha-Poly1305.
I personally find the choosen Monocypher variants of the code not so good. The public domain variant of e.g. tinyssh is a lot easier to read
Hi @mcmilk, Monocypher author here. It looks like you are saying that:
- Monocypher is not very good?
- Tinyssh is much more readable?
First, you may want to withhold any claim that Monocypher isn't very good until you take a deeper look. See for instance this external security audit, which found zero bugs and noted that "the code is exceptionally clean". Second, I'm not sure why you would even compare Monocypher with tinyssh. Monocypher is a public facing library. Tinyssh's crypto code is as far as I can tell internal only, and as such incomplete, undocumented, and probably unsupported as well.
Now if this was your way of expressing personal preferences about coding style, sure. To each his own.
But: we should take care, that assembly of maybe openssl or others could easily adopted as well
A look at this patch seems to indicate this is already the case: as far as I can tell all calls to Monocypher are concentrated in a single file, which as far as I can tell is little more than a wrapper. The only precaution I would recommend beyond that is to perhaps avoid Monocypher's crypto_aead_lock() and instead use crypto_aead_init_ietf() and crypto_aead_write() so you're fully RFC 8439 compliant. (It's not a problem if you're already using Chacha20 and Poly1305 manually, though.)
maybe with speed testings
I've tested what I can compare, here are the results for tinyssh (note the absence of authenticated encryption):
Chacha20 : 396 megabytes per second
Poly1305 : 1158 megabytes per second
SHA-512 : 265 megabytes per second
x25519 : 3004 exchanges per second
EdDSA(sign) : 2168 signatures per second
EdDSA(check): 1095 checks per second
And the equivalent for Monocypher:
Chacha20 : 407 megabytes per second
Poly1305 : 869 megabytes per second
SHA-512 : 278 megabytes per second
x25519 : 7629 exchanges per second
EdDSA(sign) : 14904 signatures per second
EdDSA(check) : 5345 checks per second
Seems I can seriously improve the performance of poly1305. Will work on it for the next patch.
These are my thoughts about the current state of Chacha-Poly1305.
I'm not sure I follow… your comment talks about a lot more than just ChaPoly. Also, why did you link to the Wikipedia page? Is there something specific there you wanted to highlight?
maybe with speed testings
I've tested what I can compare, here are the results for tinyssh (note the absence of authenticated encryption):
Chacha20 : 396 megabytes per second Poly1305 : 1158 megabytes per second SHA-512 : 265 megabytes per second x25519 : 3004 exchanges per second EdDSA(sign) : 2168 signatures per second EdDSA(check): 1095 checks per secondAnd the equivalent for Monocypher:
Chacha20 : 407 megabytes per second Poly1305 : 869 megabytes per second SHA-512 : 278 megabytes per second x25519 : 7629 exchanges per second EdDSA(sign) : 14904 signatures per second EdDSA(check) : 5345 checks per secondSeems I can seriously improve the performance of poly1305. Will work on it for the next patch.
Awesome.
These are my thoughts about the current state of Chacha-Poly1305.
I'm not sure I follow… your comment talks about a lot more than just ChaPoly. Also, why did you link to the Wikipedia page? Is there something specific there you wanted to highlight?
I think he copy and pasted it to avoid typing it and GitHub inserted a link that he did not remove. It happens to me often when copy and pasting things, but I remove unwanted links from my comments.
... I have mainly these reasons for my current opinion:
- code comments via
//... we have/**/everywhere currently - the defines in the beginning are not mine:
FOR()and so on ... - optimizing - I would use some assembly of supercop or nacl by DJB, Samuel Neves and others
- for this to work we should have the algos seperated in poly1305.c and chacha20.c and some glue
Since your project can be used via CC0, we may choose some of your code also for the glue. @robn just wanted to implement it, so he used your lib, because it's easy to use. That is ok for now.
I'm not sure I follow… your comment talks about a lot more than just ChaPoly. Also, why did you link to the Wikipedia page? Is there something specific there you wanted to highlight?
We have filesystem developer here, not all here just know Chacha20-Poly1305 ... thats the reason why.
Alright, lets slow down a bit.
I spent a lot of time reviewing the available Chacha20 and Poly1305 implementations. I was coming at it from the perspective that we are a small project without specific cryptography experience, so I wanted an implementation that we (I) could read and feel confident was correct, and that we could easily benefit from updates and improvements made upstream.
I felt that this was important because I was (and still am) new to OpenZFS, and was not just bringing in new crypto code, but adding the first "new" crypto suite since encryption support was added, and also adding a module for the ICP not sourced from Illumos. I wanted a path of least resistance, because I felt like too much change at once would make it hard for others to feel confident about, and confidence is critical in cryptography code.
My reasons for choosing Monocypher:
- readable to a non-crypto person like me
- because readable, comparable to the algorithms as described in the papers
- an author able to communicate their design decisions and process, which I have found to be rare among crypto people. In particular, communicating failings
- an author responsive to feedback
I deliberately didn't make any changes other than trimming out the functions we're not using, because I wanted it to be very clear that it matches the upstream and also make it easy to update.
I wasn't clear originally, but alternate/optimised implementations were an explicit non-goal for this PR. Not that I think Monocypher is bad, but I wanted to start with a good quality generic implementation, close off any possible concerns, and then consider extending the work.
So. I'm not against changing implementations, nor suggesting changes or improvements for Monocypher (though that would likely be better done upstream). But, I really don't want to bite off more than I can chew too early. I want this PR to deliver a nice feature, with confidence that its fit for purpose, and contribute to a foundation for future improvement of our cryptography code.
Of course, if there's something here that is going to present a problem for OpenZFS in general, then lets talk about that. Hopefully though at least the tradeoffs I'm trying to make are clear.
PS @mcmilk I know you've got some ideas and done some work on a generic crypto interface; I'm interested in that too. I'd like to talk to you more about it; I just haven't had time to really think about it yet.
@mcmilk
- code comments via // ... we have /**/ everywhere currently
- the defines in the beginning are not mine: FOR() and so on ...
- optimizing - I would use some assembly of supercop or nacl by DJB, Samuel Neves and others
- for this to work we should have the algos seperated in poly1305.c and chacha20.c and some glue
Ah, I see… you're not considering Monocypher as an external dependency, you're making it part of OpenZFS itself. Those are two very different approaches.
The "depend on Monocypher" approach most likely doesn't care about the details of coding style. Diverging comment styles in particular are a complete non-issue. Separate projects, separate coding styles. Your only choice here would be which dependency you want to take on. If for instance you want speed you should take a look at Libsodium, which is 4 times as fast as Monocypher at authenticated encryption. Forget about including its source code directly though, this one is clearly meant to be compiled & packaged separately.
The "absorb Monocypher" approach would make your points (1), (2), and (4) valid: of course the project should have a uniform coding style. But then you'd be taking on a serious responsibility: even though you would start from well tested listings, you are effectively implementing your own crypto, which while possible comes with a whole set of responsibilities you might not be comfortable shouldering. If you want performance I would recommend you start from the Dolebeau implementation of Chacha20 (used in Libsodium), and the 64-bit Donna implementation of Poly1305.
(Note: I consider taking Monocypher and deleting unused code is a valid "depend on" approach. Just make sure the diff only shows deleted lines.)
@robn
I wasn't clear originally, but alternate/optimised implementations were an explicit non-goal for this PR. Not that I think Monocypher is bad, but I wanted to start with a good quality generic implementation, close off any possible concerns, and then consider extending the work.
Being a stepping stone towards faster implementations is actually one of Monocypher's goals, and I made some explicit effort to avoid vendor lock-in. Unless you're using Monocypher's fanciest features, switching to something else will be easy. This is for instance what Jason Donenfeld did with Wireguard: he initially used Monocypher to secure the updates on Windows, then switched to a faster, formally verified implementation.
Alright, lets slow down a bit.
@robn My recent comments were intended for @LoupVaillant after his reply to @mcmilk inspired me to skim the monocypher code for things that occurred to me as possibly being in need of improvement.
PS @mcmilk I know you've got some ideas and done some work on a generic crypto interface; I'm interested in that too. I'd like to talk to you more about it; I just haven't had time to really think about it yet.
@robn - Yes, we should meet together in a short online meeting. I will send you an email for that.
@mcmilk When you said:
"A really cool thing would be, when we define some interface for interacting with crypto stuff provided by the operating systems. Of cause, linux won't provide us much ... but FreeBSD, MacOS and Windows. @behlendorf - should/could this be done with some interface within SPL?" --mcmilk
Did you mean that IF/when we create some currently lacking interface such as @robn replied to me: "@jittygitty there's no code in ZFS right now that can call into the kernel crypto API, regardless of how you build it - you will always be using the ICP for crypto even for AES modes. You'd need a variant of zio_crypt for that, which is well outside the scope of this PR."
That on "linux won't provide us much" because of GPL-Only symbol restrictions? While BSD/Mac/Win would allow using their kernel implementations? Or did you mean something else was the issue on Linux? Could you please clarify? thx
That on "linux won't provide us much" because of GPL-Only symbol restrictions? While BSD/Mac/Win would allow using their kernel implementations? Or did you mean something else was the issue on Linux? Could you please clarify? thx
@jittygitty yes, that's what he's referring to there specifically. However, that is a larger unit of work that isn't directly relevant to this PR, and we were just agreeing to have a chat about it sometime. Please don't post another licensing theory message in this PR at least; its not relevant.
@robn Thanks for confirming on behalf of @mcmilk but my follow-up wasn't going to be anything about licensing (thoughts on that is at https://github.com/openzfs/zfs/discussions/14630 and #14555 ), but a rough estimate as to the difficulty or dev-hours in implementing such interface. Anway though, feel free to ignore the question, apologies for any inconvenience. peace
@robn Thanks for confirming on behalf of @mcmilk but my follow-up wasn't going to be anything about licensing (thoughts on that is at https://github.com/openzfs/zfs/discussions/14630 and #14555 ), but a rough estimate as to the difficulty or dev-hours in implementing such interface. Anway though, feel free to ignore the question, apologies for any inconvenience. peace
Writing shims does not take that long. Each shim likely could be written in less than a day. However, testing is another story. The explosion in different code paths would make testing a nightmare, especially since the kernel and userland libraries are different, so we could not rely on ztest to test the kernel encryption code from userland like we do now. Even if the testing issue was resolved, there would be significant surface area for bugs to occur due to changes made by the platform to the native encryption library underneath us.
We originally ported icp because of Linux's GPL symbol exports, but in hindsight, using ICP has likely prevented a number of bugs and made validation easier, so I would not recommend going down the path of using native encryption interfaces. It would be less terrible if the various kernel encryption libraries also ran in user space, but OpenSolaris/illumos/Solaris is the only one where they do, so that leaves us in the position where it is extremely difficult to do thorough testing if we were to use native encryption libraries.
We can reuse will written encryption code from OpenSSL and others with compatible licenses by porting it to icp, so there should be no benefit from using native encryption libraries.
Edit: It turns out that FreeBSD avoided icp in favor of its own encryption library. I had not been aware of this.
@ryao Thanks for the reply, I got your email, so I'll respond more in depth there. Yes, I agree ICP has some pros as you said, I've never been against it. I've always simply wanted for us to treat GPL-only exports as Copyleft-only and use when warranted and if/when someone wants to do a PR with 'option' to use that path.
I've nothing else to add here. It works, it seems to be fit for purpose, and I think sets a good base for further work. I commend it to the House.
Thanks for moving this forward! I think the approach you're taking here makes good sense. From an OpenZFS integration perspective there are only a couple remaining issues:
-
Register a new feature flag for Chacha20-Poly1305 and enable it for the relevant datasets. As you noticed above in a production build you get a bit lucky and just get a strange error, but in a debug build it ASSERTs immediately. We need to detect that incompatibility early with the feature flag.
-
For
zfs send/recvwhen adding a compression or checksum type we've added a newDMU_BACKUP_flag for it. That said, for encryption it looks to me like that's not going to be needed (which is good since we're out of bit for flags). For a normal send stream there's no issue since we're not sending anything encrypted. And for a raw send stream the receiving side will already validate that the key is for a known crypto suite (dsl_crypto_recv_raw() -> dsl_crypto_recv_key_check() -> dsl_crypto_recv_raw_key_check()). What still might be nice here is to look in to printing as useful an error as possible in theZFS_ERR_CRYPTO_NOTSUPcase. -
Look in to extending some of the encryption related ZTS test cases to use the new feature. Most of those test cases have "encrypt" in the file name and use
encryption=on. However, a few of them do test all the algorithms, for examplecli_root/zpool_create/zpool_create_crypt_combos.ksh. -
If you're convinced the CodeQL warnings are in fact false positives just let me know and I can mark those as false positives.
@ryao - I would like to optimize the needed crypto routines in the near future via some assembly within the icp module.