zfs icon indicating copy to clipboard operation
zfs copied to clipboard

WIP: Implement OpenSSL's AES GCM acceleration for armv8

Open w0xel opened this issue 1 year ago • 2 comments

This is WIP still (but it works), and if you agree I'd like to split this PR up into smaller PRs next before merging:

  • [ ] Reenable kfpu_allowed() for newer kernels to make use of NEON registers
  • [ ] Add the single-block AES implementation and key expansion from OpenSSL
  • [ ] Restructure multi-block AVX AES GCM codepath slightly to allow for sharing code with ARM codepath
  • [ ] (This PR): Implement OpenSSL's AES-GCM acceleration

Important TODO (putting this high up here so I don't forget):

  • [ ] Either fix or drop 128bit and 192bit kernels, they are untouched currently and will break when used.

I'll post some benchmarks on #12171 next.

Motivation and Context

Encrypted ZFS is currently very slow on aarch64 devices. This due to missing hardware acceleration support: #12171 Also the use of NEON is disabled on newer kernels: #14555

Description

Assembly generated from these files in the OpenSSL source:

(ASM diffs at time of creation, will try to update those)

with slight modifications to support the ICP interface.

For the chunked AES-GCM codepath the AVX code was reused by renaming avx -> hardware.

  • This is a lot of changes right now, but I've preferred this over copy-pasting the code, as it was resuable basically 1-to-1
  • If you have a better idea of how to abstract/structure this let me know :)

Kernel FPU support was re-added by naively storing/restoring vector registers.

  • This is basically the same as the x86_64 code, but a bit easier
  • AFAIK armv8 has no alignment requirements here, am I wrong?
  • We should IMHO disable this codepath on RT kernels. Same for x86.

And a bunch of support/wiring code around that.

How Has This Been Tested?

Definitely not well enough yet, do not merge yet.

The code was tested on a Raspberry Pi 5. I've created an encrypted fs, benchmarked it, swapped AES implementations, benchmarked it again and vice versa.

I did not test the changes on x86_64 yet, which is also affected by this PR.

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] Performance enhancement (non-breaking change which improves efficiency)
  • [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • [ ] Documentation (a change to man pages or other documentation)

Checklist:

  • [ ] My code follows the OpenZFS code style requirements.
  • [ ] I have updated the documentation accordingly.
  • [ ] I have read the contributing document.
  • [ ] I have added tests to cover my changes.
  • [ ] I have run the ZFS Test Suite with this change applied.
  • [ ] All commit messages are properly formatted and contain Signed-off-by.

w0xel avatar Oct 03 '24 13:10 w0xel

Some more info in this comment: https://github.com/openzfs/zfs/issues/12171#issuecomment-2391541274

w0xel avatar Oct 03 '24 14:10 w0xel

awesome, this make encryption usable on rk3588 :-)

leelists avatar Oct 16 '24 09:10 leelists

In a first view it looks really good. I will review it deeper on one or two weeks. Thanks a lot for this PR.

mcmilk avatar Oct 24 '24 20:10 mcmilk

It would be clearer if you could split into separate commits, e.g.:

  • one that implements & enables kfpu on aarch64 (so that current neon kernels for hashing and encryption would work again)
  • one that port some kernel files from openssl project (that would be ~8000 LOC)
  • one that enables these kernels and adds the dispatching

Edit: Sorry I missed your description that says you would split PRs, which exactly matched my thought. Please do!

Harry-Chen avatar Oct 31 '24 06:10 Harry-Chen

Hi w0xel, any news on this PR ?

leelists avatar Jan 30 '25 13:01 leelists