zfs
zfs copied to clipboard
RFC - SHA2 reworking and API for iterating over multiple implementations
This RFC is about replacing the solaris SHA2 implementation with a public domain variant of SHA2. Also a new generic implementation API is introduced and could be used for other algorithms like AES, Fletcher and so on...
Here are some benchmarks, which are done while loading the zfs module. By default the fastest impl. will be used.
Linux debian-x64 5.10.0-13-amd64 - VM on AMD Ryzen 5800X
implementation 1k 4k 16k 64k 256k 1m 4m 16m
edonr-generic 1940 2517 2709 2762 2761 2772 2793 2777
skein-generic 894 988 1015 1022 1030 1024 1027 1026
sha256-generic 319 342 349 350 349 350 351 351
sha256-x64 441 474 480 480 484 484 484 485
sha256-ssse3 497 590 620 626 630 629 627 627
sha256-avx 277 311 323 325 326 326 325 326
sha256-avx2 541 640 679 688 687 684 687 686
sha256-shani 1314 1915 2184 2260 2276 2282 2280 2283
sha512-generic 455 516 534 537 534 535 534 533
sha512-x64 643 723 758 763 763 762 763 765
sha512-avx 387 463 487 497 498 497 498 498
sha512-avx2 738 975 1058 1076 1084 1079 1083 1082
blake3-generic 313 320 315 317 317 317 317 314
blake3-sse2 348 1448 1646 2040 1920 2010 1820 1751
blake3-sse41 304 1807 2053 2165 2211 2248 2303 2494
blake3-avx2 384 2783 4617 5036 5013 4925 4860 4871
Linux debian-x64 5.10.0-13-amd64 - VM on Intel i3-1005G1
implementation 1k 4k 16k 64k 256k 1m 4m 16m
edonr-generic 1304 1649 1733 1754 1786 1774 1752 1741
skein-generic 552 603 615 616 617 620 617 613
sha256-generic 254 274 279 280 280 281 280 282
sha256-x64 289 310 317 318 318 317 318 317
sha256-ssse3 307 342 353 356 357 357 357 356
sha256-avx 312 347 360 362 361 362 362 362
sha256-avx2 332 378 392 394 396 396 394 396
sha256-shani 881 1113 1209 1222 1237 1236 1223 1228
sha512-generic 363 412 428 430 424 423 423 424
sha512-x64 421 474 493 494 493 496 495 495
sha512-avx 430 523 551 559 560 559 555 557
sha512-avx2 455 567 599 606 609 608 605 609
blake3-generic 329 323 323 323 325 283 283 314
blake3-sse2 384 1343 1426 1446 1452 1433 1389 1395
blake3-sse41 412 1539 1647 1693 1693 1660 1602 1616
blake3-avx2 409 1956 3148 3328 3301 3171 3008 3018
blake3-avx512 446 2792 5117 5815 5777 5539 4911 4927
AArch64 VM on my Mac Mini M1:
implementation 1k 4k 16k 64k 256k 1m 4m 16m
edonr-generic 2707 3044 3372 3397 3427 3347 3366 3385
skein-generic 556 581 589 592 595 591 589 589
sha256-generic 286 303 305 308 307 307 307 305
sha256-neon 358 370 379 380 381 381 380 380
sha256-armv8-ce 2164 2310 2344 2336 2304 2343 2338 2340
sha512-generic 431 477 485 489 489 490 476 490
sha512-armv8-ce 1198 1327 1368 1363 1372 1369 1335 1368
blake3-generic 597 592 594 584 585 589 588 588
blake3-sse2 435 1422 1507 1558 1558 1537 1446 1516
blake3-sse41 464 1518 1643 1663 1657 1650 1609 1610
TODO
The PR https://github.com/openzfs/zfs/pull/13649 can use additional hardware acceleration via FreeBSD drivers, it should also go into OpenZFS, but not within this pull request.
Description
The skeleton file module/icp/include/generic_impl.c can be used for iterating over different implementations of algorithms. It is used by SHA256, SHA512 and BLAKE3 currently.
The Solaris SHA2 implementation got replaced with public domain code of [ccpcrypto 0.10 (https://sourceforge.net/projects/cppcrypto/files/cppcrypto-0.10-src.zip/download).
These assembly files are taken from current openssl master:
- sha256-x86_64.S: x64, SSSE3, AVX, AVX2, SHA-NI (x86_64)
- sha512-x86_64.S: x64, AVX, AVX2 (x86_64)
- sha256-armv7.S: ARMv7, NEON, ARMv8-CE (arm)
- sha512-armv7.S: ARMv7, NEON (arm)
- sha256-armv8.S: ARMv7, NEON, ARMv8-CE (aarch64)
- sha512-armv8.S: ARMv7, ARMv8-CE (aarch64)
- sha256-ppc.S: Generic PPC64 LE/BE (ppc64)
- sha512-ppc.S: Generic PPC64 LE/BE (ppc64)
- sha256-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64)
- sha512-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64)
Signed-off-by: Tino Reichardt [email protected]
How Has This Been Tested?
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [x] Performance enhancement (non-breaking change which improves efficiency)
- [x] Code cleanup (non-breaking change which makes code smaller or more readable)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
- [x] Documentation (a change to man pages or other documentation)
Checklist:
- [x] My code follows the OpenZFS code style requirements.
- [ ] I have updated the documentation accordingly.
- [x] I have read the contributing document.
- [x] I have added tests to cover my changes.
- [x] I have run the ZFS Test Suite with this change applied.
- [x] All commit messages are properly formatted and contain
Signed-off-by.
I'll give these a spin on my various sundry machines later today or tomorrow.
Of course, it's important to include the silliest part, the example benchmarks.
Ryzen 5900X:
$ uname -r
5.18.0-0.bpo.1-amd64
$ cat /proc/spl/kstat/zfs/chksum_bench
15 0 0x01 -1 0 242790676536984 242792205359807
implementation 1k 4k 16k 64k 256k 1m 4m 16m
edonr-generic 973 1196 1239 1285 1280 1280 1280 1278
skein-generic 419 458 468 471 472 471 471 470
sha256-generic 192 206 209 210 211 211 211 211
sha256-ssse3 260 280 286 287 286 287 288 287
sha256-avx 137 145 147 148 148 148 148 148
sha256-avx2 284 305 312 313 313 313 313 313
sha256-shani 912 1002 1030 1037 1038 1038 1038 1038
sha512-generic 272 308 318 321 322 322 322 321
sha512-avx 195 217 224 226 226 226 226 226
sha512-avx2 412 470 489 494 496 495 495 494
blake3-generic 284 288 289 288 288 288 288 288
blake3-sse2 165 930 1010 1034 1030 1016 1017 1017
blake3-sse41 173 1029 1138 1171 1169 1152 1153 1151
blake3-avx2 172 1278 2099 2287 2268 2210 2223 2215
AArch64 VM on my Mac Mini M1:
$ uname -r
5.10.0-16-arm64
$ cat /proc/spl/kstat/zfs/chksum_bench
15 0 0x01 -1 0 1425811553429 1434441457433
implementation 1k 4k 16k 64k 256k 1m 4m 16m
edonr-generic 2707 3044 3372 3397 3427 3347 3366 3385
skein-generic 556 581 589 592 595 591 589 589
sha256-generic 286 303 305 308 307 307 307 305
sha256-neon 358 370 379 380 381 381 380 380
sha256-armv8-ce 2164 2310 2344 2336 2304 2343 2338 2340
sha512-generic 431 477 485 489 489 490 476 490
sha512-armv8-ce 1198 1327 1368 1363 1372 1369 1335 1368
blake3-generic 597 592 594 584 585 589 588 588
blake3-sse2 435 1422 1507 1558 1558 1537 1446 1516
blake3-sse41 464 1518 1643 1663 1657 1650 1609 1610
Sadly, ztest appears to be entirely broken, rather than merely sometimes unstable, on my testbed with this patch merged, and works fine without it:
~/zfs_vanilla_forreal $ ./ztest -VVVVV -GGG
choosing RAID type 'raidz'
5 vdevs, 7 datasets, 23 threads,4 raidz disks, 300 seconds...
Executing /home/rich/zfs_vanilla_forreal/zdb -bccsv -G -d -Y -e -y -p /tmp ztest
verifying concrete vdev 0, metaslab 14 of 15 ...
Verifying deleted livelist entries
Verifying metaslab entries
Dataset mos [META], ID 0, cr_txg 4, 89.8K, 56 objects
[etc]
~/zfs_newsha $ ./ztest -VVVVV -GGG
choosing RAID type 'raidz'
5 vdevs, 7 datasets, 23 threads,4 raidz disks, 300 seconds...
Executing /home/rich/zfs_newsha/zdb -bccsv -G -d -Y -e -y -p /tmp ztest
zdb: can't open 'ztest': No such device or address
ZFS_DBGMSG(zdb) START:
spa.c:6133:spa_import(): spa_import: importing ztest
spa_misc.c:416:spa_load_note(): spa_load(ztest, config trusted): LOADING
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.2a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.0a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.7a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.4a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.1a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.6a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.3a': vdev_validate: failed reading config for txg 18446744073709551615
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.5a': vdev_validate: failed reading config for txg 18446744073709551615
spa_misc.c:402:spa_load_failed(): spa_load(ztest, config untrusted): FAILED: cannot open vdev tree after invalidating some vdevs
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 0: root, guid: 5616588175193895870, path: N/A, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 0: mirror, guid: 3888918000877878788, path: N/A, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 0: raidz, guid: 5965506435800278309, path: N/A, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 0: file, guid: 18212454090720735338, path: /tmp/ztest.0a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 1: file, guid: 12178677185200324091, path: /tmp/ztest.1a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 2: file, guid: 6958433759899787561, path: /tmp/ztest.2a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 3: file, guid: 14252546041931747235, path: /tmp/ztest.3a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 1: raidz, guid: 8100377519981327636, path: N/A, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 0: file, guid: 6727364101487571618, path: /tmp/ztest.4a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 1: file, guid: 2466317645343985470, path: /tmp/ztest.5a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 2: file, guid: 7668133104849780700, path: /tmp/ztest.6a, can't open
vdev.c:205:vdev_dbgmsg_print_tree(): vdev 3: file, guid: 11291587442423573644, path: /tmp/ztest.7a, can't open
spa_misc.c:416:spa_load_note(): spa_load(ztest, config untrusted): UNLOADING
ZFS_DBGMSG(zdb) END
ztest: '/home/rich/zfs_newsha/zdb -bccsv -G -d -Y -e -y -p /tmp ztest' exit code 1
ZFS_DBGMSG(ztest) START:
ZFS_DBGMSG(ztest) END
child exited with code 3
So that bodes ill. (This is on my Debian 11 system with a -backports kernel and --enable-debug, but it works/fails the same way on the AArch64 VM too.)
(I'm blaming ztest and not zdb because zdb from the unmodified tree fails the same way on the resulting vdev files.)
edit 2: well, it fails...DIFFERENTLY with --enable-asan...
ZFS_DBGMSG(zdb) START: |
spa.c:6133:spa_import(): spa_import: importing ztest |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config trusted): LOADING |
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.0a': best uberblock found for spa ztest. txg 4 |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config untrusted): using uberblock with txg=4 |
spa.c:8399:spa_async_request(): spa=ztest async request task=2048 |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config trusted): LOADED |
spa.c:8399:spa_async_request(): spa=ztest async request task=32 |
ZFS_DBGMSG(zdb) END |
testing spa_freeze()... |
zio_crypt_key_init(crypt, &dck.dck_key) == 0 (0x5 == 0) |
ASSERT at module/zfs/dsl_crypt.c:2541:dsl_crypto_key_create_sync()/lib/x86_64-linux-gnu/libasan.so.6(+0x43df1)[0x7f07ca243df1] |
/home/rich/zfs_newsha/.libs/ztest(+0x27107)[0x5609017ee107] |
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140)[0x7f07caceb140] |
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141)[0x7f07c9332ce1] |
/lib/x86_64-linux-gnu/libc.so.6(abort+0x123)[0x7f07c931c537] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(+0x15fecc)[0x7f07c975fecc] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(dsl_crypto_key_create_sync+0x498)[0x7f07c98d5308] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(dsl_dataset_create_crypt_sync+0x572)[0x7f07c98d5a82] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(dsl_dataset_create_sync_dd+0xd9e)[0x7f07c98eadde] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(dsl_dataset_create_sync+0x552)[0x7f07c98ebbd2] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(+0x26a66f)[0x7f07c986a66f] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(dsl_sync_task_sync+0x204)[0x7f07c9948334] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(dsl_pool_sync+0xb10)[0x7f07c9920a00] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(spa_sync+0x13c4)[0x7f07c99ae5a4] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(+0x3ecfcf)[0x7f07c99ecfcf] |
/home/rich/zfs_newsha/.libs/libzpool.so.5(+0x163892)[0x7f07c9763892] |
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7)[0x7f07cacdfea7] |
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f07c93f4def] |
|
ZFS_DBGMSG(ztest) START: |
spa.c:5210:spa_open_common(): spa_open_common: opening ztest |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config trusted): LOADING |
vdev.c:153:vdev_dbgmsg(): file vdev '/tmp/ztest.0a': best uberblock found for spa ztest. txg 4 |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config untrusted): using uberblock with txg=4 |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config trusted): read 1 log space maps (1 total blocks - blksz = 131072 bytes) in 0 ms |
mmp.c:239:mmp_thread_start(): MMP thread started pool 'ztest' gethrtime 248827800761194 |
spa.c:8399:spa_async_request(): spa=ztest async request task=2048 |
spa_misc.c:416:spa_load_note(): spa_load(ztest, config trusted): LOADED |
spa_history.c:306:spa_history_log_sync(): txg 6 open pool version 5000; software version zfs-2.1.99-1329-g5a57a1f79; uts melusine 5.18.0-0.bpo.1-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.2-1~bpo11+1|
(2022-06-14) x86_64 |
ZFS_DBGMSG(ztest) END |
child died with signal 6 ```
I think that zdb and ztest are okay and I am missing sth.... I will take a closer look.
Because this PR replaces the whole SHA2 code, I would be happy if someone from the OpenZFS team could say sth. for or against it... I will invenstigate the failure, it makes sense.
I haven't dug into it, but based on the ASAN failure assertion, I was wondering if the native encryption stuff was calling into the SHA2 API somewhere in the ASM and broke from e.g. reordering the arguments.
edit:
$ ./tests/zfs-tests/tests/functional/hkdf/hkdf_test
TEST 0: HKDF failed with error code 5
Test failed.
So that seems to agree something is discontent in there...
On Mon, Aug 8, 2022, 11:59 AM Tino Reichardt @.***> wrote:
I think that zdb and ztest are okay and I am missing sth.... I will take a closer look. Because this PR replaces the whole SHA2 code, I would be happy if someone from the OpenZFS team could say sth. for or against it... I will invenstigate the failure, it makes sense.
— Reply to this email directly, view it on GitHub https://github.com/openzfs/zfs/pull/13741#issuecomment-1208311250, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABUI7MAOWY6FUZHL3IZTTLVYEVGDANCNFSM55W76FVQ . You are receiving this because you commented.Message ID: @.***>
@rincebrain - the patch is ready for real testing now. I will add armv4 and maybe some x86-64 assembly ... but the main thing is done. The last commit will be spitted to about 4 or 5 commits... until the weekend. I am searching my raspberry pi3 now ^^
Have you got some time for the EdonR thing?
I am currently working on the detection of the cpu feature checks for kernel space linux (freebsd and userspace should work).
Also offloading of SHA256 to FreeBSD's OCF is in progress (#13649) .... but if someone has interest for testing and inspecting the code - that would be nice.
This PR is ready for review.
The SHA256 offloading via FreeBSD's OCF should be done separately.
@mmatuska could you perform some testing on PPC? @BrainSlayer could you do some testing on AARCH64 and/or ARMv7 - or have some notes?
@mcmilk soon after i merged everything
Force pushed for testings with FreeBSD buildbot.
@mcmilk btw. you can expect that i havent found any issue if i didnt quote anything here. i tested it on armv7 so far and found no issues.
@mcmilk btw. you can expect that i havent found any issue if i didnt quote anything here. i tested it on armv7 so far and found no issues.
Thanks for reporting. I just rebased and made some small changes to the x64 assembly files - RET and ENDBR.
@behlendorf - do you think the patch set could go into ZFS some day? Is there some work to be done left?
When someone could do a review - this would be fine.
It would be nice to see this code merged!
It would be nice to see this code merged!
Yeah, FWIW I was testing this for a while (albeit on a boring x64 machine) and it behaved fine.
I will rebase the PR in march I think.
@mmaybee - some review would be really nice
Going over the code I have to say this looks great! Thanks for working on this, and sorry about the delayed review. I was only able to test this out on x86-64 but it worked well for me. I didn't encounter any issues and was able to verify the functional tests all passed. Looking at the PR comments I see others had a similar experience when testing it out on non-x86-64 architectures. Thank you to everybody who jumped in and helped test this PR out.
@mcmilk unless you have additional work planned for this, I'm happy to get this merged.
I added a new commit for using the SECTION_STATIC macro instead of .section .rodata within the x86-64 assembly.
This is the only fix which comes to my mind now. All the rest is stable, the DD-WRT project seems to use it for a while now.
@behlendorf - thanks a lot for your review.
Going over the code I have to say this looks great! Thanks for working on this, and sorry about the delayed review. I was only able to test this out on x86-64 but it worked well for me. I didn't encounter any issues and was able to verify the functional tests all passed. Looking at the PR comments I see others had a similar experience when testing it out on non-x86-64 architectures. Thank you to everybody who jumped in and helped test this PR out. @mcmilk unless you have additional work planned for this, I'm happy to get this merged.
I added a new commit for using the
SECTION_STATICmacro instead of.section .rodatawithin the x86-64 assembly. This is the only fix which comes to my mind now. All the rest is stable, the DD-WRT project seems to use it for a while now.@behlendorf - thanks a lot for your review.
yeah i do. but dont take it seriously. i dont think anybody is using zfs in dd-wrt except but me :-)
A bit unfortunate, it blows away the ASMABI work for the calling convention ABI.
A bit unfortunate, it blows away the
ASMABIwork for the calling convention ABI.
Is there really some problem with the SHA2 files of this PR within the Windows Port? If yes, I will help with fixes, when needed.
I used ENTRY_ALIGN(), SET_SIZE() and SECTION_STATIC ...
Give me a sec, just merging in and making sure its running.. it'll look something like:
https://github.com/openzfsonwindows/openzfs/commit/2934c88f5cdefae8d16bc350e800c4e4d4567fda
hrmm i need to pull out the kfpu_vars as a PR - so I can fix up in a separate PR
Give me a sec, just merging in and making sure its running.. it'll look something like:
hrmm i need to pull out the kfpu_vars as a PR - so I can fix up in a separate PR
Ah, I see - sorry for these. Maybe an extra PR just for these changes would be good - so they get in quickly. I would re-view this PR extra fast ;-)
Ignore previous messages, they were wrong.
The issue was actually that it uses the TF macro for most calls, to make wrappers then call into assembly;
#define TF(E, N) \
extern void ASMABI E(uint32_t s[8], const void *, size_t); \
static inline void N(uint32_t s[8], const void *d, size_t b) { \
kfpu_vars; kfpu_begin(); E(s, d, b); kfpu_end(); \
}
Except for the first type:
extern void zfs_sha256_transform_x64(uint32_t s[8], const void *, size_t);
const sha256_ops_t sha256_x64_impl = {
.is_supported = sha2_is_supported,
.transform = zfs_sha256_transform_x64,
which created a case where we do regular C calls to "wrappers" (which then do ASMABI calls out), except for zfs_sha256_transform_x64() which needed to be ASMABI directly. Making it used inconsistently.
If I change that function to also have a wrapper;
TF(zfs_sha256_transform_x64, tf_sha256_transform_x64);
const sha256_ops_t sha256_x64_impl = {
.is_supported = sha2_is_supported,
.transform = tf_sha256_transform_x64,
Then it is consistent. All wrappers are called in C, and all assembly are called with ASMABI.
I presume that zfs_sha256_transform_x64() was done differently as it does not need to call kfpu_begin() ?
If that is the case, I can:
- Make another macro like TF2, which doesn't call
kfpu_begin()and code is nice and clean - Make a manual wrapper just above it, not using TF macro.
- Use #ifdef _WIN32|APPLE and only make wrappers in for those platforms.
Also, the nested kfpu_begin() calls seem to have been cleaned up, so I can probably get rid of kfpu_vars
Hello @lundman - yes, the zfs_sha256_transform_x64() and zfs_sha512_transform_x64() functions use only regular non-SIMD code and do not need the kfpu() helper.... you should be able to use all these ossl SHA2 assembly directly in MacOS and Windows.
What we need in general, is some new SPL code to get crypto functions from FreeBSD, MacOS and Windows supported. I am thinking about this ... and will create some PR as draft when there is sth. ready. Maybe you had also this idea.
It appears that IBM POWER7 (Power ISA 2.06) has being mis-detected to use zfs_sha256_power8, resulted in a kernel oops upon module loading:
[1673143.127724] zfs: module license 'CDDL' taints kernel.
[1673143.127782] Disabling lock debugging due to kernel taint
[1673143.912577] Unrecoverable VMX/Altivec Unavailable Exception f20 at d000000006e53050
[1673143.912652] Oops: Unrecoverable VMX/Altivec Unavailable Exception, sig: 6 [#1]
[1673143.912713] SMP NR_CPUS=128 NUMA pSeries
[1673143.912766] Modules linked in: zfs(PO+) spl(O) nfnetlink rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd sunrpc grace fscache usb_storage joydev st xt_tcpudp iptable_filter snd_usb_audio cp210x pl2303 snd_usbmidi_lib snd_hwdep usbserial snd_pcm snd_timer snd_rawmidi snd_seq_device snd soundcore evdev uio_pdrv_genirq uio fuse drm configfs ip_tables x_tables autofs4 ext4 crc16 jbd2 mbcache crc32c_generic btrfs xor raid6_pq sg sr_mod cdrom sd_mod ses enclosure hid_generic usbhid hid ohci_pci ohci_hcd ehci_pci ehci_hcd ipr igb ptp pps_core usbcore usb_common libata ehea
[1673143.913693] CPU: 0 PID: 8725 Comm: insmod Tainted: P O 4.1.42-rivoreo-powerpc64-largepage #20
[1673143.913779] task: c0000002eefe5970 ti: c00000062d77c000 task.ti: c00000062d77c000
[1673143.913841] NIP: d000000006e53050 LR: d000000006c5e700 CTR: d000000006c5e6f0
[1673143.913901] REGS: c00000062d77ed90 TRAP: 0f20 Tainted: P O (4.1.42-rivoreo-powerpc64-largepage)
[1673143.913990] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 44002424 XER: 00000000
[1673143.914122] CFAR: d000000006c5e6fc SOFTE: 1
[1673143.914122] GPR00: d000000006c5e480 c00000062d77f010 d000000006f696a0 c00000062d77f490
[1673143.914122] GPR04: d00000000d040000 0000000000000010 0000000000000000 0000000000000001
[1673143.914122] GPR08: d000000006c5e700 d000000006f418b8 00000000000000cf 00000000000000df
[1673143.914122] GPR12: 0000000024008424 c00000000ee40000 0000000000000000 c000000000872a40
[1673143.914122] GPR16: c00000062d77fc58 d00000000403d350 0000000000000124 0000000000000000
[1673143.914122] GPR20: c000000000617d50 0005f1b71bd88c36 00000000000f423f c00000062d77f4c0
[1673143.914122] GPR24: d00000000d040400 c00000062d77f490 d000000006e89300 0000000000000400
[1673143.914122] GPR28: 0000000000002000 0000000000002000 0000000000000000 0000000000000000
[1673143.915022] NIP [d000000006e53050] .zfs_sha256_power8+0x10/0xec0 [zfs]
[1673143.915231] LR [d000000006c5e700] .tf_sha256_power8+0x10/0x24 [zfs]
[1673143.915285] Call Trace:
[1673143.915315] [c00000062d77f010] [c00000000084dd00] 0xc00000000084dd00 (unreliable)
[1673143.915537] [c00000062d77f190] [d000000006c5e6dc] .tf_sha256_ppc+0x10/0x24 [zfs]
[1673143.915751] [c00000062d77f200] [d000000006c5e480] .SHA2Update+0x12c/0x19c [zfs]
[1673143.915982] [c00000062d77f2c0] [d000000006da1050] .sha_incremental+0x24/0x3c [zfs]
[1673143.916207] [c00000062d77f330] [d000000006d1a1c0] .abd_iterate_func+0xec/0x1a0 [zfs]
[1673143.916441] [c00000062d77f420] [d000000006da10ac] .abd_checksum_sha256+0x44/0x80 [zfs]
[1673143.916673] [c00000062d77f5b0] [d000000006dfa3a0] .chksum_run.isra.0+0xbc/0x114 [zfs]
[1673143.916932] [c00000062d77f6b0] [d000000006dfa480] .chksum_benchit+0x88/0x1c8 [zfs]
[1673143.917185] [c00000062d77f740] [d000000006dfa810] .chksum_init+0x250/0x574 [zfs]
[1673143.917440] [c00000062d77f810] [d000000006dbd01c] .spa_init+0x160/0x210 [zfs]
[1673143.917690] [c00000062d77f8b0] [d000000006e0a960] .zfs_kmod_init+0x28/0xed0 [zfs]
[1673143.917941] [c00000062d77f970] [d000000006e5b664] .openzfs_init+0x48/0xfc [zfs]
[1673143.918029] [c00000062d77f9f0] [c00000000000ae88] .do_one_initcall+0x140/0x1dc
[1673143.918114] [c00000062d77fae0] [c0000000005c8634] .do_init_module+0x70/0x1e4
[1673143.918199] [c00000062d77fb80] [c000000000113820] .load_module+0x1b88/0x2054
[1673143.918282] [c00000062d77fd30] [c000000000113e5c] .SyS_finit_module+0x6c/0x74
[1673143.918367] [c00000062d77fe30] [c000000000009360] system_call+0x38/0xd8
[1673143.918446] Instruction dump:
[1673143.918499] 60000000 60000000 60000000 60000000 60000000 60000000 60000000 60000000
[1673143.918638] f821fe81 7d0802a6 394000cf 396000df <7f0a09ce> 394a0020 7d8042a6 7f2b09ce
[1673143.921799] ---[ end trace 2f19b717bc9d3c92 ]---
[1673143.926650]
@Low-power - can you modify the zfs_sha256_power8() macro ?
I did the ppc testings only in userspace on machines of the gcc farm.
But I got fresh access to ppc machines here: https://osuosl.org/services/powerdev/ ... but I am not ready for testings there.
Edit: the this macro checks for power8: zfs_isa207_available() .... Power7 (2.06) should fail normally.
Edit2: my fault :/ --> the zfs_isa207_available() was not used
Can you try this patch:
diff --git a/module/icp/algs/sha2/sha256_impl.c b/module/icp/algs/sha2/sha256_impl.c
index 024cfb1e4..63f181d2e 100644
--- a/module/icp/algs/sha2/sha256_impl.c
+++ b/module/icp/algs/sha2/sha256_impl.c
@@ -141,9 +141,9 @@ const sha256_ops_t sha256_armv8_impl = {
};
#elif defined(__PPC64__)
-static boolean_t sha256_have_vsx(void)
+static boolean_t sha256_have_isa207(void)
{
- return (kfpu_allowed() && zfs_vsx_available());
+ return (kfpu_allowed() && zfs_isa207_available());
}
TF(zfs_sha256_ppc, tf_sha256_ppc);
@@ -155,7 +155,7 @@ const sha256_ops_t sha256_ppc_impl = {
TF(zfs_sha256_power8, tf_sha256_power8);
const sha256_ops_t sha256_power8_impl = {
- .is_supported = sha256_have_vsx,
+ .is_supported = sha256_have_isa207,
.transform = tf_sha256_power8,
.name = "power8"
};
I would provide an official PR when you say the kernel oops is away.
I would provide an official PR when you say the kernel oops is away.
Thanks for the patch, I will try it after a fresh reboot on next Monday.
I just tested it; the fix however is incomplete, icp/algs/sha2/sha512_impl.c needs a similar fix too.
[ 517.748405] zfs: module license 'CDDL' taints kernel.
[ 517.748477] Disabling lock debugging due to kernel taint
[ 518.885693] Unrecoverable VMX/Altivec Unavailable Exception f20 at d000000008d43f10
[ 518.885782] Oops: Unrecoverable VMX/Altivec Unavailable Exception, sig: 6 [#1]
[ 518.885857] SMP NR_CPUS=128 NUMA pSeries
[ 518.885955] Modules linked in: zfs(PO+) spl(O) cpufreq_stats bridge stp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc pl2303 snd_usb_audio snd_usbmidi_lib snd_hwdep cp210x snd_pcm snd_timer snd_rawmidi snd_seq_device usbserial snd soundcore evdev uio_pdrv_genirq uio xt_tcpudp iptable_filter fuse ip_tables x_tables autofs4 ext4 crc16 jbd2 mbcache crc32c_generic btrfs xor raid6_pq sg sr_mod sd_mod ses cdrom enclosure hid_generic usbhid hid ohci_pci ohci_hcd ehci_pci ehci_hcd igb ptp pps_core usbcore ipr usb_common libata ehea
[ 518.887859] CPU: 0 PID: 6499 Comm: insmod Tainted: P O 4.1.42-rivoreo-powerpc64-largepage #22
[ 518.887951] task: c0000007b33c2670 ti: c0000007ad480000 task.ti: c0000007ad480000
[ 518.888028] NIP: d000000008d43f10 LR: d000000008b4eccc CTR: d000000008b4ecbc
[ 518.888101] REGS: c0000007ad482db0 TRAP: 0f20 Tainted: P O (4.1.42-rivoreo-powerpc64-largepage)
[ 518.888194] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 44002424 XER: 00000000
[ 518.888474] CFAR: d000000008b4ecc8 SOFTE: 1
GPR00: d000000008b4df04 c0000007ad483030 d000000008e596a0 c0000007ad4834b0
GPR04: d00000000bb40000 0000000000000008 0000000000000000 0000000000000001
GPR08: d000000008b4eccc d000000008e31a08 00000000000000cf 00000000000000df
GPR12: 0000000024008424 c00000000ee40000 0000000000000000 c000000000872a40
GPR16: c0000007ad483c58 d0000000084ed350 0000000000000124 0000000000000000
GPR20: c000000000617d50 00000078bd68f26d 00000000000f423f d000000008d79398
GPR24: 0000000000000400 c0000007ad483500 0000000000002000 d00000000bb40400
GPR28: c0000007ad4834b0 0000000000002000 0000000000000000 0000000000000000
[ 518.889999] NIP [d000000008d43f10] .zfs_sha512_power8+0x10/0xfc0 [zfs]
[ 518.890210] LR [d000000008b4eccc] .tf_sha512_power8+0x10/0x24 [zfs]
[ 518.890279] Call Trace:
[ 518.890323] [c0000007ad483030] [c0000007ad4830d0] 0xc0000007ad4830d0 (unreliable)
[ 518.890438] [c0000007ad4831b0] [c00000000084dd00] 0xc00000000084dd00
[ 518.890662] [c0000007ad483220] [d000000008b4df04] .sha512_update+0xd4/0x108 [zfs]
[ 518.890911] [c0000007ad4832e0] [d000000008c91050] .sha_incremental+0x24/0x3c [zfs]
[ 518.891153] [c0000007ad483350] [d000000008c0a1c0] .abd_iterate_func+0xec/0x1a0 [zfs]
[ 518.891403] [c0000007ad483440] [d000000008c9112c] .abd_checksum_sha512_native+0x44/0x60 [zfs]
[ 518.891657] [c0000007ad4835b0] [d000000008cea3a0] .chksum_run.isra.0+0xbc/0x114 [zfs]
[ 518.891913] [c0000007ad4836b0] [d000000008cea480] .chksum_benchit+0x88/0x1c8 [zfs]
[ 518.892160] [c0000007ad483740] [d000000008cea934] .chksum_init+0x374/0x574 [zfs]
[ 518.892408] [c0000007ad483810] [d000000008cad01c] .spa_init+0x160/0x210 [zfs]
[ 518.892653] [c0000007ad4838b0] [d000000008cfa960] .zfs_kmod_init+0x28/0xed0 [zfs]
[ 518.892897] [c0000007ad483970] [d000000008d4b664] .openzfs_init+0x48/0xfc [zfs]
[ 518.892992] [c0000007ad4839f0] [c00000000000ae88] .do_one_initcall+0x140/0x1dc
[ 518.893083] [c0000007ad483ae0] [c0000000005c8634] .do_init_module+0x70/0x1e4
[ 518.893173] [c0000007ad483b80] [c000000000113820] .load_module+0x1b88/0x2054
[ 518.893263] [c0000007ad483d30] [c000000000113e5c] .SyS_finit_module+0x6c/0x74
[ 518.893357] [c0000007ad483e30] [c000000000009360] system_call+0x38/0xd8
[ 518.893442] Instruction dump:
[ 518.893528] 00010203 04050607 10111213 10111213 00010203 04050607 08090a0b 10111213
[ 518.893804] f821fe81 7d0802a6 394000cf 396000df <7f0a09ce> 394a0020 7d8042a6 7f2b09ce
[ 518.894081] ---[ end trace f273ebecb440dc69 ]---
OK, so changes to compile on macOS in shared files:
https://github.com/openzfsonosx/openzfs-fork/commit/256294a5c163ed3866ab98e51c79b17405095715 and changes to macOS files:
https://github.com/openzfsonosx/openzfs-fork/commit/2f12e628e72fb65efe85e9b16d2dc487f789d1a2
adding aesv8
https://github.com/openzfsonosx/openzfs-fork/commit/ae7a8aca06ace75fe37ab4a13a300b5935c645d2