Jan Wassenberg
A quick follow-up: TableLookupBytes has the quirk of staying within 128-bit blocks. But the AVX-512 operations here support full permutes across all vector lanes, same as SVE and RVV. Should...
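To illustrate the distinction drawn in the comment above, here is a minimal scalar sketch, not Highway code, that contrasts a byte lookup confined to 128-bit (16-byte) blocks with a full cross-lane permute. The function names `BlockwiseLookup` and `FullLookup` are hypothetical, and using only the low 4 bits of each index in the blockwise case is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a lookup that stays within 16-byte blocks:
// output byte i can only select from the block of `table` that contains i.
template <size_t N>
std::array<uint8_t, N> BlockwiseLookup(const std::array<uint8_t, N>& table,
                                       const std::array<uint8_t, N>& idx) {
  static_assert(N % 16 == 0, "vector size must be a multiple of 16 bytes");
  std::array<uint8_t, N> out{};
  for (size_t i = 0; i < N; ++i) {
    const size_t block_start = (i / 16) * 16;
    // Assumption for this sketch: only the low 4 bits of the index are used.
    out[i] = table[block_start + (idx[i] & 15)];
  }
  return out;
}

// Hypothetical scalar model of a full permute: any output byte may select
// any byte of the whole vector.
template <size_t N>
std::array<uint8_t, N> FullLookup(const std::array<uint8_t, N>& table,
                                  const std::array<uint8_t, N>& idx) {
  std::array<uint8_t, N> out{};
  for (size_t i = 0; i < N; ++i) {
    assert(idx[i] < N);  // indices must address a byte inside the vector
    out[i] = table[idx[i]];
  }
  return out;
}
```

With a 32-byte vector and every index set to 3, the blockwise version returns byte 3 of the first block for the first 16 outputs and byte 3 of the second block (overall byte 19) for the rest, whereas the full version returns overall byte 3 everywhere.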
The error seems to be that we are using __fp16 although it is not supported by the compiler. The code governing this decision is `#if ((HWY_ARCH_ARM_A64 || (__ARM_FP & 2))...
An update here for the record, related work ongoing in #1017. @malaterre has updated our timer to use rdtime instead of rdcycle, though apparently newer Linux will once again allow...
Do I understand correctly that the issue is that we do have `getauxval`, but not the bit definitions macro for the V extension? If so, it seems that the patch...
Closing, please feel free to reopen if you'd like to discuss further :)
Thanks, Andreas, for opening the issue on the HWH repo. Would be nice if the article were published again. Can you help create a new one? That would work better...
Closing, feel free to reopen if you're interested :)
Closing, feel free to reopen if you'd like to continue the discussion.
Thank you @johnplatts for implementing these!
I'm updating the wishlist (now that we have one) with Div/Rem. @kfjahnke did you want to send a pull request for your atan2 implementation?