Nathan Moinvaziri
@KungFuJesus on macOS, I use Visual Studio Code with the CMake extension and Ninja. It works just as well as Xcode.
You may not need to even compile their code. Just give AI the original source, ask it to check for any optimizations not in ours, and then apply them to...
I have been looking at some of the changes in inftrees.c:
```diff
diff --git a/inftrees.c b/inftrees.c
index 5234fe7a..6a71aeb8 100644
--- a/inftrees.c
+++ b/inftrees.c
@@ -7,6 +7,14 @@
 #include "zutil.h"
 #include...
```
@KungFuJesus I tried getting some of their inflate_fast changes working, but couldn't get them to work across all our tests, and the stuff I could get working didn't make...
There were some bugs when using init crc in crc32_pclmulqdq... but there appear to be more.. :-(
Microsoft Surface Book 4 with AVX512 (although this tests only pclmulqdq, the base code is the same). Benchmarks are hard to get consistent on Windows, but I post them just...
I do have plans in a future PR to speed up the final Barrett reduction using the method described in Mattermost.
You should split these into separate PRs
> I'm not sure how that compares to what you're using or what you were asking. I think my AI was cooked. Today, I pushed back on its estimate and...
I think if I were to implement SIMD, I would need a `functable` function, right?
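For anyone following along, here is a minimal sketch of the general function-table pattern the question refers to. It is not the project's actual `functable` code: every name in it (`checksum_fn`, `checksum_c`, `checksum_simd`, `cpu_has_simd`, `checksum_stub`) is hypothetical, the "SIMD" variant is just an unrolled loop standing in for real intrinsics, and the feature check is a stub that always returns 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature shared by every variant of the routine. */
typedef uint32_t (*checksum_fn)(uint32_t seed, const uint8_t *buf, size_t len);

/* Portable fallback: a plain byte sum standing in for the real work. */
static uint32_t checksum_c(uint32_t seed, const uint8_t *buf, size_t len) {
    for (size_t i = 0; i < len; i++)
        seed += buf[i];
    return seed;
}

/* Stand-in for a SIMD variant (assumption: a real one would use intrinsics).
 * It must produce exactly the same result as the portable version. */
static uint32_t checksum_simd(uint32_t seed, const uint8_t *buf, size_t len) {
    size_t i = 0;
    for (; i + 4 <= len; i += 4)
        seed += (uint32_t)buf[i] + buf[i + 1] + buf[i + 2] + buf[i + 3];
    for (; i < len; i++)
        seed += buf[i];
    return seed;
}

/* Hypothetical CPU feature probe; a real one would query CPUID or similar. */
static int cpu_has_simd(void) {
    return 0;
}

static uint32_t checksum_stub(uint32_t seed, const uint8_t *buf, size_t len);

/* The table starts out pointing at the stub, so callers never see NULL. */
struct functable_s {
    checksum_fn checksum;
};
static struct functable_s functable = { checksum_stub };

/* First call: pick the best variant once, patch the table, then forward. */
static uint32_t checksum_stub(uint32_t seed, const uint8_t *buf, size_t len) {
    functable.checksum = cpu_has_simd() ? checksum_simd : checksum_c;
    return functable.checksum(seed, buf, len);
}
```

Callers always go through `functable.checksum(seed, buf, len)`. The first call pays for the feature check; later calls jump straight to the selected variant. Adding a new SIMD implementation then means adding one more branch in the stub, not touching call sites.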