Christopher Sidebottom
> Looks like non-MSVC compilers are good so far, MSVC requires an extra line at the end of the test, added comment :) Argh, sorry, I did look at the...
@alenkacz any update on this? Looks like it was good to merge last year?
> This is another patch demonstrating how the current NumPy SIMD code could be converted to Highway, similar to #25781. All tests pass on my local AVX512 and AVX2 machine....
cc @jan-wassenberg because the review box doesn't seem to want me to request review that way 🙀
Thanks for the effort here @sterrettm2. I benchmarked this and got some troubling results:
```
| Change   | Before [f3cca787]          | After [db31dce1]    | Ratio   | Benchmark (Parameter)                                                             |
|----------|----------------------------|---------------------|---------|-----------------------------------------------------------------------------------|
...
```
@jan-wassenberg this looks like we might have a regression in Highway: rolling back to https://github.com/google/highway/commit/f5258670685efb8f4e4e74be38e20a738e6104a9 improves the gather performance, whereas with current `HEAD` I'm seeing the 3x slower...
@jan-wassenberg see https://github.com/google/highway/issues/2382
Overall, this seems good; I'll try to bump the Highway version once the AArch64 fix has landed.
Thanks @sterrettm2!
Hi @dnoan, that is unfortunate. What's interesting is that many of these are `N=1`, which means they should go through GEMV, not GEMM: https://github.com/OpenMathLib/OpenBLAS/blob/8483a71169bac112db133d45b39d4def812f81b6/interface/gemm.c#L501-L545 Whilst the patch you're indicating is...
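To illustrate the `N=1` point, here is a minimal pure-Python sketch of why a matrix-matrix product whose right-hand operand has a single column is mathematically the same as a matrix-vector product. The helper names `gemm` and `gemv` are hypothetical stand-ins for illustration only; this is not OpenBLAS's dispatch logic or its API, just the arithmetic equivalence that makes forwarding to GEMV possible.

```python
def gemm(a, b):
    """Naive C = A @ B, with A as an M x K and B as a K x N list of lists."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def gemv(a, x):
    """Naive y = A @ x, with A as an M x K list of lists and x of length K."""
    return [sum(a_ip * x_p for a_ip, x_p in zip(row, x)) for row in a]


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # M = 2, K = 3
x = [1.0, 0.5, -1.0]           # vector of length K
b = [[v] for v in x]           # the same data as a K x 1 matrix, i.e. N = 1

c = gemm(a, b)                 # M x 1 result
y = gemv(a, x)                 # length-M result

# The single column of C holds exactly the GEMV result.
assert [row[0] for row in c] == y
```

Since the results agree, a library is free to route the `N=1` case to a matrix-vector kernel; whether and when it does so depends on the library's own conditions.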