
Investigate sum_3_azip's performance

Open · bluss opened this issue 5 years ago · 1 comment

The benchmark sum_3_azip seems to perform abysmally compared with the equivalent sum_3_azip_fold. Investigate why, and why the former doesn't autovectorize like the latter.

bluss, Nov 23 '18 22:11

The difference is minimal with current master and nightly:

test sum_3_azip                        ... bench:         825 ns/iter (+/- 130)
test sum_3_azip_fold                   ... bench:         820 ns/iter (+/- 105)

mati865, Mar 06 '22 23:03
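
For readers without the benchmark source at hand, here is a rough sketch of the two loop shapes being compared. It uses only standard-library iterators, not ndarray's azip! or Zip, so it is an analogy rather than the actual benchmark code. The function names sum_3_for_each and sum_3_fold are hypothetical, and which form the real benchmarks correspond to is an assumption based on their names: one accumulates into a captured mutable variable, the other threads the accumulator through a fold.

```rust
/// Sums three slices element-wise by mutating a captured accumulator.
/// Assumed analogue of the side-effecting closure style (not ndarray code).
fn sum_3_for_each(a: &[i32], b: &[i32], c: &[i32]) -> i32 {
    let mut s = 0;
    a.iter()
        .zip(b)
        .zip(c)
        .for_each(|((&x, &y), &z)| s += x + y + z);
    s
}

/// Sums three slices element-wise by passing the accumulator through `fold`.
/// Assumed analogue of the fold style (not ndarray code).
fn sum_3_fold(a: &[i32], b: &[i32], c: &[i32]) -> i32 {
    a.iter()
        .zip(b)
        .zip(c)
        .fold(0, |acc, ((&x, &y), &z)| acc + x + y + z)
}
```

Both functions compute the same value, so any timing gap between such forms comes from how the compiler optimizes the loop, not from the arithmetic. Whether either one autovectorizes depends on the compiler version and optimization flags.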