fastscancount

Leo's populate_hits_avx

Open • lemire opened this issue 6 years ago • 1 comment

I took Leo's PR and reverted the change in cache size to isolate the effect of his new populate_hits_avx. (It is hard to reason about multiple changes at once.)

Before (master)...

Trial 1...

AVX2-based scancount
2.78173 cycles/element 
2.41643 instructions/cycles 
0.00283376 miss/element 
Elems per millisecond:
fastscancount_avx2: 1.26256e+06

Trial 2...

AVX2-based scancount
2.75127 cycles/element 
2.44319 instructions/cycles 
0.00283133 miss/element 
Elems per millisecond:
fastscancount_avx2: 1.26693e+06

After (merging this PR)...

Trial 1...

AVX2-based scancount
3.02782 cycles/element 
2.36966 instructions/cycles 
0.00274635 miss/element 
Elems per millisecond:
fastscancount_avx2: 1.17293e+06

Trial 2...

AVX2-based scancount
3.02123 cycles/element 
2.37482 instructions/cycles 
0.00278034 miss/element 
Elems per millisecond:
fastscancount_avx2: 1.17274e+06

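As you can see, I observe a performance regression with this PR: about 3.02 cycles/element versus about 2.77 on master, and throughput drops from about 1.26e+06 to about 1.17e+06 elements per millisecond.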
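
To put a rough number on the difference, here is a small sketch that averages the two trials from each run and reports the relative change. It is not part of the benchmark harness; the variable and function names are illustrative, and the figures are copied from the output above. With only two trials per configuration, treat the result as indicative rather than statistically rigorous.

```python
# Figures copied from the trials reported above (two runs each).
before_cycles = [2.78173, 2.75127]    # cycles/element on master
after_cycles = [3.02782, 3.02123]     # cycles/element with the PR merged
before_rate = [1.26256e6, 1.26693e6]  # elements per millisecond on master
after_rate = [1.17293e6, 1.17274e6]   # elements per millisecond with the PR


def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def relative_change(before, after):
    """Change of the mean, expressed as a fraction of the 'before' mean."""
    return (mean(after) - mean(before)) / mean(before)


cycles_change = relative_change(before_cycles, after_cycles)
rate_change = relative_change(before_rate, after_rate)

# A positive change in cycles/element and a negative change in
# elements/ms both indicate a slowdown.
print(f"cycles/element: {cycles_change:+.1%}")
print(f"elements/ms:    {rate_change:+.1%}")
```

The two metrics do not move by exactly the same percentage because one is the reciprocal of the other up to a constant, and they are measured separately.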

lemire, Sep 30 '19 15:09

cc @searchivarius

lemire, Sep 30 '19 15:09