
Benchmarking program for unaligned accesses

Open nemequ opened this issue 8 years ago • 4 comments

We need a program to benchmark the different methods for the unaligned module. Shouldn't be difficult now that the clock module is in reasonable shape…

@Cyan4973, did you do anything special for that blog post, or just benchmarking xxhash? If the former, I don't suppose you still have the code sitting around somewhere (and would be willing to share it)?
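
A minimal sketch of what such a benchmark might look like, assuming plain C99 and the standard `clock()` instead of the psnip clock module. All names here (`load32_memcpy`, `load32_bytes`, `time_loads`, `bench_report`) are hypothetical and not part of portable-snippets. It times two common ways of doing an unaligned 32-bit load at each byte offset:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Method 1: memcpy into a local; yields the value in native byte order. */
static uint32_t load32_memcpy(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Method 2: assemble the value byte by byte, in little-endian order. */
static uint32_t load32_bytes(const void *p) {
    const unsigned char *b = (const unsigned char *)p;
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

typedef uint32_t (*load32_fn)(const void *);

/* Time `rounds` passes over buf, loading 32-bit words starting at `offset`.
 * The sum of all loaded words is stored in *sink so the loads are not
 * optimized away. Returns elapsed CPU time in seconds. */
static double time_loads(load32_fn load, const unsigned char *buf, size_t len,
                         size_t offset, unsigned rounds, uint32_t *sink) {
    assert(len >= sizeof(uint32_t));
    uint32_t acc = 0;
    clock_t start = clock();
    for (unsigned r = 0; r < rounds; r++)
        for (size_t i = offset; i + sizeof(uint32_t) <= len; i += sizeof(uint32_t))
            acc += load(buf + i);
    clock_t end = clock();
    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Print timings for both methods at byte offsets 0..3. */
static void bench_report(void) {
    static unsigned char buf[1 << 16];
    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)(i * 31u + 7u);
    for (size_t off = 0; off < 4; off++) {
        uint32_t s1, s2;
        double t1 = time_loads(load32_memcpy, buf, sizeof buf, off, 2000, &s1);
        double t2 = time_loads(load32_bytes, buf, sizeof buf, off, 2000, &s2);
        printf("offset %zu: memcpy %.4fs (sum %lu), bytes %.4fs (sum %lu)\n",
               off, t1, (unsigned long)s1, t2, (unsigned long)s2);
    }
}
```

Calling `bench_report()` from `main` prints one line per offset. The function-pointer call probably prevents inlining, so the numbers include call overhead; a real benchmark would likely want each method compiled into its own loop, and a higher-resolution timer than `clock()`.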

nemequ • Mar 08 '17 21:03

I guess I just used the internal benchmark module of xxhsum and lz4 (command -b).

Cyan4973 • Mar 08 '17 21:03

If you're on Linux x86, you can also consider uarch-bench to test the "raw" performance of loads/stores of various sizes and misalignments, perhaps as a baseline to compare to the psnip versions. It measures all 64-byte alignments, and the results align with what we know from published performance and optimization manuals. It uses small snippets of asm for the actual test code, which is the only thing that would need to be ported to make it work on other archs.

travisdowns • Jul 30 '17 21:07

x86, and especially Linux, are pretty well tested. So is ARM; this issue is really aimed at more exotic architectures and compilers.

uarch-bench looks very cool, though; could be useful for SIMDe.

nemequ • Jul 31 '17 19:07

Right, that makes sense. I do want to support other mainstream archs in uarch-bench, but that probably mostly just means ARM.

travisdowns • Aug 01 '17 02:08