quick-bench-front-end
Feature request - support for passing compiler options
Thanks for the great project. It would be great if we could pass compiler options for our benchmark like gcc.godbolt.org. Right now I want to run my benchmark with -mavx2 and I can't find a way to do that.
Hi, I thought about it, but the problem with passing compiler options like that is that it would require some serious sanitizing, to avoid injections. I am not very clear on how I could do it. I will have a look at godbolt's solution and see if I can use it.
Yeah, looking at godbolt's solution would be my suggestion too.
An alternative might be to have a few selectable options like -march=native etc, like you do for optimization level.
Thanks!
On Mon, Aug 28, 2017 at 11:30 AM, Fred Tingaud wrote:
> Hi, I thought about it, but the problem with passing compiler options like that is that it would require some serious sanitizing, to avoid injections. I am not very clear on how I could do it. I will have a look at godbolt's solution and see if I can use it.
Yes, being able to select architecture would help in benchmarking SIMD intrinsic code!
Hi, a -march option would be extremely useful. I am benchmarking SIMD instructions and autovectorization.
Hi, It is a highly desirable feature!
Hi, Quick Bench runs on AWS, which doesn't guarantee any particular architecture for the kind of machines the project can afford. Thus it would not be possible to consistently run tests compiled with a given -march flag.
Well, like somebody in the quick-bench-back-end issue already pointed out, this is not completely true.
You do have certain guarantees: https://aws.amazon.com/ec2/instance-types/
Instance Type | CPU Architecture |
---|---|
T2 | Intel Xeon (but does not say which one) |
M4 | 2.3 GHz Intel Xeon E5-2686 v4 (Broadwell) or 2.4 GHz Intel Xeon E5-2676 v3 (Haswell) |
M5 | 2.5 GHz Intel Xeon Platinum 8175 with AVX-512 |
So the minimum guaranteed is Haswell which means from the hardware side you have support for MMX, AES-NI, CLMUL, FMA3, SSE, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, AVX up to AVX2. Broadwell did not change that, only with Skylake would we get AVX-512.
They do have a footnote saying
"AVX, AVX2 are only available on instances launched with HVM AMIs."
so not sure what is up with that. You could include a runtime check for that. But either way up until SSE4.2 you are fine. This would really make quickbench so much more useful to me!
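The runtime check mentioned above could look roughly like the following. This is only a minimal sketch, assuming GCC or Clang on an x86 target; `SimdSupport` and `detect_simd` are hypothetical names made up for illustration and are not part of Quick Bench.

```cpp
// Sketch: ask the CPU at runtime which SIMD extensions it supports.
// __builtin_cpu_init / __builtin_cpu_supports are GCC/Clang builtins
// for x86; on other targets this sketch simply reports "nothing".
struct SimdSupport {  // hypothetical helper type
    bool sse42;
    bool avx;
    bool avx2;
    bool fma;
};

SimdSupport detect_simd() {  // hypothetical helper name
    SimdSupport s = {false, false, false, false};
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  // harmless if the CPU data is already initialized
    // The argument must be a string literal, so each feature is queried explicitly.
    s.sse42 = __builtin_cpu_supports("sse4.2") != 0;
    s.avx   = __builtin_cpu_supports("avx") != 0;
    s.avx2  = __builtin_cpu_supports("avx2") != 0;
    s.fma   = __builtin_cpu_supports("fma") != 0;
#endif
    return s;
}
```

A benchmark could call `detect_simd()` once and skip (or fall back from) the AVX2 path when `avx2` is false, instead of crashing with an illegal instruction.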
There is a workaround: if you use __attribute__ ((target("fma"), optimize("-ffast-math"))) you can force GCC to change the target processor per function and use SSE, e.g.: http://quick-bench.com/SBnGN_2uHuwiFi5nIzekFVKK3ic
One important caveat is that the test function needs the same attribute, otherwise GCC will not inline it.
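For readers who cannot open the link, here is a minimal sketch of the same idea. It is not the code behind the link: it uses target("avx2") instead of the fma / fast-math combination, all function and macro names are hypothetical, and it assumes GCC or Clang on x86. The helper and its caller carry the same attribute so the helper can still be inlined, and a runtime check guards the AVX2 path.

```cpp
#include <cstddef>

// Per-function target attribute; expands to nothing on non-x86 targets.
#if defined(__x86_64__) || defined(__i386__)
#define BENCH_TARGET_AVX2 __attribute__((target("avx2")))
#else
#define BENCH_TARGET_AVX2
#endif

// Helper with the same target attribute as its caller, so the compiler
// is allowed to inline it into the AVX2 function below.
BENCH_TARGET_AVX2 static inline float square(float x) { return x * x; }

// This function may be compiled with AVX2 instructions even though the
// rest of the translation unit is built for the default architecture.
BENCH_TARGET_AVX2
float sum_squares_avx2(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += square(data[i]);
    return total;
}

// Same computation, built for the default target.
float sum_squares_baseline(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i] * data[i];
    return total;
}

// Use the AVX2 version only when the CPU reports support for it.
float sum_squares(const float* data, std::size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_squares_avx2(data, n);
#endif
    return sum_squares_baseline(data, n);
}
```

In a benchmark, both `sum_squares_avx2` and `sum_squares_baseline` could be timed side by side to see what the per-function target buys.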
Awesome, thank you @Yankes - that seems to work.
here are links to the documentation of the target attribute: https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/x86-Function-Attributes.html#x86-Function-Attributes https://clang.llvm.org/docs/AttributeReference.html#target-gnu-target
and here are the most relevant values:
- sse3
- ssse3
- sse4
- sse4.1
- sse4.2
- sse4a
- fma4
- avx
Thanks for your comments! Right now, I am super busy and can't really experiment with this, but I'll try to check that next time I work on an update.
~Support for -Ofast (and perhaps -Og) among the optim options would be valuable as well.~
Edit: Added in https://github.com/FredTingaud/quick-bench-front-end/commit/5bd1fe1bc8131b435cac0a88c53ce00ba851d29f -- thank you!
As far as I see the topic is pretty old, but still relevant. Is there any chance of changes? Thanks.
You could always check a preset list of compiler flags.
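As a rough illustration of the preset-list idea, the sketch below accepts a user-supplied option string only if every whitespace-separated token is on a fixed allowlist. It is an assumption about how such a check might look, not how Quick Bench or godbolt actually handle it; `flags_allowed` and the particular list of flags are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical allowlist; a real service would choose its own set.
static const char* const kAllowedFlags[] = {
    "-O0", "-O1", "-O2", "-O3", "-Ofast",
    "-msse4.2", "-mavx", "-mavx2", "-mfma", "-mbmi2",
};

// Returns true only if every whitespace-separated token in user_input
// matches an allowlisted flag exactly. Anything else (shell
// metacharacters, unknown flags, paths) is rejected outright rather
// than sanitized.
bool flags_allowed(const std::string& user_input) {
    std::istringstream tokens(user_input);
    std::string token;
    while (tokens >> token) {
        bool found = false;
        for (const char* allowed : kAllowedFlags) {
            if (token == allowed) {
                found = true;
                break;
            }
        }
        if (!found) return false;
    }
    return true;  // an empty input contains no disallowed tokens
}
```

Because tokens are compared for exact equality, something like `-mavx2;` or `-mavx2 -o /tmp/x` is refused without any attempt to parse it.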
It would be very helpful to be able to benchmark intrinsics (e.g. -mbmi2), otherwise we can't really test peak performance.