cpu_features
Similarities to other CPU feature projects (deduplicate?)
ART on Android works around quirks and bugs in /proc/cpuinfo and the aux vector. It supports cross compilation and detects features via the C preprocessor, CPUID instructions, and undefined instruction exceptions, on MIPS, ARM, Intel 32 and 64, etc.: https://android.googlesource.com/platform/art/+/master/runtime/arch/instruction_set_features.h#36
Android NDK has something simple: https://developer.android.com/ndk/guides/cpu-features.html
V8 CPU feature detection: https://github.com/v8/v8/blob/master/src/base/cpu.h#L32
Since there are OS/CPU bugs that cause issues, standardizing on one library makes sense. As an author of the ART code, I have a bias :-)
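To make two of the detection routes mentioned above concrete, here is a minimal sketch in C of a compile-time check via the preprocessor and a runtime check via the CPUID instruction, using SSE4.1 as the example feature. It is not taken from ART, the NDK, or V8. The function names are hypothetical, and it assumes GCC or Clang, which provide `<cpuid.h>` and define `__SSE4_1__` when the compiler is allowed to emit SSE4.1 code.

```c
#include <assert.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h> /* __get_cpuid: GCC/Clang helper around the CPUID instruction */
#endif

/* Compile-time route: 1 if the compiler was told it may emit SSE4.1
   (e.g. -msse4.1), 0 otherwise. Says nothing about the machine the
   binary eventually runs on. Hypothetical helper name. */
int has_sse4_1_compile_time(void) {
#if defined(__SSE4_1__)
  return 1;
#else
  return 0;
#endif
}

/* Runtime route: 1 or 0 on x86 according to CPUID leaf 1, ECX bit 19
   (SSE4.1); -1 on other architectures, where CPUID does not exist.
   Hypothetical helper name. */
int has_sse4_1_runtime(void) {
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
    return 0; /* leaf 1 not supported by this CPU */
  }
  return (int)((ecx >> 19) & 1u);
#else
  return -1;
#endif
}

/* Combined answer: trust the compile-time flag if set, else ask the CPU. */
int has_sse4_1(void) {
  int r = has_sse4_1_compile_time() ? 1 : has_sse4_1_runtime();
  assert(r >= -1 && r <= 1);
  return r;
}
```

A caller would typically evaluate `has_sse4_1()` once at startup and pick a code path from the result. The quirk workarounds and undefined-instruction probing mentioned above are deliberately left out of this sketch.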
Thx for reaching out @captain5050 !
cpu_features was literally born from the fact that many libraries code their own detection routine (libwebp, libyuv, Android NDK, OpenCV, to name a few). It seems simple enough at first sight, but you learn along the way how hard it is to get right.
This library was designed together with the author of the NDK library you're referring to; the end goal was to replace it entirely with this library. It supports all of what you described for ART except the "undefined instruction exceptions" (additions welcome!). We also have a list of fixes for corrupted kernels, and fallbacks for reading the aux vector via getauxval, /proc/self/auxv, and /proc/cpuinfo.
Last but not least, this library is usable outside the Android ecosystem.
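As an illustration of the fallback idea described above, the following is a minimal sketch in C, assuming Linux with glibc 2.16 or newer. It is not the cpu_features implementation. The function names, the buffer size, and the order of the fallbacks are assumptions made for the example. It tries `getauxval(AT_HWCAP)` first, then parses `/proc/self/auxv`, which holds (type, value) pairs of native word size terminated by `AT_NULL`.

```c
#include <assert.h>
#include <elf.h>      /* AT_NULL, AT_HWCAP */
#include <stddef.h>
#include <stdio.h>
#include <sys/auxv.h> /* getauxval (glibc >= 2.16) */

/* Scan n_pairs (type, value) pairs for `type`, stopping at AT_NULL.
   Returns 1 and stores the value in *out if found, 0 otherwise.
   Hypothetical helper name. */
int find_auxv_entry(const unsigned long *pairs, size_t n_pairs,
                    unsigned long type, unsigned long *out) {
  assert(pairs != NULL && out != NULL);
  for (size_t i = 0; i < n_pairs; ++i) {
    if (pairs[2 * i] == AT_NULL) break; /* end-of-vector marker */
    if (pairs[2 * i] == type) {
      *out = pairs[2 * i + 1];
      return 1;
    }
  }
  return 0;
}

/* Read AT_HWCAP from /proc/self/auxv. Returns 1 on success, 0 if the
   file is unreadable or has no such entry. The 64-pair buffer is an
   assumption that comfortably fits typical aux vectors. */
int hwcap_from_proc_auxv(unsigned long *out) {
  unsigned long buf[128];
  FILE *f = fopen("/proc/self/auxv", "rb");
  if (f == NULL) return 0;
  size_t n_words = fread(buf, sizeof buf[0], 128, f);
  fclose(f);
  return find_auxv_entry(buf, n_words / 2, AT_HWCAP, out);
}

/* Fallback chain: getauxval, then /proc/self/auxv. Returns 0 if both
   fail, at which point a caller would go on to parse /proc/cpuinfo
   (not shown here). */
unsigned long get_hwcap(void) {
  unsigned long v = getauxval(AT_HWCAP);
  if (v != 0) return v;
  if (hwcap_from_proc_auxv(&v)) return v;
  return 0;
}
```

Keeping the pair scan separate from the file read means it can be exercised with a hand-built array, without depending on what the running kernel exposes. The meaning of the returned bits is architecture specific, so decoding them is left to the caller.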
cpuid2cpuflags, although built for a specific use case, shares some similarities as well.
For x86, I can throw my hat into the ring.
From my benchmarks, my library is a tick faster than cpu_features when detecting sse4_1 (roughly 7x):
Run on (4 X 3600 MHz CPU s)
2018-05-14 17:37:29
***WARNING*** CPU scaling is enabled, the benchmark real time ...
--------------------------------------------------------------
Benchmark                     Time           CPU   Iterations
--------------------------------------------------------------
BM_compass_sse4_1            31 ns         31 ns     22705074
BM_cpu_features_sse4_1      242 ns        241 ns      2870098
@psteinb I suppose you were talking about https://github.com/psteinb/compass, right, and not about https://github.com/psteinb/deeprace?
Can you compare the implementations to understand where "tick faster" comes from?