cpuid-dump should iterate through all CPUs
(This PR also includes https://github.com/pytorch/cpuinfo/pull/349, but I can factor that out if preferred.)
As Intel puts it: "CPUID, by design, returns different values depending on the core it is executed on" (see https://www.intel.com/content/www/us/en/developer/articles/guide/12th-gen-intel-core-processor-gamedev-guide.html#inpage-nav-1-5-2:~:text=CPUID%2C%20by%20design%2C%20returns%20different%20values%20depending%20on%20the%20core%20it%20is%20executed%20on)
In particular, leaves 1, 4, 0x0b, 0x1a, and 0x1f are known to vary by core. Leaf 0x1a distinguishes core types on hybrid CPUs.
To make CPUID contents easier to explore, cpuid-dump should dump CPUID results from every CPU rather than just one. This is currently implemented for Linux only, using the sched_setaffinity(2) system call.
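For reviewers unfamiliar with the technique, here is a minimal sketch of the per-CPU iteration. It is not the code in this PR: the helper name `dump_leaf_on_all_cpus` is hypothetical, it queries a single leaf/subleaf, and it assumes glibc on Linux plus the GCC/Clang `<cpuid.h>` header. On non-x86 builds the registers are simply left at zero.

```c
#define _GNU_SOURCE /* needed for sched_setaffinity and the CPU_* macros */
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper (not part of cpuinfo): execute one CPUID leaf on every
 * CPU in the process's affinity mask and print one line per CPU.
 * Returns the number of CPUs visited, or -1 if the mask cannot be read. */
static int dump_leaf_on_all_cpus(uint32_t leaf, uint32_t subleaf) {
    cpu_set_t original;
    if (sched_getaffinity(0, sizeof(original), &original) != 0) {
        return -1;
    }

    int visited = 0;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &original)) {
            continue; /* CPU not available to this process */
        }

        cpu_set_t one;
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        /* Pin the calling thread so the CPUID below runs on `cpu`. */
        if (sched_setaffinity(0, sizeof(one), &one) != 0) {
            continue;
        }

        uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
#if defined(__x86_64__) || defined(__i386__)
        __cpuid_count(leaf, subleaf, eax, ebx, ecx, edx);
#endif
        printf("cpu%d: CPUID %08X: %08X-%08X-%08X-%08X\n", cpu,
               (unsigned)leaf, (unsigned)eax, (unsigned)ebx,
               (unsigned)ecx, (unsigned)edx);
        visited++;
    }

    /* Restore the affinity mask the process started with. */
    sched_setaffinity(0, sizeof(original), &original);
    return visited;
}
```

Calling `dump_leaf_on_all_cpus(0x1a, 0)` from `main` would print one line per usable CPU. Iterating over the existing affinity mask, rather than over 0..N-1, keeps the sketch working under cpusets or `taskset`.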
The results can be easily deduplicated by piping through sort and uniq. For example:
./cpuid-dump | sort -sk6 -sk3 -sk1 | uniq -s4
cpu0: CPUID 00000000: 00000023-756E6547-6C65746E-49656E69 [GenuineIntel]
…
# High byte of EBX is the initial APIC ID, which differs for each logical CPU
# |
cpu0: CPUID 00000001: 000B06D1-00800800-FFFAF38B-BFCBFBFF
cpu1: CPUID 00000001: 000B06D1-08800800-FFFAF38B-BFCBFBFF
cpu2: CPUID 00000001: 000B06D1-10800800-FFFAF38B-BFCBFBFF
cpu3: CPUID 00000001: 000B06D1-18800800-FFFAF38B-BFCBFBFF
cpu4: CPUID 00000001: 000B06D1-40800800-FFFAF38B-BFCBFBFF
cpu5: CPUID 00000001: 000B06D1-42800800-FFFAF38B-BFCBFBFF
cpu6: CPUID 00000001: 000B06D1-44800800-FFFAF38B-BFCBFBFF
cpu7: CPUID 00000001: 000B06D1-46800800-FFFAF38B-BFCBFBFF
…
# High byte of EAX indicates that cpu[0-3] are P-cores and cpu[4-7] are E-cores
# The rest of EAX contains the "native model ID" (https://codebrowser.dev/linux/linux/arch/x86/include/asm/intel-family.h.html#intel_native_id)
cpu0: CPUID 0000001A: 40000003-00000000-00000000-00000000
cpu4: CPUID 0000001A: 20000003-00000000-00000000-00000000
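The fields called out in the comments above can be extracted with a few shifts and masks. The following is a sketch only; the function names are hypothetical and are not part of cpuinfo. It assumes the layout shown in the sample output: bits 31:24 of leaf 1 EBX hold the initial APIC ID, bits 31:24 of leaf 0x1a EAX hold the core type (0x40 for P-cores, 0x20 for E-cores), and bits 23:0 hold the native model ID.

```c
#include <stdint.h>

/* Leaf 1, EBX bits 31:24: initial APIC ID of the logical CPU. */
static uint32_t initial_apic_id(uint32_t leaf1_ebx) {
    return leaf1_ebx >> 24;
}

/* Leaf 0x1a, EAX bits 31:24: core type on hybrid CPUs. */
static const char *core_type_name(uint32_t leaf1a_eax) {
    switch (leaf1a_eax >> 24) {
        case 0x40: return "P-core";
        case 0x20: return "E-core";
        default:   return "unknown";
    }
}

/* Leaf 0x1a, EAX bits 23:0: native model ID. */
static uint32_t native_model_id(uint32_t leaf1a_eax) {
    return leaf1a_eax & 0x00FFFFFFu;
}
```

Applied to the dump above, `initial_apic_id(0x08800800)` gives 0x08 for cpu1, and `core_type_name(0x20000003)` classifies cpu4 as an E-core with native model ID 3.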