num_cpus

[Q&A] Is there a way to get the number of performance cores on Apple Silicon?

Open Byron opened this issue 4 years ago • 5 comments

As the author of dua, I noticed that on the M1 chip I/O performance gets worse if the high-efficiency cores are taken into consideration when configuring thread pools.

Thus for now I hardcode the value for optimal performance knowing that it might break sometime later this year.

Do you think it's in scope to add such capability to num_cpus?

Byron avatar Jan 24 '21 01:01 Byron

Hm, so these cores are different than normal? Is there an API to query them? We'd need more info to answer the question properly, I think.

seanmonstar avatar Jan 25 '21 23:01 seanmonstar

I thought the best way for me to learn more is to look at sysinfo, which appears to use a sysctl API.

Looking at it, there isn't anything obvious showing the number of high-performance (or, for that matter, high-efficiency) cores:

sysctl -a | rg cpu

kern.cpu_checkin_interval: 4000
hw.ncpu: 8
hw.activecpu: 8
hw.physicalcpu: 8
hw.physicalcpu_max: 8
hw.logicalcpu: 8
hw.logicalcpu_max: 8
hw.cputype: 16777228
hw.cpusubtype: 2
hw.cpu64bit_capable: 1
hw.cpufamily: 458787763
hw.cpusubfamily: 2
machdep.cpu.cores_per_package: 8
machdep.cpu.core_count: 8
machdep.cpu.logical_per_package: 8
machdep.cpu.thread_count: 8
machdep.cpu.brand_string: Apple M1

Another set of keys yields nothing very telling either:

sysctl hw
hw.ncpu: 8
hw.byteorder: 1234
hw.memsize: 8589934592
hw.activecpu: 8
hw.physicalcpu: 8
hw.physicalcpu_max: 8
hw.logicalcpu: 8
hw.logicalcpu_max: 8
hw.cputype: 16777228
hw.cpusubtype: 2
hw.cpu64bit_capable: 1
hw.cpufamily: 458787763
hw.cpusubfamily: 2
hw.cacheconfig: 8 1 1 0 0 0 0 0 0 0
hw.cachesize: 3613523968 65536 4194304 0 0 0 0 0 0 0
hw.pagesize: 16384
hw.pagesize32: 16384
hw.cachelinesize: 128
hw.l1icachesize: 131072
hw.l1dcachesize: 65536
hw.l2cachesize: 4194304
hw.tbfrequency: 24000000
hw.packages: 1
hw.osenvironment:
hw.ephemeral_storage: 0
hw.use_recovery_securityd: 0
hw.use_kernelmanagerd: 1
hw.serialdebugmode: 0
hw.optional.floatingpoint: 1
hw.optional.watchpoint: 4
hw.optional.breakpoint: 6
hw.optional.neon: 1
hw.optional.neon_hpfp: 1
hw.optional.neon_fp16: 1
hw.optional.armv8_1_atomics: 1
hw.optional.armv8_crc32: 1
hw.optional.armv8_2_fhm: 1
hw.optional.armv8_2_sha512: 1
hw.optional.armv8_2_sha3: 1
hw.optional.amx_version: 2
hw.optional.ucnormal_mem: 1
hw.optional.arm64: 1
hw.targettype: J313

Only one line stands out, hw.optional.watchpoint: 4, and it might take another Apple hardware release to know whether it actually changes with the number of high-performance or high-efficiency cores.

Byron avatar Jan 26 '21 12:01 Byron

To make it a little less specific, a point validly criticised in the linked sysinfo issue, here is the announced Intel Alder Lake CPU which introduces the concept of 'big' and 'small' cores.

Let's see what AMD will do.

Byron avatar Jan 31 '21 02:01 Byron

With ~~Big Sur~~ Monterey on an M1 Pro, I get this output now:

» sysctl -a | rg cpu


kern.cpu_checkin_interval: 4000
kern.sched_rt_avoid_cpu0: 0
hw.ncpu: 10
hw.activecpu: 10
hw.perflevel0.cpusperl2: 4
hw.perflevel0.logicalcpu: 8
hw.perflevel0.logicalcpu_max: 8
hw.perflevel0.physicalcpu: 8
hw.perflevel0.physicalcpu_max: 8
hw.perflevel1.cpusperl2: 2
hw.perflevel1.logicalcpu: 2
hw.perflevel1.logicalcpu_max: 2
hw.perflevel1.physicalcpu: 2
hw.perflevel1.physicalcpu_max: 2
hw.cpu64bit_capable: 1
hw.cpufamily: 458787763
hw.cpusubfamily: 4
hw.cpusubtype: 2
hw.cputype: 16777228
hw.logicalcpu: 10
hw.logicalcpu_max: 10
hw.physicalcpu: 10
hw.physicalcpu_max: 10
machdep.cpu.brand_string: Apple M1 Pro
machdep.cpu.core_count: 10
machdep.cpu.cores_per_package: 10
machdep.cpu.logical_per_package: 10
machdep.cpu.thread_count: 10

It should be possible to get the values from hw.perflevel0 and hw.perflevel1.
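
To make that concrete, here is a rough sketch of pulling the per-level counts out of text like the output above. It only parses a string, so it runs anywhere. The function names `parse_line` and `perflevel_physical_cpus` are made up for this example and are not part of num_cpus or sysinfo. Treating perflevel0 as the performance cluster is an assumption based on the 8 + 2 split shown for the M1 Pro.

```rust
use std::collections::BTreeMap;

/// Parse one line such as "hw.perflevel0.physicalcpu: 8" into (level, count).
/// Returns None for every other key, including "..._max" variants.
/// (Hypothetical helper, not an existing crate API.)
fn parse_line(line: &str) -> Option<(u32, usize)> {
    let (key, value) = line.split_once(':')?;
    let level = key
        .trim()
        .strip_prefix("hw.perflevel")?
        .strip_suffix(".physicalcpu")?;
    Some((level.parse().ok()?, value.trim().parse().ok()?))
}

/// Collect the physical core count of each perf level found in `sysctl` output.
/// The map is empty when no perflevel keys are present (e.g. the Big Sur output).
fn perflevel_physical_cpus(sysctl_output: &str) -> BTreeMap<u32, usize> {
    sysctl_output.lines().filter_map(parse_line).collect()
}
```

Fed the Monterey output above, this should give level 0 with 8 cores and level 1 with 2 cores. An empty map signals that the keys are missing and a caller has to fall back to the total count.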

pedantic79 avatar Dec 02 '21 00:12 pedantic79

That's great, it looks like they have improved the output on Monterey, which is the minimum OS for the M1 Pro CPUs.

Now it looks like the perflevelN key can be used to differentiate the core counts.
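
For a thread pool, something along these lines might replace the hardcoded value. This is a sketch only: it shells out to the `sysctl` binary rather than calling the C API, it assumes hw.perflevel0 is the performance cluster (as the M1 Pro numbers suggest), and `sysctl_usize` and `preferred_thread_count` are names invented here, not num_cpus functions. Wherever the key is missing (older macOS, Intel Macs, other platforms) it falls back to the standard library's count of available CPUs.

```rust
use std::process::Command;
use std::thread;

/// Run `sysctl -n <key>` and parse the value as an integer.
/// Returns None if the binary is missing, the key is unknown,
/// or the output is not a number. (Hypothetical helper.)
fn sysctl_usize(key: &str) -> Option<usize> {
    let output = Command::new("sysctl").arg("-n").arg(key).output().ok()?;
    if !output.status.success() {
        return None;
    }
    String::from_utf8(output.stdout).ok()?.trim().parse().ok()
}

/// Prefer the performance-core count when the perflevel key exists,
/// otherwise use all CPUs the process may run on, and never return 0.
fn preferred_thread_count() -> usize {
    sysctl_usize("hw.perflevel0.physicalcpu")
        .filter(|&n| n > 0)
        .or_else(|| thread::available_parallelism().ok().map(|n| n.get()))
        .unwrap_or(1)
}
```

Spawning a process is heavier than a direct sysctl call, so the result is best computed once at startup and cached.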

Byron avatar Dec 02 '21 01:12 Byron