`get_processor_count` returns incorrect number of threads or processors

Open Jan200101 opened this issue 2 years ago • 1 comments

Godot version

v3.5.1.stable.official [6fed1ffa3]

System information

Fedora 37, libstdc++ 12.2.1

Issue description

get_processor_count is supposed to return the "number of logical CPU cores available on the host machine".

My AMD Ryzen 5 5600X has 6 cores and 2 threads per core, so I would expect get_processor_count to return 12. Instead, it returns 32.

This appears to be caused by std::thread::hardware_concurrency being unreliable and implementation specific.

I tested this against the Fedora package, the Steam release, and the tarball from the website, and always got 32.

Steps to reproduce

  • call OS.get_processor_count
  • get unexpected return

Minimal reproduction project

N/A

Jan200101 avatar Jan 29 '23 13:01 Jan200101

Could you please paste the output of the commands lscpu and sudo lshw -C CPU?

RandomShaper avatar Jan 29 '23 16:01 RandomShaper

lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   48 bits physical, 48 bits virtual
Byte Order:                      Little Endian
CPU(s):                          12
On-line CPU(s) list:             0-11
Vendor ID:                       AuthenticAMD
Model name:                      AMD Ryzen 5 5600X 6-Core Processor
CPU family:                      25
Model:                           33
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       1
Stepping:                        0
Frequency boost:                 enabled
CPU(s) scaling MHz:              65%
CPU max MHz:                     4650.2920
CPU min MHz:                     2200.0000
BogoMIPS:                        7399.37
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization:                  AMD-V
L1d cache:                       192 KiB (6 instances)
L1i cache:                       192 KiB (6 instances)
L2 cache:                        3 MiB (6 instances)
L3 cache:                        32 MiB (1 instance)
NUMA node(s):                    1
NUMA node0 CPU(s):               0-11
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
sudo lshw -C CPU
  *-cpu
       description: CPU
       product: AMD Ryzen 5 5600X 6-Core Processor
       vendor: Advanced Micro Devices [AMD]
       physical id: 17
       bus info: cpu@0
       version: AMD Ryzen 5 5600X 6-Core Processor
       serial: Unknown
       slot: AM4
       size: 3719MHz
       capacity: 4650MHz
       width: 64 bits
       clock: 100MHz
       capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp x86-64 constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm cpufreq
       configuration: cores=6 enabledcores=6 threads=12

Jan200101 avatar Jan 29 '23 18:01 Jan200101

> This appears to be caused by std::thread::hardware_concurrency being unreliable and implementation specific.

~~If this is the case, then this is a regression from https://github.com/godotengine/godot/pull/64815.~~ Edit: https://github.com/godotengine/godot/pull/64815 is not present in 3.5.1, only 3.x (3.6.dev) and master (4.0.beta).

PS: Does the C++ standard (hardware_concurrency()) even have a way to get the number of physical CPU cores? This is something that may be added as a separate method in the future.

Calinou avatar Jan 29 '23 22:01 Calinou

> https://github.com/godotengine/godot/pull/64815 is not present in 3.5.1, only 3.x (3.6.dev) and master (4.0.beta).

You are correct, I mistook the code on master for code that was already in 3.5.

So the real problem is caused by sysconf(_SC_NPROCESSORS_CONF). I cannot reproduce this on any machine other than my desktop, so it may very well be an edge case.

From what I can find, _SC_NPROCESSORS_ONLN may work instead, and it is one of the possible backings of hardware_concurrency.

I'll look into that when I can.

> PS: Does the C++ standard (hardware_concurrency()) even have a way to get the number of physical CPU cores? This is something that may be added as a separate method in the future.

It does not appear so.

The working draft of C++11 says:

unsigned hardware_concurrency() noexcept;

Returns: The number of hardware thread contexts. [ Note: This value should only be considered to be a hint. — end note ] If this value is not computable or well defined an implementation should return 0.
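Since the draft allows a return value of 0 and calls the result a hint, callers usually guard against it. A minimal sketch of that guard follows; the helper name `logical_cpu_hint` and its `fallback` parameter are made up for illustration and are not taken from Godot's code:

```cpp
#include <thread>

// Hypothetical helper (not Godot's implementation): treat the value from
// std::thread::hardware_concurrency() as a hint and substitute a
// caller-chosen fallback when the implementation reports 0.
unsigned int logical_cpu_hint(unsigned int fallback = 1) {
    const unsigned int n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

Note that this counts hardware thread contexts (logical CPUs), not physical cores.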

Jan200101 avatar Jan 30 '23 07:01 Jan200101

> So the real problem is caused by sysconf(_SC_NPROCESSORS_CONF). I cannot reproduce this on any machine other than my desktop, so it may very well be an edge case.
>
> From what I can find, _SC_NPROCESSORS_ONLN may work instead, and it is one of the possible backings of hardware_concurrency.

Yep, that was it

sysconf(_SC_NPROCESSORS_CONF) = 32
sysconf(_SC_NPROCESSORS_ONLN) = 12
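For anyone who wants to check their own machine, the two values can be read with a sketch like the one below. The `CpuCounts` struct and `query_cpu_counts` function are hypothetical names used only here, and both `sysconf` names are common extensions (available in glibc, for example) rather than something every platform guarantees:

```cpp
#include <unistd.h>

// Hypothetical container for the two counts compared in this thread.
struct CpuCounts {
    long configured; // sysconf(_SC_NPROCESSORS_CONF): processors configured
    long online;     // sysconf(_SC_NPROCESSORS_ONLN): processors currently online
};

// Query both values; sysconf returns -1 if a name is not supported.
CpuCounts query_cpu_counts() {
    return CpuCounts{ sysconf(_SC_NPROCESSORS_CONF),
                      sysconf(_SC_NPROCESSORS_ONLN) };
}
```

On the affected desktop, `configured` would be 32 and `online` would be 12, matching the numbers above.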

As far as I can find, all backings of hardware_concurrency should return the same value as _SC_NPROCESSORS_ONLN.

Would this be worth backporting to 3.5?

Jan200101 avatar Jan 31 '23 05:01 Jan200101