
Crashes because `physical_cores()` doesn't understand the number of cpu cores on PowerPC

Open zyh1121 opened this issue 3 years ago • 2 comments

Please make sure your issue is not addressed in the FAQ.

Please include the following information:

  • [x] The version of infer from `infer --version`: any version after 8f13e6ecb3a8f26f5c81311e4b5909e1ab566530
  • [x] Your operating system and version, for example "Debian 9", "MacOS High Sierra", whether you are using Docker, etc.: Ubuntu 18.04, RHEL7.6
  • [x] Which command you ran, for example `infer -- make`: `infer run -- gcc -c hello_world.c`
  • [x] The full output in a paste, for instance a gist.
$ ./infer --version
Infer version v1.1.0-32e438f66
Copyright 2009 - present Facebook. All Rights Reserved.

$ ./infer run -- gcc -c hello_world.c 
Capturing in make/cc mode...
Found 1 source file to analyze in ............../infer-out
Uncaught Internal Error: (Unix.Unix_error "No child processes" waitpid "((mode (WNOHANG)) (pid -1))")
Error backtrace:
Raised at Core__Core_unix.improve in file "src/core_unix.ml", line 46, characters 4-43
Called from Core__Core_unix.wait_gen in file "src/core_unix.ml", line 942, characters 4-246
Called from IBase__ProcessPool.has_dead_child in file "src/base/ProcessPool.ml", line 194, characters 2-23
Called from IBase__ProcessPool.process_updates in file "src/base/ProcessPool.ml", line 250, characters 2-21
Called from IBase__ProcessPool.run in file "src/base/ProcessPool.ml", line 462, characters 4-31
Called from IBase__ProcessPool.run in file "src/base/ProcessPool.ml", line 472, characters 16-24
Called from Backend__InferAnalyze.analyze in file "src/backend/InferAnalyze.ml", line 195, characters 24-47
Called from Backend__InferAnalyze.main in file "src/backend/InferAnalyze.ml", line 260, characters 42-62
Called from Integration__Driver.execute_analyze.(fun) in file "src/integration/Driver.ml", line 175, characters 2-34
Called from Backend__GCStats.log_f in file "src/backend/GCStats.ml", line 90, characters 10-14
Called from Integration__Driver.analyze_and_report in file "src/integration/Driver.ml", line 280, characters 6-36
Called from IBase__Utils.timeit in file "src/base/Utils.ml", line 424, characters 16-20
Called from IBase__ScubaLogging.execute_with_time_logging in file "src/base/ScubaLogging.ml", line 79, characters 29-44
Called from Dune__exe__Infer.run in file "src/infer.ml", line 21, characters 2-47
Called from IBase__Utils.timeit in file "src/base/Utils.ml", line 424, characters 16-20
Called from IBase__ScubaLogging.execute_with_time_logging in file "src/base/ScubaLogging.ml", line 79, characters 29-44
Called from Dune__exe__Infer in file "src/infer.ml", line 168, characters 6-52

Run the command again with `--keep-going` to try and ignore this error.
  • [x] If possible, a minimal example to reproduce your problem (for instance, some code where infer reports incorrectly, together with the way you run infer to reproduce the incorrect report): the example from https://fbinfer.com/docs/hello-world/#hello-world-c

After https://github.com/facebook/infer/commit/8f13e6ecb3a8f26f5c81311e4b5909e1ab566530, I could build infer on PPC after patching the OCaml switch and clang target. However, infer always crashes, and the stack trace gave me few hints. After some digging, it turns out that `physical_cores ()` in infer/src/base/Utils.ml does not count the cores correctly when /proc/cpuinfo uses a different format. Below are two examples of /proc/cpuinfo from two PPC machines, neither of which contains the fields matched by the physical id\\|core id pattern.

Would it make sense to use `getconf _NPROCESSORS_ONLN` instead of parsing /proc/cpuinfo, as in this example? https://stackoverflow.com/a/16273514

  • RHEL7.6
processor       : 0
cpu             : POWER8 (raw), altivec supported
clock           : 3491.000000MHz
revision        : 2.0 (pvr 004d 0200)

processor       : 1
cpu             : POWER8 (raw), altivec supported
clock           : 3491.000000MHz
revision        : 2.0 (pvr 004d 0200)

...

processor       : 158
cpu             : POWER8 (raw), altivec supported
clock           : 3491.000000MHz
revision        : 2.0 (pvr 004d 0200)

processor       : 159
cpu             : POWER8 (raw), altivec supported
clock           : 3491.000000MHz
revision        : 2.0 (pvr 004d 0200)

timebase        : 512000000
platform        : PowerNV
model           : 8335-GTA
machine         : PowerNV 8335-GTA
firmware        : OPAL v3
  • Ubuntu 18.04
processor       : 0
cpu             : POWER9, altivec supported
clock           : 3783.000000MHz
revision        : 2.1 (pvr 004e 1201)

processor       : 1
cpu             : POWER9, altivec supported
clock           : 3783.000000MHz
revision        : 2.1 (pvr 004e 1201)

...

processor       : 158
cpu             : POWER9, altivec supported
clock           : 2300.000000MHz
revision        : 2.1 (pvr 004e 1201)

processor       : 159
cpu             : POWER9, altivec supported
clock           : 2300.000000MHz
revision        : 2.1 (pvr 004e 1201)

timebase        : 512000000
platform        : PowerNV
model           : 8335-GTG
machine         : PowerNV 8335-GTG
firmware        : OPAL
MMU             : Radix
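
To illustrate the failure mode, here is a minimal Python sketch. It is not infer's actual OCaml code; it only assumes that the parser counts distinct `physical id` / `core id` values. The function name `count_physical_cores` and the x86 sample are hypothetical, while the PPC sample is copied from the POWER9 output above.

```python
import re

# Assumption: the parser looks only for "physical id" and "core id" fields,
# as the pattern quoted in this issue suggests. Illustration only.
FIELD = re.compile(r"^(physical id|core id)\s*:\s*(\d+)", re.MULTILINE)


def count_physical_cores(cpuinfo_text: str) -> int:
    """Count distinct (physical id, core id) pairs in cpuinfo text."""
    pairs = set()
    physical_id = None
    for name, value in FIELD.findall(cpuinfo_text):
        if name == "physical id":
            physical_id = int(value)
        elif physical_id is not None:
            pairs.add((physical_id, int(value)))
    return len(pairs)


# Hypothetical x86-style excerpt: two hyperthreads on core 0, one thread on core 1.
X86_SAMPLE = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 0

processor   : 2
physical id : 0
core id     : 1
"""

# PPC excerpt taken from the POWER9 machine in this report.
PPC_SAMPLE = """\
processor       : 0
cpu             : POWER9, altivec supported
clock           : 3783.000000MHz
revision        : 2.1 (pvr 004e 1201)

processor       : 1
cpu             : POWER9, altivec supported
clock           : 3783.000000MHz
revision        : 2.1 (pvr 004e 1201)
"""

print(count_physical_cores(X86_SAMPLE))  # 2
print(count_physical_cores(PPC_SAMPLE))  # 0
```

With the PPC format no field matches, so the count comes out as 0. A worker pool sized from that value would have no children to wait on, which would be consistent with the `waitpid` "No child processes" error in the trace, although that link is my guess rather than something I have confirmed in the code.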

zyh1121, Aug 19 '21 16:08

Thank you for the report. getconf _NPROCESSORS_ONLN gives the number of logical cores, not physical ones. The reason we parse /proc/cpuinfo is to get the number of cores without things like hyperthreading. I agree it's very fragile...
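
One possible way to reconcile the two concerns, sketched in Python purely as an illustration: keep the physical-core count when the parse yields one, and fall back to the logical count otherwise so the result is never zero. The helper names `logical_cores` and `worker_count` are hypothetical, and this fallback policy is an assumption, not what infer does.

```python
import os


def logical_cores() -> int:
    """Online logical CPUs, the value `getconf _NPROCESSORS_ONLN` reports."""
    try:
        return os.sysconf("SC_NPROCESSORS_ONLN")
    except (ValueError, OSError, AttributeError):
        # sysconf name unknown or os.sysconf unavailable on this platform.
        return os.cpu_count() or 1


def worker_count(physical: int) -> int:
    """Prefer the physical-core count, but never return fewer than one."""
    return physical if physical > 0 else max(1, logical_cores())
```

Here `worker_count(0)`, the PPC case from this report, would return the logical count, which includes hardware threads and so may oversubscribe the cores, but it avoids a pool with no workers.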

jvillard, Aug 19 '21 16:08

This looks promising for a fix: https://stackoverflow.com/a/23378780

jvillard, Aug 19 '21 16:08