py-spy
py-spy copied to clipboard
Errors sampling process using os.exec*
Following my comment on #243, I explained how to reproduce the Error: No child process (os error 10)
issue. I digged a bit deeper and realized most of the samples from my traces were unusable because a large amount of samples disappear.
When using py-spy record
over the programs I am profiling, I encountered this issue with basically every SCT python script invokation and the profiler outputted something like this at the end:
py-spy> Wrote flamegraph data to 'test.svg'. Samples: 15 Errors: 629
Obviously, most of the samples are lost in these cases. However, I never encountered this when running py-spy record
directly over a .py file. These bad traces only happen when I am profiling a script that takes advantage of Python's console-scripts
entry point, in the setup.py
file.
Using RUST_LOG=warn
:
» RUST_LOG=warn py-spy record -o test.svg -- sct_check_dependencies -short 1 ↵ herman@tank
[2021-06-09T22:05:48.102470115Z WARN py_spy::python_spy] Interpreter address from interp_head symbol is invalid 000055f771de26c0
py-spy> Sampling process 100 times a second. Press Control-C to exit.
[2021-06-09T22:05:48.253545145Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.254633903Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.258845533Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.272319643Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.272779210Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.278885932Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.309055502Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.333192050Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.341197105Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.361254618Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.364358560Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.385848214Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.389786467Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.404397293Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.409291473Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.409513969Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.430549140Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.449357225Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.449765453Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.453872970Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.470404562Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.472034581Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.472985324Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.488131601Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.488350670Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.494125840Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
--
Spinal Cord Toolbox (git-master-a685d7f0d8032f9df685a08d22947947ac416e71*)
sct_check_dependencies -short
--
[2021-06-09T22:05:48.501538400Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.502834813Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
SCT info:
- version: git-master-a685d7f0d8032f9df685a08d22947947ac416e71*
- path: /home/herman/Documents/NEUROPOLY_21/spinalcordtoolbox
[2021-06-09T22:05:48.523265225Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
OS: linux (Linux-5.4.106-rt54-MANJARO-x86_64-with-arch-Manjaro-Linux)
CPU cores: Available: 4, Used by ITK functions: 4
RAM: Total: 15895MB, Used: 5543MB, Available: 5143MB
[2021-06-09T22:05:48.526914090Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
[2021-06-09T22:05:48.528860746Z WARN py_spy] Failed to get stack trace from 269800: Bad address (os error 14)
py-spy> Stopped sampling because process exitted
py-spy> Wrote flamegraph data to 'test.svg'. Samples: 2 Errors: 31
I think this is what's happening here: the scripts I am profiling are executed in the following way:
- I call the CLI command, e.g.
tool_from_a_toolbox
- The entry point for the
tool_from_a_toolbox
script is alauncher.py
file common to all tools from this particular toolbox. - This
launcher.py
script calls os.execvpe(), which takes as input the program you want to execute, its arguments and finally the environment parameter. This function executes a new program and replaces the current process without returning. From os.execvpe doc:
For execle(), execlpe(), execve(), and execvpe() (note that these all end in “e”), the env parameter must be a mapping which is used to define the environment variables for the new process (these are used instead of the current process’ environment); the functions execl(), execlp(), execv(), and execvp() all cause the new process to inherit the environment of the current process.
Maybe py-spy has some trouble attaching to this new process? I should mention the same issue happens when also tracking subprocesses.
Updating this issue for anyone who would encounter the same problem (trying to profile a python software built with a single launcher for every entry points).
The way to solve this is straightforward. Because py-spy record can't attach to the new process generated by os.execvpe()
(or any os.exec()
variants), we can directly insert the profiler invokation within the function arguments, e.g. if we had the following:
os.execvpe(cmd[0], cmd[0:], env)
simply do this:
cmd = [PATH_TO_PY_SPY] + PY_SPY_ARGS.split() + cmd
os.execvpe(cmd[0], cmd[0:], env)
This works like a charm. On my end this is solved.
This looks like a bug - I think the problem is that we're getting python interpreter information from the python process when py-spy starts (including addresses etc) and those all change after the os.exec call.
I can replicate it with a script like
import os
import sys
import time
if __name__ == "__main__":
# give py-spy a chance to connect
time.sleep(1)
# replace this process with another python interpreter
long_sleep_path = os.path.join(os.path.realpath(os.path.dirname(__file__)), "longsleep.py")
os.execvp(sys.executable, [sys.executable, long_sleep_path])
(in the tests directory).
Another workaround is probably to launch the process without py-spy, and then connect to the PID once the python process is started (which isn't ideal).
@benfred FWI apparently, on POSIX (but not on Windows), the constructor for the subprocess module in Python uses os.execvp()-like behavior (reference), so this issue might be more common than I initially thought.