Cross-container --native panics: called `Option::unwrap()` on a `None` value', src/libcore/option.rs:378:21
Using py-spy 0.3.2 release binary either musl or gnu panics with the following panic backtrace when adding the --native flag to either dump or record a PID running in a different container. Without --native works fine.
Works
./py-spy dump --pid=(OTHER CONTAINER)(correctly shows Python stacks)./py-spy dump --pid=(SAME CONTAINER) --native(correctly shows Python + native code stacks)gdb attach (OTHER CONTAINER)(shows
Crash
# RUST_BACKTRACE=full ./t/py-spy dump --pid=3511964 --native
Process 3511964: /venv2/bin/python2.7 -m staticbatch.worker --topic-name=gke-email-shadow-test --max-pull-messages=10 --max-workers=10 --sendgrid-sandbox-mode=true --use-datadog-client=true --use-recipe-rules-client=true --use-product-service=true --disable-mtbe=true
Python v2.7.16 (/venv2/bin/python2.7)
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/libcore/option.rs:378:21
stack backtrace:
0: 0x555c589efa74 - backtrace::backtrace::libunwind::trace::h65597d255cb1398b
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
1: 0x555c589efa74 - backtrace::backtrace::trace_unsynchronized::hd4f479d7150ec4a0
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
2: 0x555c589efa74 - std::sys_common::backtrace::_print_fmt::h015072984a2b172c
at src/libstd/sys_common/backtrace.rs:77
3: 0x555c589efa74 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h6df05d3335f32194
at src/libstd/sys_common/backtrace.rs:61
4: 0x555c58a15f6c - core::fmt::write::h1f444f4312eb6c27
at src/libcore/fmt/mod.rs:1028
5: 0x555c589ece97 - std::io::Write::write_fmt::h8d147888220078ef
at src/libstd/io/mod.rs:1412
6: 0x555c589f1fee - std::sys_common::backtrace::_print::h8a6df0fa81d6af62
at src/libstd/sys_common/backtrace.rs:65
7: 0x555c589f1fee - std::sys_common::backtrace::print::h6f05b4733407e509
at src/libstd/sys_common/backtrace.rs:50
8: 0x555c589f1fee - std::panicking::default_hook::{{closure}}::h0d0a23bd02315dd8
at src/libstd/panicking.rs:188
9: 0x555c589f1ce1 - std::panicking::default_hook::h8d15a9aecb4efac6
at src/libstd/panicking.rs:205
10: 0x555c589f26eb - std::panicking::rust_panic_with_hook::hbe174577402a475d
at src/libstd/panicking.rs:464
11: 0x555c589f228e - std::panicking::continue_panic_fmt::h4d855dad868accf3
at src/libstd/panicking.rs:373
12: 0x555c589f2176 - rust_begin_unwind
at src/libstd/panicking.rs:302
13: 0x555c58a11d5e - core::panicking::panic_fmt::hdeb7979ab6591473
at src/libcore/panicking.rs:139
14: 0x555c58a11caa - core::panicking::panic::hb5daa85c7c72fc62
at src/libcore/panicking.rs:70
15: 0x555c587b6e4c - py_spy::python_spy::PythonSpy::get_stack_traces::hf1034ae8c547c3fe
16: 0x555c587a5ade - py_spy::dump::print_traces::ha3b7ff6b5d3dc615
17: 0x555c58774ce9 - py_spy::run_spy_command::h9e10a2ee06576172
18: 0x555c587781ae - py_spy::main::h65a100b9d2adcb81
19: 0x555c58795853 - std::rt::lang_start::{{closure}}::hc7c20c46ab3745aa
20: 0x555c589f2113 - std::rt::lang_start_internal::{{closure}}::h6ea535ec5c50fc3e
at src/libstd/rt.rs:48
21: 0x555c589f2113 - std::panicking::try::do_call::h631c6408dfccc6f5
at src/libstd/panicking.rs:287
22: 0x555c589fa01a - __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:78
23: 0x555c589f2bcd - std::panicking::try::hab539b2d1255d635
at src/libstd/panicking.rs:265
24: 0x555c589f2bcd - std::panic::catch_unwind::hd5e0a26424bd7f34
at src/libstd/panic.rs:396
25: 0x555c589f2bcd - std::rt::lang_start_internal::h3bdc4c7d98181bf9
at src/libstd/rt.rs:47
26: 0x555c58779d72 - main
27: 0x7f831245d2e1 - __libc_start_main
28: 0x555c5870f3c9 - _start
29: 0x0 - <unknown>
Version Details
$ ./py-spy --version
py-spy 0.3.2
I'm running this on Google "Container OS" instance (running on a Google Kubernetes Engine cluster). My py-spy process is in a privileged container (the "toolbox" environment), and I'm trying to dump an unprivileged process in another container. These containers are almost certainly using different library versions etc, which I know sometimes causes debugging problems.
gdb output
GDB manages to do the right thing but prints warnings:
warning: Expected absolute pathname for libpthread in the inferior, but got target:/lib/x86_64-linux-gnu/libpthread.so.0.
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable. Connect to gdbserver inside the container.
warning: Expected absolute pathname for libpthread in the inferior, but got target:/lib/x86_64-linux-gnu/libpthread.so.0.
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
I'm encountering a very similar error when trying to profile subprocesses. I'm running on Fedora, without any sorts of containers.
The command I used to run py-spy is RUST_BACKTRACE=full py-spy record -o profile -f speedscope -s -- python -m <my module>. The error does not occur when I omit the -s flag, or when I simply do not spawn any subprocesses in my Python program. The error happened for each of the 0.3.x versions. I'm using Python 3.8.1 and joblib with its default loky backend to spawn subprocesses.
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/libcore/option.rs:378:21
stack backtrace:
0: 0x6f2324 - backtrace::backtrace::libunwind::trace::h0d3b97a6b64193be
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
1: 0x6f2324 - backtrace::backtrace::trace_unsynchronized::hb78a971d8547111e
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
2: 0x6f2324 - std::sys_common::backtrace::_print_fmt::ha1f41287e6ef2374
at src/libstd/sys_common/backtrace.rs:77
3: 0x6f2324 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1130808f9197ccb5
at src/libstd/sys_common/backtrace.rs:61
4: 0x72542c - core::fmt::write::h4fa8b44d73117031
at src/libcore/fmt/mod.rs:1028
5: 0x6ef747 - std::io::Write::write_fmt::hc08a67176e074177
at src/libstd/io/mod.rs:1412
6: 0x6f489e - std::sys_common::backtrace::_print::h33a300037b4dc1ed
at src/libstd/sys_common/backtrace.rs:65
7: 0x6f489e - std::sys_common::backtrace::print::h2ceebd24f74a6808
at src/libstd/sys_common/backtrace.rs:50
8: 0x6f489e - std::panicking::default_hook::{{closure}}::h599b8688602f475a
at src/libstd/panicking.rs:188
9: 0x6f4591 - std::panicking::default_hook::h0085905ac97017e0
at src/libstd/panicking.rs:205
10: 0x6f4f9b - std::panicking::rust_panic_with_hook::h780f8e9bbe4fc091
at src/libstd/panicking.rs:464
11: 0x6f4b3e - std::panicking::continue_panic_fmt::h1444a364e5f24a1a
at src/libstd/panicking.rs:373
12: 0x6f4a26 - rust_begin_unwind
at src/libstd/panicking.rs:302
13: 0x72121e - core::panicking::panic_fmt::h760360c16b67cc83
at src/libcore/panicking.rs:139
14: 0x72116a - core::panicking::panic::ha556f2c3753f3759
at src/libcore/panicking.rs:70
15: 0x4694d6 - py_spy::run_spy_command::he9e192defef1e373
16: 0x46a82f - py_spy::main::he2a48b4fd3934d5e
17: 0x487343 - std::rt::lang_start::{{closure}}::h47dbd41d95b3d679
18: 0x6f49c3 - std::rt::lang_start_internal::{{closure}}::hdb12a3ead60a8960
at src/libstd/rt.rs:48
19: 0x6f49c3 - std::panicking::try::do_call::h75828266d21559c3
at src/libstd/panicking.rs:287
20: 0x6fba4a - __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:78
21: 0x6f531b - std::panicking::try::h2533a62d10a8f001
at src/libstd/panicking.rs:265
22: 0x6f531b - std::panic::catch_unwind::hd9ecf47e7cd9602d
at src/libstd/panic.rs:396
23: 0x6f531b - std::rt::lang_start_internal::h93ee4050c8419278
at src/libstd/rt.rs:47
24: 0x46b7f2 - main
[2020-01-18T14:54:32.595985592Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596018311Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596020850Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596023870Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596029040Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596035250Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596040449Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596048259Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596053479Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596059928Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596065998Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596071658Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596073807Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596079167Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596084807Z ERROR py_spy::sampler] Failed to acquire lock: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596101556Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596104356Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596106575Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596108865Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.596112185Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.618826380Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
[2020-01-18T14:54:32.628870934Z ERROR py_spy::sampler] Failed to get process tree: poisoned lock: another task failed inside
Same thing with subprocesses! :(
We're using libunwind-ptrace to get the native component of the stack trace - and it doesn't look like its working across containers =( In the short term I'll add some code to warn that cross-container native profiling isn't supported yet - but to fix properly will probably involve diving into the libunwind code and will take some time
Hi @benfred, just curious if there are any updates regarding cross-container native profiling? I also run into Failed to get process tree: poisoned lock: another task failed inside error with the --subprocesses and --native flags. Thank you!
👋 Hello! Is this still an active issue?