py-spy
Error: failed to get os threadid
py-spy top --native --pid 229875
Error: failed to get os threadid
py-spy 0.3.11
This means py-spy failed to get the native thread ID. This can happen for a number of reasons, depending on the OS you are using. Which system are you running py-spy on?
In any case, the direct trigger for this error is --native: if you remove this flag, the error shouldn't occur, so you can try without it if you can do without native traces.
Hi Jongy, thanks for your response. My OS information is below:
Linux icx08 4.18.0-305.12.1.el8_4.x86_64 #1 SMP Wed Aug 11 01:59:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)
Python-3.6.5
Linux distribution: CentOS Linux 8
libc version: glibc-2.28
I am profiling a process running inside a Docker container. If I remove the --native flag it works fine, but I want to trace the native stack (C/C++ extensions).
Ah, py-spy doesn't support getting the OS thread ID for dockerized processes. See the _get_os_thread_id implementation for Linux:
#[cfg(all(target_os="linux", unwind))]
fn _get_os_thread_id<I: InterpreterState>(&mut self, python_thread_id: u64, interp: &I) -> Result<Option<Tid>, Error> {
    ....
    // likewise this doesn't yet work for profiling processes running inside docker containers from the host os
    if self.dockerized {
        return Ok(None);
    }
I think that's the issue.
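To check whether this case applies, you can compare the PID namespace of the target process with your own. The following is a minimal sketch in Python, not py-spy's own detection logic: the helper names and the proc_root parameter are hypothetical, and reading another process's namespace link usually requires root or the same user.

```python
import os


def pid_namespace(pid, proc_root="/proc"):
    """Return the PID-namespace identifier of a process, e.g. 'pid:[4026531836]'.

    `pid` may be a number or the string "self". `proc_root` is only a
    parameter so the function can be pointed at a fake tree in tests.
    """
    return os.readlink(os.path.join(proc_root, str(pid), "ns", "pid"))


def in_same_pid_namespace(pid, proc_root="/proc"):
    """True if the target process shares the caller's PID namespace."""
    return pid_namespace("self", proc_root) == pid_namespace(pid, proc_root)


if __name__ == "__main__":
    # A process always shares a PID namespace with itself (Linux only).
    print(in_same_pid_namespace(os.getpid()))
```

If this returns False for the target PID when run from the host, the process is most likely containerized and the limitation quoted above applies.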
This is actually something we've been tackling but I don't have a solution ready yet.
Meanwhile, I'd suggest running py-spy inside the container, that is, in the same PID namespace.
For example, if the host PID is 229875, the PID inside the container is 40, and the container is named my_app, you can copy py-spy into the container (use the static musl build):
docker cp ./py-spy my_app:/py-spy
then run it (note: --privileged is required):
docker exec -it --privileged my_app /py-spy top --native --pid 40
I think that'll work (at least, it will avoid the OS thread ID issue).
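If you only know the host PID, one way to find the matching PID inside the container is the NSpid line in /proc/<pid>/status (available on Linux 4.1 and later), which lists the process's PID in each nested PID namespace, outermost first. This is a rough sketch rather than anything py-spy provides; parse_nspid and container_pid are hypothetical helper names.

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line, e.g. on kernels older than 4.1


def container_pid(host_pid, proc_root="/proc"):
    """Return the PID as seen in the innermost namespace, or None if unknown."""
    with open(f"{proc_root}/{host_pid}/status") as status_file:
        pids = parse_nspid(status_file.read())
    return pids[-1] if pids else None


if __name__ == "__main__":
    # Sample text mirroring the PIDs used in this thread.
    sample = "Name:\tpython\nNSpid:\t229875\t40\n"
    print(parse_nspid(sample))
```

The last entry is the value to pass to --pid when py-spy runs inside the container.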
Thanks Jongy. Yes, it works when I run py-spy inside the container.
Glad it helped :)
FWIW, with Python 3.11 we can get the OS thread ID directly from Python, and will be able to grab it for a dockerized process from the host. We still won't be able to do native profiling from the host into the container though -
I also hit the same error: https://github.com/ray-project/ray/issues/30566
But in our case we already run py-spy inside a Docker container, so I am not sure how to debug this. Any pointers on where to look?
I found that when I don't specify --native, this is returned:
Thread 0x7FB1278F5740 (active): "MainThread"
    main_loop (ray/_private/worker.py:763)
    <module> (ray/_private/workers/default_worker.py:233)
Thread 860 (idle): "ray_import_thread"
    wait (threading.py:300)
    _wait_once (grpc/_common.py:106)
    wait (grpc/_common.py:148)
    result (grpc/_channel.py:735)
    _poll_locked (ray/_private/gcs_pubsub.py:255)
    poll (ray/_private/gcs_pubsub.py:391)
    _run (ray/_private/import_thread.py:69)
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Thread 864 (idle): "AsyncIO Thread: default"
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Thread 866 (idle): "Thread-2"
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Thread 0x7F9F815EB700 (active)
Thread 39212 (idle): "Thread-19"
    channel_spin (grpc/_channel.py:1258)
    run (threading.py:870)
    _bootstrap_inner (threading.py:926)
    _bootstrap (threading.py:890)
Is this related to the fact that we have a thread, 0x7F9F815EB700, that doesn't seem to be a Python thread?
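For context on the two kinds of IDs in the dump: my reading, which is an assumption and not confirmed in this thread, is that the decimal values (860, 864, ...) are OS thread IDs, while the hex values are Python-level thread identifiers shown when no OS thread ID could be resolved. The sketch below prints both for a live thread using the standard threading module. Note that threading.get_native_id() needs Python 3.8 or later, so it won't run on the 3.6/3.7 interpreters discussed here.

```python
import threading


def describe_current_thread():
    """Collect both identifiers for the calling thread."""
    return {
        "name": threading.current_thread().name,
        # Python-level identifier (derived from pthread_t on Linux).
        "ident_hex": hex(threading.get_ident()),
        # Kernel-assigned thread ID, as seen in /proc/<pid>/task.
        "native_id": threading.get_native_id(),
    }


results = []
worker = threading.Thread(
    target=lambda: results.append(describe_current_thread()),
    name="demo-worker",
)
worker.start()
worker.join()
results.append(describe_current_thread())

for entry in results:
    print(entry)
```

On Linux the ident_hex values typically look like the 0x7F... addresses in the dump, and native_id looks like the small decimal numbers.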
@rkooo567 that looks pretty odd to me. I'm unsure why py-spy managed to figure out the native thread ID in some cases but not others. Is there a way I can run this myself to investigate (a Docker container with a Python script to run, etc.)?