py-spy
py-spy copied to clipboard
Idle detection discrepancies between OSX and Linux
I noticed when running py-spy (v0.3.3) against some production code on a Linux machine that around 60% of the time spent was attributed to about half a dozen or so lines in the code that were time.sleep() calls. I wrote the following minimal test case to reproduce the issue and discovered a discrepancy between what py-spy produces on Mac vs Linux:
(test.py)
import threading
import time
import numpy as np
class Runner:
def __init__(self):
self.is_running = True
self.threads = []
for k in range(10):
t = threading.Thread(target=self.do_work, name=str(k), args=(5.0, ))
self.threads.append(t)
t.start()
# time.sleep(0.5)
def do_work(self, delta_t):
last = time.perf_counter()
print('starting thread: ', threading.current_thread().name)
while self.is_running:
if (time.perf_counter() - last) > delta_t:
last = time.perf_counter()
data = np.random.random((1000, 100))
_ = np.linalg.svd(data, full_matrices=False)
else:
diff = delta_t - (time.perf_counter() - last)
if diff > 0:
time.sleep(diff)
def stop(self):
self.is_running = False
print('stopping threads')
for t in self.threads:
t.join()
print('threads stopped')
def run():
start_time = time.perf_counter()
r = Runner()
while time.perf_counter() - start_time < 100:
time.sleep(1.0)
r.stop()
if __name__ == '__main__':
run()
That starts a bunch of threads and at an interval delta_t runs a bit of work, where the time required to complete the work is much less than delta_t.
I then run:
$ python test.py &
# get the pid
$ $ sudo env "PATH=$PATH" py-spy record -f raw -o profile_pyspy.raw -r 200 -p <pid>
# ctrl-c after maybe 30 or so seconds and then
$ cat profile_pyspy.raw | ./flamegraph.pl --minwidth=0.02 > profile_pyspy_mac.svg
On my mac, I get largely what I expect, that is the flamegraph is dominated by the calls to numpy in the do_work function. But running the same code on linux, almost 87% of the time spent, according to the flamegraph, is spent on the call time.sleep(diff) (line 29 in do_work). On the Mac, it attributes just under 6% of the time spent to the same line. I see basically the same if py-spy outputs svg directly instead of raw -> flamegraph -> svg.
If I use the --idle flag, then py-spy captures almost 100% of the time being spent in the time.sleep() on both systems.
I've posted a gist with the test code and the svg files at: https://gist.github.com/synapticarbors/38e7fda1d5c23fc80b8562d41b7b672d
I'm aware of some of the previous issues dealing with this (e.g. https://github.com/benfred/py-spy/issues/92), but it seems like this may still be an issue.
It's a been a while since I've posted this, so I just wanted to see if anyone had any thoughts about this issue?
Thanks for your patience on this -
It seems like your test program is almost always idle (as shown by running the --idle option). This means that any false positives in detecting idle status will show up pretty prominently, even if as a percentage of overall samples it's pretty rare. I
The idle detection isn't perfect here - and I've documented some of the issues here https://github.com/benfred/py-spy#how-do-you-detect-if-a-thread-is-idle-or-not for a discussion of the causes. In particular I think you are seeing this issue
Finally, on some Linux calls the ptrace attach that we are using may cause idle threads to wake up momentarily, causing false positives when reading from procfs.
I think a simpler test program would probably just be
import time
time.sleep(1000)
and you'd see roughly similar results. The majority of samples will be discarded - but not all of them on linux, as the code we add to pause the program sends a signal which wakes up the 'sleep' call.
I think this is exacerbated by the irregular sampling feature we added, which I'm considered turning off by default.