python function tracing for both python2 and python3
This is a new PR based on #680, but rebased onto master with several conflict fixes.
@rls1004 Please take a look at it if you're still interested in.
Here are some tests.
$ cat p-abc.py
#!/usr/bin/env python2
import os
def c():
return os.getpid()
def b():
return c()
def a():
return b()
print(a())
$ chmod +x p-abc.py
$ uftrace --python --no-libcall p-abc.py
68270
# DURATION TID FUNCTION
[ 68270] | <module:__main__>() {
[ 68270] | a() {
[ 68270] | b() {
2.610 us [ 68270] | c();
6.717 us [ 68270] | } /* b */
9.997 us [ 68270] | } /* a */
46.204 us [ 68270] | } /* <module:__main__> */
10.053 us [ 68270] | _remove();
4.790 us [ 68270] | _remove();
Another test with fork.
$ cat p-fork.py
#!/usr/bin/env python
import os
def c():
return os.getpid() % 100000
def b():
return c() + 1
def a():
return b() - 1
def main():
ret = 0
pid = os.fork()
if pid:
os.wait()
ret += a()
return ret
if __name__=='__main__':
main()
$ chmod +x p-fork.py
$ uftrace --python --no-libcall p-fork.py
# DURATION TID FUNCTION
[ 68311] | <module:__main__>() {
[ 68311] | main() {
[ 68311] | /* linux:sched-out */
[ 68322] | a() {
[ 68322] | b() {
37.690 us [ 68322] | c();
52.883 us [ 68322] | } /* b */
79.257 us [ 68322] | } /* a */
[ 68322] | } /* main */
[ 68322] | } /* <module:__main__> */
9.980 us [ 68322] | _remove();
4.326 us [ 68322] | _remove();
6.881 ms [ 68311] | /* linux:sched-in */
[ 68311] | a() {
[ 68311] | b() {
8.196 us [ 68311] | c();
15.087 us [ 68311] | } /* b */
24.763 us [ 68311] | } /* a */
7.549 ms [ 68311] | } /* main */
7.569 ms [ 68311] | } /* <module:__main__> */
10.060 us [ 68311] | _remove();
4.130 us [ 68311] | _remove();
However, it's a bit weird that python3 shows some function calls ahead, which begins with _find_and_load().
$ cat p-abc3.py
#!/usr/bin/env python3
import os
def c():
return os.getpid()
def b():
return c()
def a():
return b()
print(a())
$ chmod +x p-abc3.py
$ uftrace --python --no-libcall -D 5 p-abc3.py
68399
# DURATION TID FUNCTION
[ 68399] | _find_and_load() {
4.663 us [ 68399] | __init__();
[ 68399] | __enter__() {
[ 68399] | _get_module_lock() {
8.027 us [ 68399] | __init__();
42.120 us [ 68399] | } /* _get_module_lock */
15.064 us [ 68399] | acquire();
68.584 us [ 68399] | } /* __enter__ */
[ 68399] | _find_and_load_unlocked() {
[ 68399] | _find_spec() {
2.193 us [ 68399] | __enter__();
7.750 us [ 68399] | find_spec();
2.777 us [ 68399] | __exit__();
1.766 us [ 68399] | __enter__();
4.070 us [ 68399] | find_spec();
2.264 us [ 68399] | __exit__();
1.667 us [ 68399] | __enter__();
[ 68399] | find_spec() {
856.770 us [ 68399] | _get_spec();
874.013 us [ 68399] | } /* find_spec */
3.190 us [ 68399] | __exit__();
990.430 us [ 68399] | } /* _find_spec */
[ 68399] | _load_unlocked() {
[ 68399] | module_from_spec() {
1.850 us [ 68399] | create_module();
4.090 us [ 68399] | _new_module();
171.060 us [ 68399] | _init_module_attrs();
196.000 us [ 68399] | } /* module_from_spec */
[ 68399] | exec_module() {
652.530 us [ 68399] | get_code();
534.144 us [ 68399] | _call_with_frames_removed();
1.205 ms [ 68399] | } /* exec_module */
2.983 us [ 68399] | _verbose_message();
1.436 ms [ 68399] | } /* _load_unlocked */
2.459 ms [ 68399] | } /* _find_and_load_unlocked */
[ 68399] | __exit__() {
9.910 us [ 68399] | release();
15.010 us [ 68399] | } /* __exit__ */
5.803 us [ 68399] | cb();
2.591 ms [ 68399] | } /* _find_and_load */
11.496 us [ 68399] | getpreferredencoding();
[ 68399] | __init__() {
2.343 us [ 68399] | __init__();
8.663 us [ 68399] | } /* __init__ */
13.974 us [ 68399] | decode();
[ 68399] | <module:__main__>() {
[ 68399] | a() {
[ 68399] | b() {
5.494 us [ 68399] | c();
10.254 us [ 68399] | } /* b */
14.280 us [ 68399] | } /* a */
57.117 us [ 68399] | } /* <module:__main__> */
I will take a look at it when I have some time.
@namhyung Oops. Sorry that I pushed this work into your repository. My repository setting had some problems. Would it be better to close and open a new PR in my repository?
The python3 tracing shows different result in another machine.
$ uftrace --python --no-libcall p-abc3.py
11872
# DURATION TID FUNCTION
14.206 us [ 11872] | getpreferredencoding();
[ 11872] | __init__() {
2.947 us [ 11872] | __init__();
11.820 us [ 11872] | } /* __init__ */
17.623 us [ 11872] | decode();
[ 11872] | <module:__main__>() {
[ 11872] | a() {
[ 11872] | b() {
5.407 us [ 11872] | c();
10.047 us [ 11872] | } /* b */
14.130 us [ 11872] | } /* a */
73.452 us [ 11872] | } /* <module:__main__> */
5.713 us [ 11872] | _ag();
$ python3 --version
Python 3.6.8
@namhyung Oops. Sorry that I pushed this work into your repository. My repository setting had some problems. Would it be better to close and open a new PR in my repository?
Nop, it's ok, just leave it :)
I want to have a single module to cover both python v2 or v3. Then I think we should finish #897 first.
A similar log related to _find_and_load() appears when importing a module using importlib.
p-abc3.py :
#!/usr/bin/env python3
import importlib
os = importlib.import_module("os")
def c():
return os.getpid()
def b():
return c()
def a():
return b()
print(a())
result :
$ ./uftrace --python --no-libc ../test/p-abc3.py
14098
# DURATION TID FUNCTION
4.397 us [ 14098] | getpreferredencoding();
[ 14098] | __init__() {
0.685 us [ 14098] | __init__();
3.236 us [ 14098] | } /* __init__ */
4.202 us [ 14098] | decode();
[ 14098] | <module:__main__>() {
[ 14098] | import_module() {
[ 14098] | _gcd_import() {
1.676 us [ 14098] | _sanity_check();
[ 14098] | _find_and_load() {
0.970 us [ 14098] | __init__();
[ 14098] | __enter__() {
[ 14098] | _get_module_lock() {
2.333 us [ 14098] | __init__();
10.168 us [ 14098] | } /* _get_module_lock */
4.209 us [ 14098] | acquire();
17.157 us [ 14098] | } /* __enter__ */
[ 14098] | __exit__() {
2.791 us [ 14098] | release();
4.134 us [ 14098] | } /* __exit__ */
1.644 us [ 14098] | cb();
[ 14098] | _lock_unlock_module() {
[ 14098] | _get_module_lock() {
1.676 us [ 14098] | __init__();
7.506 us [ 14098] | } /* _get_module_lock */
2.683 us [ 14098] | acquire();
2.258 us [ 14098] | release();
15.392 us [ 14098] | } /* _lock_unlock_module */
1.370 us [ 14098] | cb();
50.181 us [ 14098] | } /* _find_and_load */
55.201 us [ 14098] | } /* _gcd_import */
58.700 us [ 14098] | } /* import_module */
[ 14098] | a() {
[ 14098] | b() {
1.601 us [ 14098] | c();
2.728 us [ 14098] | } /* b */
3.880 us [ 14098] | } /* a */
246.309 us [ 14098] | } /* <module:__main__> */
1.389 us [ 14098] | _ag();
p-abc3.py for python 2 :
$ ./uftrace --python --no-libc ../test/p-abc3.py
14193
# DURATION TID FUNCTION
[ 14193] | <module:__main__>() {
35.421 us [ 14193] | <module:importlib>();
4.749 us [ 14193] | import_module();
[ 14193] | a() {
[ 14193] | b() {
6.920 us [ 14193] | c();
8.125 us [ 14193] | } /* b */
9.174 us [ 14193] | } /* a */
2.903 ms [ 14193] | } /* <module:__main__> */
4.546 us [ 14193] | _remove();
1.435 us [ 14193] | _remove();
@rls1004 Thanks for testing. I just found that my system indirectly calls python3 as follows:
$ which python3
/home/honggyu/src/depot_tools/git/depot_tools/python3
It's actually a shell script and its content is as follows:
$ cat /home/honggyu/src/depot_tools/git/depot_tools/python3
#!/usr/bin/env bash
base_dir=$(dirname "$0")
PYTHON3_BIN_RELDIR="$(cat $base_dir/python3_bin_reldir.txt | xargs echo)"
PATH="bootstrap-3.8.0b1.chromium.1_bin/python3/bin":"bootstrap-3.8.0b1.chromium.1_bin/python3/bin/Scripts":"$PATH"
"$base_dir/bootstrap-3.8.0b1.chromium.1_bin/python3/bin/python3" "$@"
That's the reason of the difference. I tested it again by specifying the shebang explicitly to /usr/local/bin/python3 instead of /usr/bin/env python3 as follows:
$ cat p-abc3-2.py
#!/usr/local/bin/python3
import os
def c():
return os.getpid()
def b():
return c()
def a():
return b()
print(a())
The result is same as expected without long trace output inside _find_and_load.
$ uftrace --python --no-libcall p-abc3-2.py
21656
# DURATION TID FUNCTION
15.450 us [ 21656] | getpreferredencoding();
[ 21656] | __init__() {
2.947 us [ 21656] | __init__();
10.163 us [ 21656] | } /* __init__ */
15.436 us [ 21656] | decode();
[ 21656] | <module:__main__>() {
[ 21656] | a() {
[ 21656] | b() {
2.883 us [ 21656] | c();
7.733 us [ 21656] | } /* b */
12.054 us [ 21656] | } /* a */
58.123 us [ 21656] | } /* <module:__main__> */
Here is another test with uftrace/tests/runtest.py.
$ ./runtest.py 001 -p -O0
Start 1 tests with 1 worker
Test case pg
------------------------: O0
001 basic : OK
runtime test stats
====================
total 1 Tests executed (success: 100.00%)
OK: 1 Test succeeded
OK: 0 Test succeeded (with some fixup)
NG: 0 Different test result
NZ: 0 Non-zero return value
SG: 0 Abnormal exit by signal
TM: 0 Test ran too long
BI: 0 Build failed
LA: 0 Unsupported Language
SK: 0 Skipped
If uftrace tries to record the execution, it gets stuck and never finish as follows:
$ uftrace record --python --no-libcall ./runtest.py 001 -p -O0
Start 1 tests with 1 worker
Test case pg
------------------------: O0
(... got stuck here ...)
The remaining record looks fine and need to find the reason of getting stuck here.
Here is replay output snippet.
# DURATION TID FUNCTION
[ 23175] | <module:__main__>() {
[ 23175] | <module:random>() {
[ 23175] | <module:__future__>() {
4.657 us [ 23175] | _Feature();
4.857 us [ 23175] | __init__();
3.023 us [ 23175] | __init__();
...
[ 23175] | __call__() {
1.874 us [ 23175] | sub_debug();
247.328 us [ 23190] | } /* call */
263.652 us [ 23190] | } /* __del__ */
274.966 us [ 23190] | } /* _get_candidate_names */
[ 23175] | _terminate_pool() {
[ 23190] | next() {
2.227 us [ 23175] | debug();
11.997 us [ 23190] | } /* next */
2.020 us [ 23175] | debug();
[ 23175] | _help_stuff_finish() {
[ 23190] | next() {
1.303 us [ 23175] | debug();
8.293 us [ 23190] | } /* next */
32.261 us [ 23190] | choice();
[ 23190] | _get_candidate_names() {
2.660 us [ 23190] | __del__();
9.476 us [ 23190] | } /* _get_candidate_names */
8.093 us [ 23190] | next();
7.423 us [ 23190] | next();
7.067 us [ 23190] | choice();
225.588 us [ 23195] | } /* _put */
244.589 us [ 23195] | } /* put */
258.959 us [ 23195] | } /* apply_async */
15.040 us [ 23195] | _eintr_retry_call();
uftrace stopped tracing with remaining functions
================================================
task: 23175
[5] _help_stuff_finish
[4] _terminate_pool
[3] __call__
[2] _run_finalizers
[1] _exit_function
[0] _run_exitfuncs
task: 23190
[3] rng
[2] getenv
[1] _candidate_tempdir_list
[0] _newname
task: 23195
[3] _handle_exitstatus
[2] _set_cloexec_flag
[1] pipe_cloexec
[0] _get_default_tempdir
task: 23200
[1] __stop
[0] _exitfunc
It seems that it gets stuck duing pipe_cloexec.
$ uftrace replay --tid 23195
# DURATION TID FUNCTION
[ 23195] | check_serial_case() {
29.300 us [ 23195] | currentThread();
33.200 us [ 23195] | <module:t001_basic>();
[ 23195] | <module:runtest>() {
[ 23195] | __init__() {
5.113 us [ 23195] | __init__();
[ 23195] | RLock() {
...
[ 23195] | pipe_cloexec() {
[ 23195] | _set_cloexec_flag() {
13.141 us [ 23195] | _eintr_retry_call();
[ 23195] | _handle_exitstatus() {
9.667 us [ 23195] | _eintr_retry_call();
9.410 us [ 23195] | _internal_poll();
[ 23195] | apply_async() {
[ 23195] | put() {
225.588 us [ 23195] | _put();
244.589 us [ 23195] | } /* put */
258.959 us [ 23195] | } /* apply_async */
15.040 us [ 23195] | _eintr_retry_call();
uftrace stopped tracing with remaining functions
================================================
task: 23195
[3] _handle_exitstatus
[2] _set_cloexec_flag
[1] pipe_cloexec
[0] _get_default_tempdir
There seems to be a problem when storing data in a shared object.
The result of the test process is stored in the shared.results dictionary via the callback function save_test_result().
https://github.com/namhyung/uftrace/blob/322185a09c85e280733949db661ce975253aef7f/tests/runtest.py#L689-L697
https://github.com/namhyung/uftrace/blob/322185a09c85e280733949db661ce975253aef7f/tests/runtest.py#L552-L559
After each test is completed, the value stored in the shared object is displayed.
https://github.com/namhyung/uftrace/blob/322185a09c85e280733949db661ce975253aef7f/tests/runtest.py#L708-L714
However, when tracing using uftrace, data is not saved properly in the shared object.
Thus, because there is no value stored in shared.results, it cannot exit the while statement and becomes infinite loop.
Here is a PoC.
#!/usr/bin/env python
import multiprocessing
from functools import partial
def foo(name):
return name
def save(result, shared):
shared.results[result] = 1
num_list = ['p1', 'p2', 'p3', 'p4']
manager = multiprocessing.Manager()
shared = manager.dict()
shared.results = dict()
clbk = partial(save, shared=shared)
pool = multiprocessing.Pool(processes=2)
serial_pool = multiprocessing.Pool(1)
for name in num_list:
serial_pool.apply_async(foo, args=[name], callback=clbk)
pool.close()
pool.join()
print(shared.results)
Here is normal result.
$ python mp.py
{'p1': 1, 'p2': 1, 'p3': 1, 'p4': 1}
Run with uftrace.
$ uftrace record --python --no-libc ./mp.py
{}
@rls1004 Thanks very much for writing the simple reproducible script.
I've tested it with uftrace and got the same problem.
$ ./mp.py
{'p2': 1, 'p3': 1, 'p1': 1, 'p4': 1}
$ uftrace record --python --no-libcall ./mp.py
{}
However, it's not a problem of our uftrace implementation. It seems that the problem comes from the sys.settrace that the current implementation relies on.
The builtin feature also shows the same problem as follows:
$ python -mtrace --trackcalls ./mp.py
{}
calling relationships:
*** ./mp.py ***
--> /usr/lib/python2.7/functools.py
mp.<module> -> functools.<module>
--> /usr/lib/python2.7/multiprocessing/__init__.py
mp.<module> -> __init__.<module>
mp.<module> -> __init__.Manager
mp.<module> -> __init__.Pool
...
I've also tested it with python3, but it doesn't make any difference unfortunately.
We have to find another way or fix the problem in the mainline cpython implementation.
@rls1004 If it's difficult to fix the problem inside CPython, is it possible you to propose the correct tracing feature in RustPython community? The only thing we need is to have hooks for python function entry and exit.
In addition, our final goal is to provide the unified function trace across python script and its internal interpreter together.
The current sys.settrace is very heavy and puts a lot of overhead so I hope the python interpreter provides lightweight hooks for python script.
@rls1004 Thanks for making a bug report to python community.
- https://bugs.python.org/issue38511
Let's see if it can be fixed.