
Display native code frames.

Open markdewing opened this issue 10 years ago • 10 comments

This is useful for profiling extension modules and Cython. Support for Numba is coming (the JIT addresses still need to be resolved).

Filter out native code frames inside the Python library. Eventually other libraries may be filtered out as well (ctypes.so, etc.)
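The filtering described above could look roughly like the following. This is a minimal sketch, not the code in this pull request: the `(kind, symbol, library_path)` frame tuples, the `SKIPPED_LIBRARIES` set, and the library file names are all assumptions made for illustration.

```python
import os

# Hypothetical set of shared objects whose native frames are hidden.
SKIPPED_LIBRARIES = {"libpython2.7.so.1.0"}


def filter_native_frames(stack, skipped=SKIPPED_LIBRARIES):
    """Drop native frames that belong to a skipped library.

    `stack` is a list of (kind, symbol, library_path) tuples, where kind
    is "py" for Python frames and "native" for native frames (an assumed
    layout). Python frames are always kept.
    """
    kept = []
    for kind, symbol, library_path in stack:
        if kind == "native" and os.path.basename(library_path) in skipped:
            continue
        kept.append((kind, symbol, library_path))
    return kept


# Made-up sample stack: one Python frame, one interpreter frame, one extension frame.
sample = [
    ("py", "main", "example.py"),
    ("native", "PyEval_EvalFrameEx", "/usr/lib/libpython2.7.so.1.0"),
    ("native", "compute", "/home/user/ext/fastmath.so"),
]
filtered = filter_native_frames(sample)
```

Extending the filter to other libraries would then only mean adding their file names to the set.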

The default for 'vmprofshow' is now to show native frames (except under PyPy, in order to preserve existing behavior). The command line flag '--python_only' will limit the display to only the Python frames.

Add a simple extension module for testing mixed Python and native stacks. These are not connected to the test system.

Untested with vmprof-server, but the upload options are unchanged, so it should be okay.

markdewing avatar Nov 05 '15 19:11 markdewing

Hi Mark

After a massive rewrite, we decided to completely drop support for C frames together with libunwind due to libunwind problems. However, there is a path towards supporting libunwind "optionally", that is, when available on the platform. Would you be willing to help with such transition?

fijal avatar Jan 19 '16 09:01 fijal

Yes, I am willing to look into it. I assume you still need to walk the native stack at each sample to get to the Python stacks? Are you using something other than libunwind, or do you have another way to get the Python part of the stack info?

markdewing avatar Jan 27 '16 14:01 markdewing

We use PyThreadState_Current to walk the Python stack, and we would need libunwind to read the C stack.
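As a rough Python-level analogue of walking the Python stack (vmprof itself does this in C, starting from the thread state), the frame chain can be followed through `f_back`. This is an illustrative sketch only: `walk_python_stack`, `inner`, and `outer` are hypothetical names, and `sys._getframe` is CPython-specific.

```python
import sys


def walk_python_stack(frame=None):
    """Return (function name, filename, line number) tuples, innermost first."""
    if frame is None:
        # CPython-specific: start from the caller's frame.
        frame = sys._getframe(1)
    frames = []
    while frame is not None:
        code = frame.f_code
        frames.append((code.co_name, code.co_filename, frame.f_lineno))
        frame = frame.f_back  # step to the calling frame
    return frames


def inner():
    return walk_python_stack()


def outer():
    return inner()


stack = outer()
```

Only Python frames appear in the result. Any C frames between them are invisible to this walk, which is why a native unwinder is needed for the C stack.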

fijal avatar Jan 28 '16 11:01 fijal

For some bits of the implementation, I did a module for perf sampling of Python code. The interesting function is record_traceback() in https://github.com/giraldeau/python-profile-ust/blob/master/ext/sampling.c

giraldeau avatar Jan 28 '16 15:01 giraldeau

vmprof does essentially the same thing (with additional support for multiple operating systems and multiple VMs). Note that some of the functions you call might not be signal-handler-friendly.

fijal avatar Jan 28 '16 15:01 fijal

Hum, does vmprof record cache misses and instructions? I know that the prototype I did was limited in various ways, but it was for research/graduating purposes. Anyway, if you want to know more about the project, here is the (unpublished) paper: http://secretaire.dorsal.polymtl.ca/~fgiraldeau/misc/execution-path-profiling-wiley.pdf

As for signal safety, the strings are saved only if they are already in UTF-8, and no memory allocation occurs, which should be signal safe (the signal handler only reads memory). The only thing that bothers me is that pointer assignments to the frame structure can be reordered by the compiler, making (for instance) f_code visible before co_name is set. However, since the code is loaded upfront and we assume that strings are in UTF-8, this situation should not occur in practice.

Anyway, I do not intend to maintain the prototype I did, and if the good pieces (if such a thing exists) can be ported to vmprof, then it's cool. Cheers!

giraldeau avatar Jan 28 '16 15:01 giraldeau

As far as I know, vmprof does not make use of hardware performance counters. If you ask me, vmprof would be the right platform for gathering and displaying such information. Pull requests welcome, I'm happy to review them.

planrich avatar Jul 28 '16 07:07 planrich

So from this conversation I get that native stacks are a pain to parse properly, but could we at least keep them in the output and just write "" or something? Because just dropping them will skew the profiling result, but showing them unparsed is still pretty useful I believe.

boxed avatar Nov 02 '18 23:11 boxed

@boxed Native frames are supported now.

I think this whole discussion is obsolete.

rlamy avatar Nov 03 '18 00:11 rlamy

Oh. Cool.

boxed avatar Nov 03 '18 00:11 boxed