gh-111997: C-API for signalling monitoring events
- Issue: gh-111997
Do I understand correctly that the CodeLike type is expected to be provided by user code?
I think we should provide CodeLike in some form, but for now I put it in the text file so we can talk about how and where we want it to be. The offset to location mapping should also be part of this.
I don't see why this public API should live in an internal header file. And why the underscore names? We only use(d) them for non-public functions.
@markshannon did you intend for these to be private?
@markshannon did you intend for these to be private?
My thinking was that it is better to keep any function private unless we need it to be public. It sounds like they need to be public.
How about this?
- The declaration of the implementation function,
_PyMonitoring_FirePyStartEvent(), goes in the semi-privatecpython/folder. - We expose an inline function calling for the unstable API and a non-inline one for the stable API in the public header.
Something like this:
#ifndef Py_LIMITED_API
static inline int
PyMonitoring_FirePyStartEvent(PyMonitoringState *state, PyObject *codelike, uint32_t offset)
{
if (state->active)
return _PyMonitoring_FirePyStartEvent(state, codelike, offset);
else
return 0;
}
#else
extern int
PyMonitoring_FirePyStartEvent(PyMonitoringState *state, PyObject *codelike, uint32_t offset);
#endif
The "code like" object is entirely opaque to the monitoring machinery. As far as it is concerned, a location is a pair of an object and an integer index.
Debuggers, profilers, etc need to present that object, int as a meaningful location. If the object is a code object, they can already do that using co.co_filename and co.co_positions().
The point of a "code like" object is to provide the necessary interface for tools (debuggers, profilers, etc) to turn object, int into a meaningful source location.
So, no C struct, but a Python abstract base class.
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.
In case my earlier comments were a bit cryptic:
- The
activefield should only be modified by the VM, so tests will need to callset_eventsbefore and after the recording run. - The Python API provided by
_testcapishould be as thin a layer as possible around the underlying C function. - It may help the tests to implement a
tp_callfunction that callsPyMonitoring_BeginScope, then fires events.
Finally, maybe we should add a PyMonitoring_ExitScope, as it would allow the VM to track the current code-like object?
Finally, maybe we should add a
PyMonitoring_ExitScope, as it would allow the VM to track the current code-like object?
What do you mean by track?
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.
Thanks for making the requested changes!
@markshannon: please review the changes made to this pull request.
What do you mean by track?
Stash on the thread state or in the frame stack, perhaps.
The idea being that the "code-like" object can be displayed in tracebacks and the like.
Nothing may come of this, but we can't add it after PyMonitoring_BeginScope is added to the API, and it would add symmetry.
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.
I have made the requested changes; please review again.
Thanks for making the requested changes!
@markshannon: please review the changes made to this pull request.
This is a very big addition to the limited API & stable ABI. Does it need to be added there?
Even if it is, I recommend that the initial pull request only adds it to public C API (in Include/cpython/, with the #include guarded in #ifndef Py_LIMITED_API, like here: https://github.com/python/cpython/blob/b3e8c78ed7aa9bbd1084375587b99200c687cec9/Include/object.h#L1211-L1215).
Note that Include/cpython/ isn't “semi-private”. It's a place to put declarations guarded by #ifndef Py_LIMITED_API.
The tiers of the C API are described in https://devguide.python.org/developer-workflow/c-api/
I'm confused by having both PyMonitoring_EnterScope and PyUnstable_Monitoring_EnterScope. I don't see a reason to add a function to both the limited and unstable tier. Pick one, based on how long you want to support the API.
I'm confused by having both
PyMonitoring_EnterScopeandPyUnstable_Monitoring_EnterScope.
The Unstable one is optimized (
I'm confused by having both
PyMonitoring_EnterScopeandPyUnstable_Monitoring_EnterScope. I don't see a reason to add a function to both the limited and unstable tier. Pick one, based on how long you want to support the API.
It wasn't necessary for EnterScope/ExitScope, I removed those. We still have the two versions for the Fire*Event functions, the Unstable one is faster.
This is a very big addition to the limited API & stable ABI. Does it need to be added there?
I've been wondering too. Should we keep this simpler and more contained for now? @markshannon, @scoder ?
If you don't include this in the stable ABI, there'll be no need for the PyUnstable_ versions.
If you do, you still only need one name: choose the implementation based whether the user is compiling for the stable ABI. (But again, I recommend adding to stable ABI this in a later commit, if at all.)
This is a very big addition to the limited API & stable ABI. Does it need to be added there?
It's functionality that is not otherwise available. Given that it was never usable in the Limited API so far, we'd stay with the status quo that profiling, debugging and coverage-testing non-Python code are not possible there.
I don't think that's desirable, but for debugging and coverage, it might be acceptable to compile and run them outside of the Limited API. For profiling, it'd be annoying but possible at least for Cython users to resort to native profilers. They wouldn't give you source level profiling, but would still hint at the right code spots.
At least, since these are developer features rather than end user features, adding them in a later Limited API release would allow many developers (those who can rebuild their dependencies from sources) to make use of them directly in that version, without having to wait for their minimum supported stable ABI to catch up on end user side.
@scoder @encukou @markshannon I removed the limited API part, so now we just have the inlined version. Can we all agree on this?
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.
I have made the requested changes; please review again.
Thanks for making the requested changes!
@markshannon: please review the changes made to this pull request.
Thanks! This looks good, I'd just move things around. See https://github.com/iritkatriel/cpython/pull/56 for how I'd organize the headers.
Not as it is, as there's no documentation. Without that it's hard to tell if the API does what it should do, or what codelike or version should be.
Not as it is, as there's no documentation. Without that it's hard to tell if the API does what it should do, or what codelike or version should be.
I added docs, please review.
codelike is not part of the API (it's just an object that is passed through to the callback). We may add a codelike ABC, but it's a separate story.
So, state_array and a corresponding version should be allocated and used together; they must be valid from the call PyMonitoring_EnterScope to the end of the corresponding PyMonitoring_ExitScope. Typically they are part of a CodeObject. Is that right?
Yes, that's correct.
Or can one have multiple scopes per “code object”? The Enter/Exit terminology makes me think of frames rather than code objects. If I emulate a code object for a recursive function, should PyMonitoring_EnterScope be called when the codelike is created, or when it's (first?) called?
By "scope" I mean lexical scope. Typically a function, but could be a class or method. PyMonitoring_EnterScope should be called whenever the lexical scope is entered.
For a code generator like Cython, the call to PyMonitoring_EnterScope would be placed at the start of the generated C function, and the call to PyMonitoring_ExitScope at the end of the function, just before the return.
The only reason a static variable couldn't be used is sub-interpreters. If a static variables were used, enabling an event in one interpreter would enable it in all.
@encukou are you OK with this being merged?
I'm still writing up the initial implementation in Cython, but so far, ISTM that the version passed into PyMonitoring_EnterScope() is useless, at least for me. I cannot store it statically in C, nor in another global place (like the function object or code object). I can really only make it a local variable that is always 0 (as is the PyMonitoringState array), which defeats its purpose. Any ideas how to make actual use of it?
Another thing that I noticed is that it's unclear how the tracing status and the exception state is managed. Meaning, am I (as a user) expected to disable tstate->tracing before calling into PyMonitoring_Fire*()? And/or save and restore the exception state?
My first try was to have the traced function and the event handler in the same (compiled) file, which meant that the event handler was also instrumented to report events, thus running into this assertion:
#6 0x00007ffff7cc2e96 in __GI___assert_fail (assertion=assertion@entry=0x555555a5b9b5 "tstate->tracing == 0", file=file@entry=0x555555a5b840 "Python/instrumentation.c", line=line@entry=855, function=function@entry=0x555555a5cc70 <__PRETTY_FUNCTION__.56> "call_one_instrument") at ./assert/assert.c:101
#7 0x00005555558b4d87 in call_one_instrument (interp=<optimized out>, tstate=0x555555c31608 <_PyRuntime+280776>, args=0x7fffffffa898, nargsf=-9223372036854775806, tool=<optimized out>, event=0) at Python/instrumentation.c:850
#8 call_one_instrument (event=0, tool=<optimized out>, nargsf=-9223372036854775806, args=0x7fffffffa898, tstate=0x555555c31608 <_PyRuntime+280776>, interp=<optimized out>) at Python/instrumentation.c:850
#9 capi_call_instrumentation (state=state@entry=0x7fffffffa914, codelike=codelike@entry=0x7ffff4f694e0, offset=offset@entry=70, args=args@entry=0x7fffffffa890, nargs=<optimized out>, nargs@entry=2, event=event@entry=0) at Python/instrumentation.c:2318
#10 0x00005555558ba2ea in _PyMonitoring_FirePyStartEvent (state=state@entry=0x7fffffffa914, codelike=codelike@entry=0x7ffff4f694e0, offset=offset@entry=70) at Python/instrumentation.c:2375
#11 0x00007ffff6c09d1f in PyMonitoring_FirePyStartEvent (offset=70, codelike=0x7ffff4f694e0, state=0x7fffffffa914) at /home/stefan/ablage/software/Python/python-git/Include/cpython/monitoring.h:125
#12 __Pyx__TraceStart (state_array=state_array@entry=0x7fffffffa914, code_obj=0x7ffff4f694e0, firstlineno=firstlineno@entry=70, is_generator=is_generator@entry=0) at sys_monitoring.c:1685
#13 0x00007ffff6c1ae1e in __pyx_pf_14sys_monitoring_16monitored_events_count_event (__pyx_self=__pyx_self@entry=0x7fffed974210, __pyx_v_event=0x555555bf1b20 <_PyRuntime+19936>, __pyx_v_code_obj=0x7ffff4f6bd30, __pyx_v_offset=0x555555bf26e0 <_PyRuntime+22944>, __pyx_v_args=__pyx_v_args@entry=0x555555c03ec0 <_PyRuntime+94592>) at sys_monitoring.c:4347
#14 0x00007ffff6c1b340 in __pyx_pw_14sys_monitoring_16monitored_events_1count_event (__pyx_self=0x7fffed974210, __pyx_args=<optimized out>, __pyx_nargs=3, __pyx_kwds=0x0) at sys_monitoring.c:4316
#15 0x00007ffff6c0707e in __Pyx_CyFunction_Vectorcall_FASTCALL_KEYWORDS (func=0x7fffed974210, args=0x7fffffffab60, nargsf=<optimized out>, kwnames=0x0) at sys_monitoring.c:15028
#16 0x00005555559b3245 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=3, args=0x7fffffffab60, callable=0x7fffed974210, tstate=0x555555c31608 <_PyRuntime+280776>) at ./Include/internal/pycore_call.h:168
#17 partial_vectorcall (pto=<optimized out>, args=0x7fffffffab68, nargsf=<optimized out>, kwnames=0x0) at ./Modules/_functoolsmodule.c:240
#18 0x00005555558b4b8f in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775810, args=0x7fffffffab68, callable=0x7fffeda94130, tstate=0x555555c31608 <_PyRuntime+280776>) at ./Include/internal/pycore_call.h:168
#19 call_one_instrument (event=0, tool=<optimized out>, nargsf=-9223372036854775806, args=0x7fffffffab68, tstate=0x555555c31608 <_PyRuntime+280776>, interp=<optimized out>) at Python/instrumentation.c:863
I think it would be nicer if the API took over the responsibility of guarding against (infinite) recursive tracing of event handler functions, if needed. I think it's sensible to disable tracing before starting to dispatch the events that are being traced to handlers.
So, state_array and a corresponding version should be allocated and used together; they must be valid from the call PyMonitoring_EnterScope to the end of the corresponding PyMonitoring_ExitScope. Typically they are part of a CodeObject. Is that right?
Yes, that's correct.
Or can one have multiple scopes per “code object”? The Enter/Exit terminology makes me think of frames rather than code objects. If I emulate a code object for a recursive function, should PyMonitoring_EnterScope be called when the codelike is created, or when it's (first?) called?
By "scope" I mean lexical scope. Typically a function, but could be a class or method.
PyMonitoring_EnterScopeshould be called whenever the lexical scope is entered.For a code generator like Cython, the call to
PyMonitoring_EnterScopewould be placed at the start of the generated C function, and the call toPyMonitoring_ExitScopeat the end of the function, just before thereturn.
I think this the kind of info that users need to know to use the feature. It should be in the documentation, along with info that:
- scopes can be nested (e.g. when emulating recursive functions), reusing the same state_array and version
- scopes need to be exited and re-entered when the code-like's execution is paused (e.g. when emulating a generator or
asyncfunction)
(Assuming that's correct. In the implementation, “Enter” reads more like a “cache update” and “Exit” is a no-op, and I'm having some trouble reverse-engineering how that's intended to map to lexical scopes. What is the Exit reserved for?)
ISTM that the
versionpassed intoPyMonitoring_EnterScope()is uselessscopes need to be exited and re-entered when the code-like's execution is paused
Ah, right. I hadn't looked into generator/async functions yet for the implementation. I can see the version being helpful there. They keep their execution state, so there is a place for them to also store away their profiling state.