cpython
cpython copied to clipboard
gh-91053: Add an optional callback that is invoked whenever a function is modified
JIT compilers may need to invalidate compiled code when a function is modified (e.g. if its code object is modified). This adds the ability to set a callback that, when set, is called each time a function is modified.
- Issue: gh-91053
pyperformance results are perf-neutral.
I also wrote a microbenchmark that attempts to measure the overhead on function creation when no callbacks are present. Happy to use a different benchmark and/or do this a bit more rigorously if there is concern. Results on the microbenchmark are perf-neutral as well:
This PR with no function watchers set:
mpage@ip-172-31-41-159:~$ ./cpython/python bench_comprehensions.py
0.08847772800072562
mpage@ip-172-31-41-159:~$ ./cpython/python bench_comprehensions.py
0.09016199100005906
mpage@ip-172-31-41-159:~$ ./cpython/python bench_comprehensions.py
0.08949500100061414
mpage@ip-172-31-41-159:~$ ./cpython/python bench_comprehensions.py
0.09101104000001214
mpage@ip-172-31-41-159:~$ ./cpython/python bench_comprehensions.py
0.08842268300213618
mpage@ip-172-31-41-159:~$
Upstream:
mpage@ip-172-31-41-159:~$ ./upstream/python bench_comprehensions.py
0.08706012900074711
mpage@ip-172-31-41-159:~$ ./upstream/python bench_comprehensions.py
0.09116156500022043
mpage@ip-172-31-41-159:~$ ./upstream/python bench_comprehensions.py
0.08924523199675605
mpage@ip-172-31-41-159:~$ ./upstream/python bench_comprehensions.py
0.0921605939984147
mpage@ip-172-31-41-159:~$ ./upstream/python bench_comprehensions.py
0.08725003300060052
All benchmarks were run on a bare metal machine in AWS that was configured using pyperf system tune
.
The failing test appears unrelated to this diff. Is there a way that I can re-run the test without adding additional commits?
The failing test appears unrelated to this diff. Is there a way that I can re-run the test without adding additional commits?
You can merge in main
You can merge in
main
Or any core dev (and probably triager) can trigger a rerun, which I did already and it's now passed.
PyFunction_SetDefaults
should emit aPyFunction_EVENT_MODIFY_DEFAULTS
event, no?PyFunction_SetKwDefaults
should emit aPyFunction_EVENT_MODIFY_KWDEFAULTS
event, no?
Yep, good catch!
- Why no events for
func_closure
andfunc_annotations
modifications?
func_closure
is read-only and func_annotations
isn't useful for our optimization purposes.
Also, please add the Python test code to
Lib/test/test_capi.py
instead of adding a newLib/test/test_func_events.py
, like the dict watchers did.
Sure thing, will do.
I don't like having watchers for function creation and destruction.
Having those watchers adds overhead for every creation of a comprehension or nested function, which could be an issue for performance for two reasons:
- The overhead. While the overhead is hard to measure now, that is more a comment on how inefficient function object creation currently is, rather than the overhead being negligible.
- It prevents the removal of functions by the optimizer. For example, there is no reason for a function to be created for a list comprehension. An optimizer could avoid creating the function and preserve semantics. If creation of a function is always observable, this optimization becomes impossible.
Discussed offline with @markshannon and reached these conclusions (Mark please let me know if any of this doesn't match your understanding):
- In the long term it might be better if instead of setting function vectorcall (and thus needing func-created hook so we can set it), we could rely on the core runtime to tell us when a function is hot and defer to our execution in that case. (This can be made compatible with pre-fork compilation if we eagerly compile code objects when they are created, and then just pull the pointer to the compiled code out of our cache when the runtime says "this function is hot, please execute it for me.") But this is all hypothetical at the moment and still requires a lot of design, implementation, and verification that it actually works, so it's a bit premature to rely on this solution.
- The "observability of function creation" issue doesn't need to be a blocker here. The documentation for the hook can be clear that optimized execution of Python code is free to elide function-object creation where possible, and if function object creation is optimized away, no function-created hook will be fired, and this is not considered a semantic change.
- The overhead here really isn't much at all compared to the existing work of creating a function object. Let's try to optimize it a bit more so that in the no-watchers case it's just a single read/test (per Mark's inline comment), and that'll be good enough.
We also discussed a possible alternative design where code objects have a vectorcall pointer and function objects just copy it from the code object; then we wouldn't need func-created hook. But this also adds an extra read to function creation, as well as making code objects bigger, so it's not clear this is really a better approach.
LGTM, thanks! I'll wait for @markshannon's thumbs up before merging.