pybind11 icon indicating copy to clipboard operation
pybind11 copied to clipboard

[BUG]: Missing calls from a cProfile profile when calling into a pybind11 extension

Open itamaro opened this issue 8 months ago • 2 comments

Required prerequisites

  • [x] Make sure you've read the documentation. Your issue may be addressed there.
  • [x] Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
  • [x] Consider asking first in the Gitter chat room or in a Discussion.

What version (or hash if on master) of pybind11 are you using?

tried with both 2.13.6 and master

Problem description

Consider the following Python code using cProfile:

import cProfile
import sys

import ext

def call_me() -> None:
    obj = ext.CppClass()
    for _ in range(10):
        obj.call_me()

def main() -> None:
    print("Profiling...\n\n")
    cProfile.run("call_me()")

if __name__ == "__main__":
    main()

where ext is the following pybind11 extension:

#include <pybind11/pybind11.h>

namespace myspace {
class CppClass {};
} // namespace myspace

namespace py = pybind11;

PYBIND11_MODULE(ext, m) {
  m.doc() = "Simple C++ Python extension using pybind11";

  m.def("return_zero", []() { return 0; });

  py::class_<myspace::CppClass>(m, "CppClass")
      .def(py::init<>())
      .def("call_me", [](myspace::CppClass& self) {
        printf("hello!\n");
        return 0;
      });
}

Running this under Python 3.12.8 produces the following output:

Profiling...


hello!
hello!
hello!
hello!
hello!
hello!
hello!
hello!
hello!
hello!
         1 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Notably, the resulting profile doesn't include anything about the Python call to call_me(), nor the 10 calls to the pybind11 call_me() method on the obj instance. We can see from the "hello" prints in the output that the method was indeed called.

Interestingly, when changing the call inside the loop to ext.return_zero() instead, the profile produced is as expected:

Profiling...


         14 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 repro.py:11(call_me)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
       10    0.000    0.000    0.000    0.000 {built-in method ext.return_zero}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Using Python 3.10 instead of 3.12, the second profile is identical, while the first one is still missing some calls, but it's not missing the initial Python call_me() call:

Profiling...


hello!
hello!
hello!
hello!
hello!
hello!
hello!
hello!
hello!
hello!
         4 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 repro.py:11(call_me)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

I think this is likely pybind11-specific, because I tried reproducing this with an equivalent C++ extension using hand crafted C-API, and couldn't observe the issue.

Reproducible example code


Is this a regression? Put the last known working version here if it is.

Not a regression

itamaro avatar Apr 03 '25 22:04 itamaro

ChatGPT ideas:

  • https://chatgpt.com/share/67eed76e-7c20-8008-8f71-a9367824487b

🧪 Possible directions to investigate: Python 3.12 changes:

Python 3.12 introduced a lot of changes to the interpreter internals, including optimizations in frame evaluation (PEP 659, PEP 523 hooks, etc.).

If pybind11 doesn’t cooperate cleanly with these changes, it could mess with the profiler.

pybind11 behavior:

pybind11 generates trampoline functions that get invoked from Python. If the C++ lambda is inlined or not instrumented properly, cProfile might just "skip" over it.

If the transition to native code is not properly tracked, then the profiler might not resume correctly after the C++ call.

Profiler hooks and compiled code:

cProfile is written in C and relies on Python's internal frame evaluation hooks.

Calling C++ code via pybind11 bypasses Python bytecode execution, so the profiler may not “see” into the C++ part, but it should still see everything leading up to it.

🔍 Recommendations Try on Python 3.11 or 3.10 to confirm it's a Python 3.12 regression.

Use sys.setprofile() to set a manual profile hook—see if it gets the same behavior.

Try a more direct profiler like Py-Spy or Linux perf (Ralf mentioned this).

Consider opening an issue on the pybind11 or CPython bug trackers if you can make a minimal reproducible example.

rwgk avatar Apr 04 '25 00:04 rwgk

Some extra information only, no conclusion(s).

I update this link in ChatGPT:

  • https://chatgpt.com/share/67eed76e-7c20-8008-8f71-a9367824487b

In the update I pointed out this code to ChatGPT:

  • https://github.com/pybind/pybind11/blob/b70b8eb332fadf55d7e22b492da0e954c1a4fcb7/include/pybind11/pybind11.h#L618-L629

Note that this code was changed recently by PR #5580. — I do not know if that matters for this bug.

In the updated ChatGPT conversation, look for How this might break in Python 3.12

rwgk avatar Apr 06 '25 08:04 rwgk