py-spy icon indicating copy to clipboard operation
py-spy copied to clipboard

Incorrect PyConfig struct definition in v3_8_0.rs causes wrong field offset for eval_frame in PyInterpreterState

Open taifu001 opened this issue 4 months ago • 4 comments

There's an issue with the PyConfig struct definition in v3_8_0.rs that causes incorrect memory layout and wrong field offsets. Specifically:

  1. Direct cause: The struct includes an extra _config_version field that doesn't exist in the actual CPython 3.8 API, which increases the size of PyConfig and shifts the offsets of subsequent fields.
  2. Optimization points:The legacy_windows_stdio field is missing the #[cfg(target_os = "windows")] attribute, which should only be present on Windows platforms.

These issues cause the eval_frame field in PyInterpreterState to be at a wrong offset when accessing through the Rust struct definition. When trying to get the interpreter state from PyThreadState and then access its eval_frame field, the pointer arithmetic leads to incorrect memory location due to the oversized PyConfig struct shifting all subsequent fields' positions.

The fix involves:

  1. Removing the non-existent _config_version field from PyConfig
  2. Adding the proper conditional compilation attribute #[cfg(target_os = "windows")] to legacy_windows_stdio

This ensures the memory layout matches the actual CPython 3.8 structure, allowing correct access to the eval_frame function pointer for Python frame evaluation hooking.

taifu001 avatar Aug 07 '25 02:08 taifu001

What OS and cpu architecture are you using ? Also where are you installing python from?.

We test out py-spy in CI against python v3.8.0 on osx and windows . We also test py-spy with python v3.8.18 on ubuntu aarch64, ubuntu x86_64 and osx - so I suspect the existing generated bindings work at least some of the time

benfred avatar Aug 07 '25 22:08 benfred

Hi, thanks for following up.

I compared the PyConfig struct definition in v3_8_0.rs against the actual CPython 3.8.0 source code from the official release: 👉 https://github.com/python/cpython/releases/tag/v3.8.0 Specifically, I checked Include/cpython/initconfig.h, which defines the PyConfig struct.

The official CPython 3.8.0 PyConfig does NOT have a _config_version field. However, the Rust binding in v3_8_0.rs includes this field, which does not exist in the real C ABI. This causes the entire PyConfig struct to be 8 bytes larger than it should be on x86_64, leading to incorrect memory layout and misaligned field offsets in subsequent structs that contain PyConfig.

In my environment:

OS: CentOS 7.9 Architecture: x86_64 Python: 3.8.0 installed via Conda (Miniconda) I'm using the following code to access the interpreter state and eval_frame:

rust let py_thread_state_get = get_py_thread_state_get(); let tstate = py_thread_state_get(); if tstate.is_null() { return Err("Failed to get thread state for Python 3.8".into()); }

let tstate_struct = &*(tstate as *const v3_8_0::PyThreadState); let interp = tstate_struct.interp; let _is = interp as mut v3_8_0::PyInterpreterState; let eval_frame_function = (_is).eval_frame; When the _config_version field is present in v3_8_0::PyConfig, the offset of eval_frame within PyInterpreterState becomes incorrect — it no longer matches the expected offset of 568 bytes from the start of the PyInterpreterState struct.

Through debugging and memory inspection, I confirmed that:

Only when _config_version is removed, the eval_frame field appears at exactly offset 568. With _config_version present, the offset shifts (e.g., to 576), causing the read to land in the middle of another field — resulting in a null or invalid function pointer.

taifu001 avatar Aug 08 '25 08:08 taifu001

Can you explain a little more about your use-case?

I understand that the _config_version field has been removed (and is only present here since the bindings were generated on v3.8.0b4 before the stable release, and this field was removed before v3.8.0 was released). I'm good with fixing the generated bindings to match the stable release by removing this field, but I'm struggling to see why this is necessary.

The python_bindings module is not part of the public rust API for py-spy - and importing it should cause an error like:

error[E0603]: module `python_bindings` is private
  --> examples/test_import_bindings.rs:4:13
   |
4  | use py_spy::python_bindings::v3_8_0;
   |             ^^^^^^^^^^^^^^^  ------ module `v3_8_0` is not publicly re-exported
   |             |
   |             private module
   |

These bindings are automatically generated by bindgen and should be a private implementation detail of how py-spy works - and the only fields from the v3_8_0::PyInterpreterState that we access are ts_head and modules which occur before the PyConfig so won't be affected.

Are you trying to add a new feature to py-spy (like improve the native unwind code by leveraging eval_frame for merging the native and python stacks together)? I'm not sure why you need to access the eval_frame field here, since it should be private outside of py-spy itself.

benfred avatar Aug 09 '25 02:08 benfred

Thanks for the quick response and the clarification.

Context about my use case:

I’m building a tool that monitors Python process method/function calls across multiple Python versions at runtime. For an early prototype I wanted to use the eval_frame mechanism to get detailed call information. To move fast, I reused py-spy’s versioned python_bindings since they already encapsulate many per-version layout differences. That led me to notice the _config_version field mismatch in the v3_8_0 bindings. I understand python_bindings is private and not part of py-spy’s public API; my usage was an internal shortcut during prototyping. Current status:

I’ve since changed approach and no longer rely on py-spy’s bindings. I switched to a C++ solution that uses library-level functions and its own per-version handling to extract the necessary Python method information, so I’m not accessing eval_frame via py-spy anymore. I’m not proposing a new feature for py-spy, and I won’t depend on the private modules going forward. Reason for filing this:

Even though I’ve moved away from using the bindings, I wanted to let you know about the stale _config_version field so you could regenerate or adjust the 3.8 bindings to match the stable release. It’s purely a heads-up to help keep things clean and avoid confusion for others who might look at the code. Thanks again for the explanation and for maintaining py-spy.

taifu001 avatar Aug 11 '25 09:08 taifu001