pyo3 icon indicating copy to clipboard operation
pyo3 copied to clipboard

Thread state is lost between calls to `Python::with_gil`

Open thejcannon opened this issue 3 years ago • 6 comments

Bug Description

(Stemming from https://github.com/microsoft/debugpy/issues/968)

I've tracked down my debugability-of-python issue to the fact that the thread state is not persisted in a thread between multiple calls to Python::with_gil. The call to PyGILState_Release at the end of the GILGuard hits https://github.com/python/cpython/blob/45e62a2bc1c0000e2e9b613fff6bebf2c26fcb93/Python/pystate.c#L1715 which then clears and deletes the thread-state.

I'm still working on understanding Python's thread behavior, but ideally the thread state persists as long as the thread does. Not sure the right CPython calls to make this happen.

Steps to Reproduce

  1. Python calling a Python Rust-implemented C Extension module using PyO3
  2. In the module, spawn a new thread
  3. In the new thread calling sys.settrace and then sys.gettrace in two calls to Python::with_gil will show that thread state was cleared in between

Backtrace

No response

Your operating system and version

Ubuntu 20

Your Python version (python --version)

Python 3.9

Your Rust version (rustc --version)

rustc 1.61.0

Your PyO3 version

0.16.5

How did you install python? Did you use a virtualenv?

apt

Additional Info

No response

thejcannon avatar Jul 06 '22 01:07 thejcannon

Just gonna toss this in the mix (160 being the offset of gilstate_counter in Py3.9 on 64-bit arch):

          let tstate: *mut u8 = unsafe {ffi::PyThreadState_GET().cast()};
          unsafe {(*(tstate.add(160) as *mut i32))+=1;};

Works exactly as expected. The thread state isn't cleared on PyGILState_Release, so down-the-line when my thread picks up some work and FFIs into Python, I can debug it (because the trace func is still legit)

thejcannon avatar Jul 06 '22 01:07 thejcannon

Ah yes, a (admittedly nasty) workaround!

runtime_builder.on_thread_start(|| {
  let _ =
    unsafe { ffi::PyThreadState_New(Python::with_gil(|_| ffi::PyInterpreterState_Get())) };
  Python::with_gil(|py| {
    let _ = py.eval("__import__('debugpy').debug_this_thread()", None, None);
  });
});

The chain of calls that makes this work AFAICT is:

  • PyThreadState_New
  • _PyThreadState_SetCurrent
  • _PyGILState_NoteThreadState
  • this line
    /* PyGILState_Release must not try to delete this thread state. */
    tstate->gilstate_counter = 1;

(Although, looks like PyInterpreterState_Get is 3.9+) Edit: I'm going with PyInterpreterState_Main since I don't use sub-interpreters

thejcannon avatar Jul 06 '22 13:07 thejcannon

Hmm, so I have to confess even if this is a bit surprising, it's unclear to me that PyO3 should be doing something different. I'm going to keep thinking because it would be good to get your use case working.

Effectively your proposal leaks the Python thread state, right? Does that matter? I confess I don't know.

Is there an API we could provide to meet your needs? I guess there could be a create_thread_state() helper which does loosely what you propose above.

Would it be unsafe? Maybe, I think it would depend whether leaking a thread state is sound. If this API would be unsafe anyway, does it offer much value add over what you propose above?

davidhewitt avatar Jul 06 '22 21:07 davidhewitt

I think the thread state objects are attached to Thread-Specific Storage (See https://github.com/python/cpython/blob/45e62a2bc1c0000e2e9b613fff6bebf2c26fcb93/Python/pystate.c#L1583) so my way of justifying this is we're just moving the creation of the thread state from PyGILState_Ensure outside (while assuming the same thread state object is what is retrieved from TSS in PyGILState_Ensure.

In this case I think the leakage is kosher, our app is Python -> Rust spawning threads, and those threads live until the the process exits (-ish). So there isn't much utility in releasing the thread state, as the process is exiting anyways.


As far as API goes, at a minimum I need a function which returns the current interpreter (for Py < 3.9) so that I can pass it to PyThreadState_New. Other than that I'd love to consume a nicer PyO3 API instead of CPython's but our code has a lot of concessions anyways, whats one more :joy:


https://github.com/pantsbuild/pants/pull/15951 ended up with the changes. Not much required to get it working, which is awesome.

thejcannon avatar Jul 07 '22 13:07 thejcannon

👍 Glad to hear you got this working. I'm still not entirely sure it's safe to expose a leak like this in the general case, though I'm going to keep this open so that at the very least we document this behaviour.

davidhewitt avatar Jul 12 '22 19:07 davidhewitt

Yeah once we eventually get to a minimum Python version of 3.9+ then I think PyO3 has given us all the tools we need.

Documentation would be great. I doubt most people care about attaching objects to the Python thread state which need to persist between calls, so contorting the code/API to support this wouldn't be ideal. I think since you also expose the lower-layer ffi functions we're all set.

thejcannon avatar Jul 13 '22 13:07 thejcannon