
gh-118926: Deferred reference counting GC changes for free threading

Open Fidget-Spinner opened this issue 1 year ago • 6 comments

This PR mainly introduces changes to the free-threading GC to support deferred reference counting in the future.

To get this to work, new stack references must be written to the stack immediately, with no Py_DECREF or other escaping code between their creation and the store to the stack. This ensures they are visible to the GC.

This PR also NULLs out the unused portion of the stack, because the GC scans the entire stack: the stack pointer may be inconsistent between escaping calls (including calls to Py_DECREF), so the whole stack must be scanned.

This PR removes the temporary immortalization introduced previously in #117783. I wanted to do this in a separate PR, but the only way to test this properly is to remove that hack. So it has to be bundled in this PR.

Finally, this PR fixes a few bugs in steals and borrows. This was only caught by the GC changes, not by the debugger that I was working on. Since these are untestable without the GC changes, I bundled them in.

Temporary perf regressions (6 threads):

object_cfunction PASSED: 3.6x faster
cmodule_function FAILED: 2.1x faster (expected: LOAD_ATTR from dict)
generator PASSED: 3.7x faster
pymethod FAILED: 2.5x faster (expected: LOAD_ATTR from pytype_lookup)
pyfunction FAILED: 2.9x faster (expected: LOAD_GLOBAL)
module_function FAILED: 2.9x faster (expected: LOAD_ATTR from module dict)
load_string_const PASSED: 4.0x faster
load_tuple_const PASSED: 3.8x faster
create_closure FAILED: 2.3x faster (unsure why)
create_pyobject FAILED: 1.5x faster (expected: LOAD_GLOBAL)
  • Issue: gh-118926

Fidget-Spinner avatar Jul 03 '24 10:07 Fidget-Spinner

!buildbot nogil

Fidget-Spinner avatar Jul 03 '24 10:07 Fidget-Spinner

:robot: New build scheduled with the buildbot fleet by @Fidget-Spinner for commit 8cb139f4ad78cce6e34b63f441cbc18bb5e40da4 :robot:

The command will test the builders whose names match following regular expression: nogil

The builders matched are:

  • AMD64 Ubuntu NoGIL PR
  • aarch64 Fedora Rawhide NoGIL PR
  • x86-64 MacOS Intel ASAN NoGIL PR
  • AMD64 Ubuntu NoGIL Refleaks PR
  • AMD64 Fedora Rawhide NoGIL refleaks PR
  • aarch64 Fedora Rawhide NoGIL refleaks PR
  • PPC64LE Fedora Rawhide NoGIL refleaks PR
  • AMD64 Windows Server 2022 NoGIL PR
  • ARM64 MacOS M1 Refleaks NoGIL PR
  • PPC64LE Fedora Rawhide NoGIL PR
  • x86-64 MacOS Intel NoGIL PR
  • ARM64 MacOS M1 NoGIL PR
  • AMD64 Fedora Rawhide NoGIL PR

bedevere-bot avatar Jul 03 '24 10:07 bedevere-bot

!buildbot nogil

Fidget-Spinner avatar Jul 03 '24 12:07 Fidget-Spinner

:robot: New build scheduled with the buildbot fleet by @Fidget-Spinner for commit 700c2fdc0708df987a3d524831dfdd34c0ddeb0f :robot:

The command will test the builders whose names match following regular expression: nogil. The builders matched are the same nogil builders as in the first build above.

bedevere-bot avatar Jul 03 '24 12:07 bedevere-bot

!buildbot nogil

Fidget-Spinner avatar Jul 03 '24 13:07 Fidget-Spinner

:robot: New build scheduled with the buildbot fleet by @Fidget-Spinner for commit cde931dd6cab4d17c7d61ad1d00a959b57b62ebd :robot:

The command will test the builders whose names match following regular expression: nogil. The builders matched are the same nogil builders as in the first build above.

bedevere-bot avatar Jul 03 '24 13:07 bedevere-bot

This was only caught by the GC changes, not by the debugger that I was working on.

OOI, do you know why this was?

markshannon avatar Jul 08 '24 10:07 markshannon

Why not use the approach prototyped in #119875? (But, rather than add explicit SAVE_SP -- LOAD_SP pairs, wrap the calls in an ESCAPING_CALL macro)

I would like to, but we'd need to wrap anything with Py_DECREF too, and that's too annoying.

Fidget-Spinner avatar Jul 08 '24 11:07 Fidget-Spinner

It doesn't matter what is "annoying". It needs to be correct.

Py_DECREF doesn't escape; it is Py_Dealloc that escapes. So the ESCAPING_CALL can be embedded in an interpreter-only version of the Py_DECREF macro.

markshannon avatar Jul 08 '24 13:07 markshannon

an interpreter-only version of the Py_DECREF macro.

Sorry, I don't quite understand what you mean here. Do you mean we can re-define Py_DECREF for the interpreter loop (via the code generator or otherwise)? If so, that sounds good to me; I can work on that in this PR if you'd like.

Fidget-Spinner avatar Jul 08 '24 13:07 Fidget-Spinner

Something like:

#define INTERPRETER_DECREF(op) \
    do { \
        if (--(op)->ob_refcnt == 0) { \
            ESCAPING_CALL(Py_Dealloc(op)); \
        } \
    } while (0)

(plus debug and free-threading variants)

markshannon avatar Jul 08 '24 14:07 markshannon

Alright. I can work on porting your PR over to the new branch. Would you like me to?

Fidget-Spinner avatar Jul 08 '24 14:07 Fidget-Spinner

This was only caught by the GC changes, not by the debugger that I was working on.

OOI, do you know why this was?

Because these values escape to PyObject, which we don't track.

Fidget-Spinner avatar Jul 09 '24 10:07 Fidget-Spinner

@markshannon I addressed the DSL changes. Do you have any other comments? This PR does not depend on spilling the stack pointer. That would be useful for other reasons, but I would like it to be a separate PR.

Fidget-Spinner avatar Jul 10 '24 14:07 Fidget-Spinner

I don't like this change. Having to add magical [1] suffixes, for reasons that aren't at all clear without knowing the internals of the nogil GC, is going to be a major maintenance headache.

Looking at inst(UNPACK_SEQUENCE_TWO_TUPLE, (unused/1, seq -- val1[1], val0[1])) as a maintainer, I would want to clean it up to inst(UNPACK_SEQUENCE_TWO_TUPLE, (unused/1, seq -- val1, val0)). If I did that, the only clue as to what went wrong would be mysterious crashes in the nogil build.

bytecodes.c is used as input to two interpreter generators, the JIT generator, at least one abstract interpreter, and various metadata generators. It needs to be clear. All necessary information should be expressed explicitly, and, ideally, be checkable.

And please don't mark comments as "resolved" when they aren't.

markshannon avatar Jul 15 '24 09:07 markshannon

We don't want a pointer to a specific stack location.

Why do you say that? We want a pointer to the location of where value (the result) is stored on the stack.

Because it is the job of the code generators to manage the stack. We shouldn't be guessing where the code generator is physically storing a particular value.

What are you referring to here? The stack is currently always in memory.

If that were the case, there wouldn't be a need for the [1] suffixes. The top of the stack is kept in registers (C locals) between uops in tier 1, and will soon be kept in registers between uops in tier 2.

markshannon avatar Jul 15 '24 09:07 markshannon

And please don't mark comments as "resolved" when they aren't.

Sorry my bad, I thought they were.

Ok I kind of get the problem you're explaining. So we would need a solution that:

  1. Is clear in the bytecode DSL.
  2. Does not require exposing a pointer to the instruction body at all. So no *val0 = thing. Preferably you want val0 = thing instead. This is to ensure compatibility with TOS caching when those values are kept in explicit registers.

In that case I propose the following solution:

  1. A new annotation, either write_to_stack val0 or &val0 or flush val0.
  2. The code generator for tier 1 will see the annotation, and for every assignment to the value, it will also write it directly to the stack after, via parsing the instruction body. E.g. if we have the following code:
foo(bar1 -- flush val0) {
    val0 = op(bar1);
    Py_DECREF(bar1);
}

the codegen will produce for tier 1:

TARGET(foo) {
    PyObject *bar1, *val0;
    bar1 = stack_pointer[-1];
    val0 = op(bar1);
    stack_pointer[-1] = val0;
    Py_DECREF(bar1);
}

For tier 2 register version, it will produce:

TARGET(foo) {
    PyObject *reg1, *reg0;
    reg1 = stack_pointer[-1];
    reg0 = op(reg1);
    stack_pointer[-1] = reg0;
    Py_DECREF(reg1);
}

So both tier 1 and tier 2 are happy. Does that sound good to you?

Fidget-Spinner avatar Jul 15 '24 10:07 Fidget-Spinner

A new annotation, either write_to_stack val0 or &val0 or flush val0.

There are two problems with explicitly flushing values:

  • It isn't clear why the value needs to be flushed.
  • There is no way to check that the annotations are in the correct place.

Flushing cached values to the stack, or not, should be the job of the code generator(s). The instruction/uop definitions should declare the semantics, not the desired generated code, if possible.

It might be quite a lot of work, but I really think we should mark any escaping calls, so that the code generator knows what escapes. DECREF_INPUTS is already handled by the code generator, so no changes to the code should be necessary in bytecodes.c regarding reference count operations.

markshannon avatar Jul 15 '24 13:07 markshannon

One possibility to keep things moving for the free-threading build is to generate a different generated_cases.h for the free-threading interpreter. For that case, you could write all values to the stack before any decrefs or escaping calls.

markshannon avatar Jul 15 '24 13:07 markshannon

It might be quite a lot of work, but I really think we should mark any escaping calls, so that the code generator knows what escapes. DECREF_INPUTS is already handled by the code generator, so no changes to the code should be necessary in bytecodes.c regarding reference count operations.

Ok I can port your old PR over to this one. I presume we're flushing the values to the stack before every escaping call?

Fidget-Spinner avatar Jul 15 '24 13:07 Fidget-Spinner

@markshannon, writing all values to the stack is going to introduce an unnecessary performance regression.

This is holding up the free-threaded work for what seems like small aesthetic complaints. If the concern is about maintainer confusion, we can add comments or otherwise document the dozen or so places where this is used.

colesbury avatar Jul 15 '24 14:07 colesbury

This is not just aesthetics. Maintainability is important. By degrading maintainability, you are making more work for others. Particularly for my team.

You say that "writing all values to the stack will be slow". Only for those parts of the stack that are in memory. Which is why it is important to leave the choice of which parts of the stack to spill to memory and which parts to keep in registers up to the code generator, as it will differ for different tiers and different platforms.

the dozen or so places where this is used.

There are a hundred or more places where execution can escape from the interpreter, not counting DECREFs which add hundreds more. It seems unlikely that this PR correctly identifies all cases where a garbage collection can occur and correctly spills all the necessary values to the stack memory.

Even if it is correct, it still lays traps for the unwary. For example, _BINARY_OP contains the code:

    DECREF_INPUTS();
    ERROR_IF(res_o == NULL, error);
    res = PyStackRef_FromPyObjectSteal(res_o);

If this were changed to

    res = PyStackRef_FromPyObjectSteal(res_o);
    DECREF_INPUTS();
    ERROR_IF(PyStackRef_IsNull(res), error);

then it would be unsafe, yet there would be no warning or error from any tools.

markshannon avatar Jul 15 '24 15:07 markshannon

The rewritten _BINARY_OP example is safe because it uses PyStackRef_FromPyObjectSteal. The concern is with calls to PyStackRef_FromPyObjectNew.

There are a hundred or more places where execution can escape from the interpreter

We should not be thinking about this in terms of where calls escape from the interpreter. We should be thinking about this in terms of where PyStackRef_FromPyObjectNew() calls occur, and ensure that those are always written to the stack.

If you want to refactor PyStackRef_FromPyObjectNew so that it takes a pointer or similar that's fine.

colesbury avatar Jul 15 '24 16:07 colesbury

Ok I can port your old PR over to this one. I presume we're flushing the values to the stack before every escaping call?

Sorry I forgot that this may introduce another perf regression on the free-threaded build, even if I limit it just to that. So I'm putting a pause on this. I think we should discuss this on Wednesday.

Fidget-Spinner avatar Jul 15 '24 17:07 Fidget-Spinner

In general, stackrefs must be spilled to the in-memory stack around any escaping call. To do this automatically, we need to:

  1. Identify all escaping calls
  2. Track assignments to output values
  3. Raise an error if an escaping call occurs unless all or none of the output values have been assigned
  4. Generate the spill around the call.
    • If none of the output values have been defined, we just spill the stack pointer after popping the inputs.
    • If all of the output values have been defined, we save the outputs to memory and save the stack pointer before making the escaping call.

Identifying all escaping calls.

We have a whitelist; all other calls are escaping. The list needs updating, but it's mostly right.

Track assignments to output values

Assignments are easy to spot: name = .... Then all we need to do is check whether name is an output variable. Any assignments in nested code, or multiple assignments, should be treated as an error.

It is easy enough to change the code to move assignments out of branches, and we don't want the code generator to have to do flow analysis.

Generating spills around calls.

Once we have identified the escaping call, we will need to walk backwards and forwards around the call to identify the whole statement. Once we've done that, we need to emit the spill before and the reload after (if any reload is needed).

markshannon avatar Jul 19 '24 16:07 markshannon

In case that sounds too expensive, don't worry. It shouldn't be.

Spilling the stack pointer is cheap, as registers often need to be spilled across calls anyway. We need to save the output values to memory in most cases anyway; we are just doing it a little earlier.

Although escaping calls are common in bytecodes.c, they are not so common dynamically. Dynamically, only about 10% of instructions include an escaping call, and for many of those the additional cost is negligible for the reasons given above.

markshannon avatar Jul 19 '24 16:07 markshannon

@markshannon, that is not what we discussed and agreed to ~~yesterday~~ Wednesday. What we discussed was tracking PyStackRef_FromPyObjectNew calls from the code generator and spilling those immediately.

  • We do not need or want to track stackrefs in general for the GC, just PyStackRef_FromPyObjectNew
  • There's no point in looking for spilling calls and waiting to write the result of PyStackRef_FromPyObjectNew in multiple places.

colesbury avatar Jul 19 '24 16:07 colesbury

Tracking just PyStackRef_FromPyObjectNew will solve your immediate problem, but it doesn't help with a broader deferred reference counting implementation nor with top-of-stack caching.

I think what I proposed handles all the use cases, without a significant performance impact. Even if it turns out the performance impact is too high, the additional analysis will help make any solution more robust.

There's no point in looking for spilling calls and waiting to write the result of PyStackRef_FromPyObjectNew in multiple places.

Why would anything get written multiple times?

markshannon avatar Jul 20 '24 09:07 markshannon

I think the original goal was to not block TOS caching and full deferred refcounting. With the latest changes, this is now true. The added goal of laying the foundations for the two should be left to another PR. The responsibility of this PR, IMO, is to not burden stack caching and full deferred refcounting (which it should have achieved with the latest commit). Supporting more features is out of scope.

Fidget-Spinner avatar Jul 21 '24 12:07 Fidget-Spinner

@markshannon - I'm most concerned because I thought we were on the same page on this approach after our last meeting.

I don't think starting with this approach precludes future changes, like what you've outlined above. This PR is a bottleneck for a lot of the free-threading deferred reference counting work, so I'd appreciate it if we can figure out how to unblock it.

colesbury avatar Jul 24 '24 18:07 colesbury