gh-118926: Deferred reference counting GC changes for free threading
This PR mainly changes the free-threading GC so that it can support deferred reference counting in the future.
To get this to work, new stack references must immediately live on the stack, with no intervening `Py_DECREF` or escaping code between creating a reference and putting it on the stack. This ensures they are visible to the GC.
This also NULLs out the rest of the stack because the GC scans the entire stack. The stack pointer may be inconsistent between escaping calls (including those to `Py_DECREF`), so we need to scan the whole stack.
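As a rough illustration of why the unused slots have to be NULL, here is a toy model in C. It is not CPython code: `toy_object`, `toy_frame`, `toy_push`, and `toy_gc_scan` are hypothetical names made up for this sketch, and the fixed stack size is an assumption. The scan deliberately ignores the stack pointer and walks every slot, which is only safe if slots that hold no reference are NULL.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_SIZE 8  /* hypothetical fixed frame size for this sketch */

/* Toy stand-ins; these are NOT CPython's real object or frame types. */
typedef struct {
    int marked;  /* set when the toy GC sees the object */
} toy_object;

typedef struct {
    toy_object *slots[TOY_STACK_SIZE];
    size_t sp;   /* may be stale while an escaping call is in progress */
} toy_frame;

/* Start with every slot NULL so a full scan never reads garbage. */
static void toy_frame_init(toy_frame *f)
{
    for (size_t i = 0; i < TOY_STACK_SIZE; i++) {
        f->slots[i] = NULL;
    }
    f->sp = 0;
}

/* Store the new reference in its slot right away, so the scan can see it. */
static void toy_push(toy_frame *f, toy_object *obj)
{
    assert(f->sp < TOY_STACK_SIZE);
    f->slots[f->sp++] = obj;
}

/* Scan the whole slot array, not just slots below sp, skipping NULLs.
 * Returns the number of objects found. */
static size_t toy_gc_scan(toy_frame *f)
{
    size_t found = 0;
    for (size_t i = 0; i < TOY_STACK_SIZE; i++) {
        if (f->slots[i] != NULL) {
            f->slots[i]->marked = 1;
            found++;
        }
    }
    return found;
}
```

In this model, pushing two objects and then resetting `sp` to 0 (to mimic an inconsistent stack pointer) still lets `toy_gc_scan` find both, because it never consults `sp`. The cost is that every slot must hold either a valid pointer or NULL at all times.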
This PR removes the temporary immortalization introduced previously in #117783. I wanted to do this in a separate PR, but the only way to test this properly is to remove that hack. So it has to be bundled in this PR.
Finally, this PR fixes a few bugs in steals and borrows. These were only caught by the GC changes, not by the debugger that I was working on. Since they are untestable without the GC changes, I bundled them in.
Temporary perf regressions (6 threads):
object_cfunction PASSED: 3.6x faster
cmodule_function FAILED: 2.1x faster (expected: LOAD_ATTR from dict)
generator PASSED: 3.7x faster
pymethod FAILED: 2.5x faster (expected: LOAD_ATTR from pytype_lookup)
pyfunction FAILED: 2.9x faster (expected: LOAD_GLOBAL)
module_function FAILED: 2.9x faster (expected: LOAD_ATTR from module dict)
load_string_const PASSED: 4.0x faster
load_tuple_const PASSED: 3.8x faster
create_closure FAILED: 2.3x faster (unsure why)
create_pyobject FAILED: 1.5x faster (expected: LOAD_GLOBAL)
- Issue: gh-118926