gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP`
This PR implements the foundational work necessary for making the specializing interpreter thread-safe in free-threaded builds and enables specialization for `BINARY_OP` as an end-to-end example. To enable future incremental work, specialization can now be toggled on a per-family basis. Subsequent PRs will enable specialization in free-threaded builds for the remaining families.
In free-threaded builds, each thread specializes a thread-local copy of the bytecode, created on the first `RESUME`. All copies of the bytecode for a code object are stored in the `co_tlbc` array on the code object. Each thread reserves a globally unique index at thread creation, which identifies its copy of the bytecode in every `co_tlbc` array, and releases the index at thread destruction. The first entry in every `co_tlbc` array always points to the "main" copy of the bytecode, which is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.
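The indexing scheme can be modeled in a few lines of Python. This is only an illustrative sketch, not the C implementation: `TLBCIndexPool`, `CodeObject`, and `bytecode_for` are hypothetical names, and the model assumes that the first thread to reserve an index receives 0, that released indices are reused, and that new copies start from the unspecialized bytecode.

```python
import threading


class TLBCIndexPool:
    """Hands out globally unique indices (hypothetical model)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._free = []  # indices released by finished threads
        self._next = 0   # assumption: the first thread gets index 0

    def reserve(self):
        with self._lock:
            if self._free:
                return self._free.pop()
            idx = self._next
            self._next += 1
            return idx

    def release(self, idx):
        with self._lock:
            self._free.append(idx)


class CodeObject:
    """Model of a code object holding one bytecode copy per thread index."""

    def __init__(self, bytecode):
        self._original = bytes(bytecode)
        self.main = bytearray(bytecode)
        self.co_tlbc = [self.main]  # entry 0 aliases the main copy
        self._lock = threading.Lock()

    def bytecode_for(self, index):
        """Return the copy for `index`, creating it lazily (first RESUME)."""
        with self._lock:
            while len(self.co_tlbc) <= index:
                self.co_tlbc.append(None)
            if self.co_tlbc[index] is None:
                self.co_tlbc[index] = bytearray(self._original)
            return self.co_tlbc[index]


pool = TLBCIndexPool()
code = CodeObject(b"\x01\x02\x03")

main_idx = pool.reserve()
main_bc = code.bytecode_for(main_idx)  # no copy is made for index 0

results = {}


def worker():
    idx = pool.reserve()
    try:
        bc = code.bytecode_for(idx)
        bc[0] = 0xFF  # "specialize" only this thread's copy
        results["idx"] = idx
        results["is_main"] = bc is code.main
    finally:
        pool.release(idx)


t = threading.Thread(target=worker)
t.start()
t.join()
```

In this model a single-threaded program only ever touches entry 0, so no copy is created, which mirrors the property described above.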
Thread-local bytecode can be disabled at runtime by passing `-X tlbc=0` on the command line or by setting `PYTHON_TLBC=0` in the environment. Disabling thread-local bytecode also disables specialization.
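The command-line form can be exercised from a script by launching a child interpreter. The sketch below assumes CPython (it reads the implementation-specific `sys._xoptions`), and the helper name `tlbc_xoption` is made up for illustration. It only confirms that the option reached the child process; the option has an effect only in free-threaded builds that include this change.

```python
import subprocess
import sys
import sysconfig


def tlbc_xoption(value):
    """Run a child interpreter with -X tlbc=<value> and report what it saw."""
    proc = subprocess.run(
        [
            sys.executable,
            "-X",
            f"tlbc={value}",
            "-c",
            "import sys; print(sys._xoptions.get('tlbc'))",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.strip()


# True only when the running interpreter is a free-threaded build.
free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

seen = tlbc_xoption(0)
print(f"free-threaded build: {free_threaded}, child saw tlbc={seen}")
```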
Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
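A compare-and-exchange is one plausible way to avoid overwriting a concurrently instrumented instruction. The sketch below models that idea in Python and is not taken from the PR: `AtomicByte` simulates an atomic byte with a lock, and the opcode numbers are invented for illustration.

```python
import threading

# Hypothetical opcode numbers, chosen only for this example.
BINARY_OP = 10
BINARY_OP_ADD_INT = 11
INSTRUMENTED_LINE = 250


class AtomicByte:
    """Simulates an atomically accessed opcode byte using a lock."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def store(self, value):
        with self._lock:
            self._value = value

    def compare_exchange(self, expected, desired):
        """Write `desired` only if the current value equals `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = desired
            return True


def specialize(slot, generic, specialized):
    """Install the specialized opcode unless the slot changed underneath us."""
    return slot.compare_exchange(generic, specialized)


# Case 1: nothing else touched the instruction, so specialization succeeds.
plain = AtomicByte(BINARY_OP)
plain_ok = specialize(plain, BINARY_OP, BINARY_OP_ADD_INT)

# Case 2: instrumentation rewrote the opcode first, so specialization backs off.
instrumented = AtomicByte(BINARY_OP)
instrumented.store(INSTRUMENTED_LINE)
instrumented_ok = specialize(instrumented, BINARY_OP, BINARY_OP_ADD_INT)
```

In the second case the instrumented opcode survives because the specializer's expected value no longer matches what is stored in the slot.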
- Issue: gh-115999