Support Free Threading in Python 3.13
Is your feature request related to a problem? Please describe.
Python 3.13 introduces an (experimental) free threading option (aka "NO_GIL"). PyObjC should support this feature.
Describe the solution you'd like
TL;DR: This is a lot more work than "build extension modules using a free threaded build".
Adding this requires a number of changes to PyObjC and the build process:
- [x] Ensure that all extensions build in a free threaded configuration
- [x] Migrate all framework extension modules to 2 phase init
- [x] Migrate pyobjc-core to 2 phase init
- [x] Check all framework extension modules for possible threading issues and enable free threading for them (look for Py_MOD_GIL_USED)
- [x] Add fine grained locking in pyobjc-core, and then enable free threading
- [x] Update the testing harness to also test a free threading build for 3.13 (both architectures).
- [x] Update the distribution building script to also build free threaded extension modules
  This one and the previous one are implemented by updating _common_definitions.py
- [x] Add tests that check that the GIL is actually disabled (explicit tests in pyobjc-core, include check in framework binding sanity check as well).
- [ ] Add documentation about support for free threading
- [x] Add test scripts that stress test the free threading support and run those with TSAN enabled
- [ ] Review all critical sections to ensure they are minimal and don't invoke arbitrary code.
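For the "check that the GIL is actually disabled" item, a minimal sketch of what such a check could look like from Python. The helper name `free_threading_status` is hypothetical and not taken from PyObjC's test suite; it only relies on `sysconfig.get_config_var("Py_GIL_DISABLED")` and `sys._is_gil_enabled()` (the latter is available in Python 3.13 and later, so the sketch falls back to "enabled" when it is missing).

```python
import sys
import sysconfig


def free_threading_status():
    """Report whether this is a free threaded build and whether the GIL is on.

    Hypothetical helper for illustration; not part of PyObjC.
    """
    # Py_GIL_DISABLED is 1 for free threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists in Python 3.13+; older versions always
    # run with the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = True if check is None else bool(check())

    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }
```

Calling this after importing an extension module is useful because, on a free threaded build, importing a module that does not declare free threading support can turn the GIL back on at runtime.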
The hard part is updating pyobjc-core: that code relies on module global state that's protected by the GIL. A somewhat easy option is to switch to a "global pyobjc lock" in the short term and slowly peel off bits that can use more fine grained locking.
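The "global lock first, fine grained later" idea can be illustrated in Python, although the real work is in the C and Objective-C code of pyobjc-core. This is only a sketch under assumptions: `LockedTable`, `GLOBAL_LOCK` and the table names are hypothetical and do not correspond to actual PyObjC internals.

```python
import threading


class LockedTable:
    """A dict guarded by a lock that may be shared with other tables."""

    def __init__(self, lock=None):
        # Without an explicit lock the table gets a private one.
        self._lock = lock if lock is not None else threading.Lock()
        self._data = {}

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def __len__(self):
        with self._lock:
            return len(self._data)


# Phase 1: every piece of shared state is guarded by the same global lock.
GLOBAL_LOCK = threading.Lock()
selectors = LockedTable(GLOBAL_LOCK)
classes = LockedTable(GLOBAL_LOCK)

# Phase 2: state that has been audited is peeled off and gets its own lock,
# so it no longer contends with everything else.
audited = LockedTable()
```

A plain `threading.Lock` is not reentrant, so with a shared lock no method may call into another table while holding it; that restriction is one reason the global lock is a stopgap rather than the end state.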
See also:
- https://py-free-threading.github.io/porting/
Various changes:
- Drop usage of PySequence_Fast, this API can already be problematic with the GIL (when the argument is a list it is returned as-is, which means the borrowed reference returned from ..._GET_ITEM can get stale when the list is mutated; there is no alternative API that avoids the borrowed reference)
- Drop usage of PyModule_GetDict: Returns a borrowed reference. Use of this API turned out to be unnecessary in the first place.
- Lock shared global state, even if that state is only used by APIs that are limited to the main thread according to the API contract in Apple's frameworks.
- Don't return borrowed references from internal APIs.
- Use Py_BEGIN_CRITICAL_SECTION to protect object state for mutable values implemented in C.
- Use PyDict_GetItemRef instead of PyDict_GetItem/PyDict_GetItemWithError
- Don't use PyList_ macros
- Switch to PyList_GetItemRef (with backward compatibility stub)
- NSMapTable is not thread-safe according to Apple's documentation (https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Multithreading/ThreadSafetySummary/ThreadSafetySummary.html)
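The borrowed-reference problem in the list above is specific to the C API, but the underlying hazard (reading from a container that can be mutated while it is being processed) has a rough Python analogue. The sketch below is an illustration only: `process_snapshot` is a hypothetical name, and taking a `tuple()` copy is an assumed stand-in for "own a strong reference to every item before calling arbitrary code", not a description of what PyObjC does.

```python
def process_snapshot(seq, callback):
    """Apply callback to every item of seq, tolerating mutation of seq.

    The tuple holds its own strong reference to each item, so the items
    stay valid even if the callback (or another thread) mutates seq.
    """
    snapshot = tuple(seq)
    return [callback(item) for item in snapshot]
```

For example, a callback that clears the source list on its first call does not affect the items that are still to be processed, because they are read from the snapshot rather than from the live list.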
For the framework bindings there is in general no global state to protect (other than the ObjC runtime, which is thread safe and is accessed through pyobjc-core).
There are a number of frameworks with extension contexts for which the implementation uses python tuples to store information. Those should be as safe without the GIL as with the GIL because the context info is owned by whatever object performs the callback. There is a very small risk of crashes when the owning object gets released; that's unchanged with and without the GIL and I'm not yet convinced that there's a window for a race condition in the first place.
With changeset 5058104819ec7cceb8232181aa802bb1a2d71e6a the extension modules in the various framework bindings should be thread safe even in free-threading mode.
That's still only a first step toward supporting that mode: the core bridge still needs to be converted, and that's more involved than the fairly trivial extension modules in framework bindings.
Also dropped usage of the PyDict_*String APIs, which makes it easier to convert to PyDict_GetItemRef (needed for PyDict_GetItemString; the other String APIs were changed for consistency).
Thanks for working on this. I'm eager for there to be something that builds and installs, even if it crashes or is unstable when it runs.
I've opted into Python 3.13t as my main Python, including my shell and test runners, etc, but I've had to fall back to another Python in many cases because pyobjc simply fails to build. In many cases, the dependency is unused, but its presence blocks the depending project from installing.
Installing from source, I'm also affected by https://github.com/ronaldoussoren/pyobjc/issues/620, so it's possible once that's addressed, I'll be able to build from source and get an unstable install.
I share all of this just so you're aware of my situation. Unless there's something trivially easy you can do, I'm not looking for any support or changes. I'm mainly aiming to share the status from my perspective. I appreciate your work.
I'll push out a release for #620 soonish, but other than that I won't release wheels for the free threaded build of Python until I'm somewhat sure that pyobjc-core actually supports running without the GIL.
I'm getting closer to that, but need to finish work on a locking strategy and internal API for the two mappings between original objects and their proxy (from Python to ObjC and v.v.). Those mappings are needed to preserve identity, which is necessary for correctness.
The hardest part at this point is finding the time to actually do the work (famous last words...).
FYI: The upcoming 10.3.2 release will include wheels for the free threaded variant of Python, but will still require the GIL. Even this required changes to the code due to using the limited API in framework bindings.
Big step towards finishing this work: the proxy-registry (mapping from a Python object to its Objective-C proxy, and mapping an Objective-C object to its Python proxy) is now compatible with free-threading.
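To make the identity requirement concrete, here is a small Python sketch of a "get or create" registry for one direction of the mapping. It is not the pyobjc-core implementation (which lives in C and Objective-C); `Proxy`, `ProxyRegistry` and `proxy_for` are hypothetical names, and keying on `id()` is an assumption that only works because each proxy keeps its original alive.

```python
import threading
import weakref


class Proxy:
    """Stand-in for a bridge proxy; it keeps the original object alive."""

    def __init__(self, original):
        self.original = original


class ProxyRegistry:
    """Maps an object to its single proxy, creating the proxy on demand."""

    def __init__(self):
        self._lock = threading.Lock()
        # id(original) -> proxy; an entry disappears once its proxy is freed.
        self._proxies = weakref.WeakValueDictionary()

    def proxy_for(self, original):
        key = id(original)
        # Lookup and insert happen under one lock so that two threads can
        # never create two different proxies for the same original.
        with self._lock:
            proxy = self._proxies.get(key)
            if proxy is None:
                proxy = Proxy(original)
                self._proxies[key] = proxy
            # The caller receives a strong reference, never a borrowed one.
            return proxy
```

Without the lock, two threads could both miss the lookup and each create a proxy, which would break the "same object, same proxy" guarantee that the mappings exist to provide.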
(Still 19 files to go for the audit)
OC_Python* classes intentionally do not use locking for accessing their state. That's primarily because these values are immutable. There's a small window for seeing incomplete values in the initWithCoder: implementation, but that's not something one can fix with locking.
Getting closer, but not quite there yet: a fairly trivial script using threading fails without the GIL and passes with it enabled (basically iterating over an NSArray instance concurrently in a couple of threads).
I've added some standalone test scripts that demonstrate the problems.
Those problems are now fixed, feeling better about this...
Hmmm.... The overhead for free-threading is large enough to slow things down a lot when running without a GIL. (That said, when I don't convert the list to an NSArray I get a similar slowdown.)
% time python -Xgil=1 concurrent-sum.py; time python -Xgil=0 concurrent-sum.py
python -Xgil=1 concurrent-sum.py 11.29s user 22.32s system 192% cpu 17.436 total
python -Xgil=0 concurrent-sum.py 15.10s user 32.63s system 534% cpu 8.926 total
The script:
import threading
from Cocoa import NSObject, NSArray
import objc

N_THREAD = 8

v = NSObject.alloc().init()
t_list = []
r_list = []


def f(a, b):
    # Wait until every thread is ready, then iterate over the shared array.
    b.wait()
    s = 0
    for i in a:
        s += i


bar = threading.Barrier(N_THREAD)
arr = NSArray.arrayWithArray_(list(range(1000000)))

# Start all worker threads on the same NSArray instance.
for _ in range(N_THREAD):
    t = threading.Thread(target=f, args=(arr, bar))
    t_list.append(t)
    t.start()

for t in t_list:
    t.join()
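For comparison, here is a variant of the script that iterates over a plain Python list, so it runs without PyObjC installed. This is a sketch that reads the remark about not converting to an NSArray as referring to a plain list (an assumption); `concurrent_sum` is a hypothetical name, and unlike the script above it also records the per-thread sums and the elapsed wall-clock time.

```python
import threading
import time


def concurrent_sum(values, n_threads=8):
    """Sum `values` in n_threads threads at once; return (sums, seconds)."""
    barrier = threading.Barrier(n_threads)
    results = [None] * n_threads

    def worker(index):
        # Line all threads up so that they iterate concurrently.
        barrier.wait()
        total = 0
        for item in values:
            total += item
        # Each thread writes to its own slot, so no lock is needed here.
        results[index] = total

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start
```

Running `concurrent_sum(list(range(1000000)))` once with `-Xgil=1` and once with `-Xgil=0` on a free threaded build gives a baseline for how much of the difference comes from the interpreter itself rather than from the bridge.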
Recently ran tests with a TSAN enabled build of both PyObjC and CPython, this found a number of problems (some of which are in CPython itself). The PyObjC ones are mostly fixed, which required rewriting the handling for initializer methods in Objective-C...
091f26944aa38061eca3d0e2345495f75002bd91, and before that 649a0206f30cd343dfc3daa01e72f6c75ff902f9 fix a race condition related to the proxy registry. The former changeset only works with Python 3.14 due to using a new API in that version.