bitarray icon indicating copy to clipboard operation
bitarray copied to clipboard

Support for free-threading CPython

Open rgommers opened this issue 9 months ago • 9 comments

Hi @ilanschnell 👋🏼. We working on support for free-threaded CPython (see, e.g., PEP 703 and https://py-free-threading.github.io/) and found bitarray in one of the dependency trees we're looking at.

Here is what adding support for free-threading typically takes:

  • [ ] Add a regular test job that runs the free-threaded interpreter in CI. Use pytest-run-parallel to find potential issues (global state and other potential thread safety issues), and fix them. Note: bitarray currently only has tests running in wheel build jobs, so perhaps adding this in CI is not desirable - it can be done locally as well.
  • [ ] Audit Python bindings and declare them free-threading compatible (xref https://py-free-threading.github.io/porting-extensions/).
  • [ ] Run the test suite under ThreadSanitizer (locally, once, not in CI - just scan for issues).
  • [ ] Update the CI job that builds wheels to include free-threading wheels (cp313t ABI tag).

It'd be great for bitarray to gain support, and @haozeke should have bandwidth soon to contribute this, if that sounds good to you. If so, any pointers would be very welcome e.g., (things you know may be problematic and would need testing/changing).

rgommers avatar May 13 '25 19:05 rgommers

Hi Ralf! Thanks for taking the initiative to support free-threaded CPython. I am not very familiar with free-threaded CPython and will start by reading up on the links you included. This sounds like a mayor project, but luckily bitarray has an extensive test suite, which I keep improving. This will allow making sure that nothing breaks while adding free-threaded support. As long as the changes required are not too invasive and the performance of standard CPython isn't suffering, I'm all for it!

ilanschnell avatar May 14 '25 23:05 ilanschnell

Thanks Ilan! I expect it won't be too bad (🤞🏼) - it was a lot of work figuring out how to add free-threading support to new projects when we started, but now all the patterns and test tools have been worked out. So the main thing is dealing with bitarray-specific thread-safety issues.

rgommers avatar May 16 '25 03:05 rgommers

So far there's one flakey data race @ngoldbaum found here and a bug which seems to be unrelated to data races at the very least.

unittest-ft -r -f -s bitarray.test_bitarray                                                                                                                                                   
................................................................................................................................................................................................
...................................................................................................................................F                                                            
=================================================================================================================                                                                               
FAIL: test_bitarray_shared_sections (bitarray.test_bitarray.BufferImportTests.test_bitarray_shared_sections) (x1)                                                                               
-----------------------------------------------------------------------------------------------------------------                                                                               
Traceback (most recent call last):                                                              
  File "/home/rgoswami/Git/Github/Quansight/freeThreading/bitarray/.uvftci/lib/python3.13t/site-packages/bitarray/test_bitarray.py", line 4921, in test_bitarray_shared_sections                
    self.assertEqual(a[8 * 0x100 : 8 * 0x300], b)                                               
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                               
AssertionError: bitarray('1001110101110110110111011000101001100000000[4050 chars]011') != bitarray('1011100101101110101110110101000100000110111[4050 chars]010')                                
-----------------------------------------------------------------------------------------------------------------                                                                               
Ran 324 tests in 238.74ms (saved 1.301s)                                                                                                                                                        
                                                                                                                                                                                                
FAILED (failures=1)                                                                          

HaoZeke avatar Jun 15 '25 11:06 HaoZeke

@HaoZeke I've approved your workflow runs for PR https://github.com/ilanschnell/bitarray/pull/235. I'm not sure why test_bitarray_shared_sections is failing (not sure what is done exactly by running the tests with unittest-ft). What the test does is creating bitarray objects (b and c) which point to different sections of the same memory buffer of bitarray a.

ilanschnell avatar Jun 15 '25 18:06 ilanschnell

Hi! I think I'm going to take over finishing #235. @HaoZeke I hope you don't mind. I think I'm going to squash and rebase Rohit's changes but otherwise will maintain his authorship on the first batch of work without edit.

I think there's still one big issue that we can straightforwardly fix. @ilanschnell - what do you think about replacing the default_endian static global with a context variable? There's also a C API for working with them from C. That way you could expose the endianness control publicly as a context manager:

from bitarray import default_endian
with default_endian('little'):
    ...

We did similar refactors over in NumPy to store the floating point error configuration and the printoptions configuration and it seems to work well. A nice benefit of storing state like this in a context variable is that the interface automatically becomes both thread-safe and async-safe.

ngoldbaum avatar Oct 23 '25 17:10 ngoldbaum

So far there's one flakey data race @ngoldbaum found here and a bug which seems to be unrelated to data races at the very least.

I suspect this failure is happening because of https://github.com/python/cpython/issues/127716 - memoryview itself has thread safety issues at the moment.

I'm not sure what the upstream status is, I pinged one of the CPython core devs on my team for more information. It looks like there isn't an easy way to exclude tests from multithreaded testing with unittest-ft (https://github.com/amyreese/unittest-ft/issues/42, unless I'm missing something), so it may not be the best choice here to enable this.

Instead, I think it makes more sense to try writing an explicitly multithreaded test. I'll take a look at that next.

what do you think about replacing the default_endian static global with a context variable?

I went ahead and opened https://github.com/ilanschnell/bitarray/pull/245 so you can see what this approach looks like.

ngoldbaum avatar Oct 23 '25 19:10 ngoldbaum

I suspect this failure is happening because of https://github.com/python/cpython/issues/127716 - memoryview itself has thread safety issues at the moment.

It turns out this is completely wrong.

Instead it's a race between tests that use _set_default_endian and test_bitarray_shared_sections. If I comment out all other tests in that file besides test_bitarray_shared_sections and _set_default_endian, I can trigger the failure.

Even if _set_default_endian changes the state in a context variable, it's still changing the state for the global default context, and that's thread-unsafe. The only safe way to isolate this stuff with a context variable is by using a context manager.

This was enough motivation for me to define a new default_endian context manager and include it with #245. With all those changes I no longer see any test failures or TSan reports when I run the test suite under unittest-ft.

ngoldbaum avatar Oct 29 '25 17:10 ngoldbaum

I had a good call with @ngoldbaum today, and we decided that it would be easiest to remove _set_default_endian() entirely, so we don't have to worry about any module state. This change, as well as declaring free-threaded support are going into bitarray 3.8.0, which I want to release in the next few days.

Also, removing set_default_endian() fixes unittest-ft.

ilanschnell avatar Nov 01 '25 01:11 ilanschnell

I just tagged bitarray 3.8.0. Free-theading (exoerimental) wheels are now available on PyPI!

ilanschnell avatar Nov 02 '25 21:11 ilanschnell