Race condition in `align`?
What happened?
While stress-testing a personal xarray-based project with pytest-run-parallel, which runs my test suite many times in parallel over different threads, I'm frequently getting a race condition: https://github.com/crusaderky/pathfinder2e_stats/actions/runs/19407113819/job/55523433199
pathfinder2e_stats/damage.py:381: in damage
_, persistent_damage_DC = xarray.align(
.pixi/envs/nogil/lib/python3.14t/site-packages/xarray/structure/alignment.py:967: in align
aligner.align()
.pixi/envs/nogil/lib/python3.14t/site-packages/xarray/structure/alignment.py:667: in align
self.reindex_all()
.pixi/envs/nogil/lib/python3.14t/site-packages/xarray/structure/alignment.py:638: in reindex_all
self.results = tuple(
.pixi/envs/nogil/lib/python3.14t/site-packages/xarray/structure/alignment.py:625: in _reindex_one
dim_pos_indexers = self._get_dim_pos_indexers(matching_indexes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.pixi/envs/nogil/lib/python3.14t/site-packages/xarray/structure/alignment.py:556: in _get_dim_pos_indexers
indexers = obj_idx.reindex_like(aligned_idx, **self.reindex_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = PandasIndex(Index(['bleed', 'fire', 'electricity'], dtype='str', name='damage_type'))
other = PandasIndex(Index(['bleed', 'cold', 'fire'], dtype='str', name='damage_type'))
method = None, tolerance = None
def reindex_like(
self, other: Self, method=None, tolerance=None
) -> dict[Hashable, Any]:
if not self.index.is_unique:
> raise ValueError(
f"cannot reindex or align along dimension {self.dim!r} because the "
"(pandas) index has duplicate values"
)
E ValueError: cannot reindex or align along dimension 'damage_type' because the (pandas) index has duplicate values
Sadly, I cannot reproduce the failure locally. For some reason it happens only in CI, even though the whole stack and platform are identical between CI and my local box. For this reason, I'm unsure whether the issue is in xarray or in pandas.
The test that is failing in my project is calling
_, b2 = xarray.align(a, b, join="left")
where a is a thread-local Dataset and b is a DataArray that is defined in the global scope and is shared among all the threads that run xarray.align in parallel. The two objects are aligned along a string index, with object dtype in a and dtype=<U11 in b. All inputs are deterministic.
Minimal reproducer (which, as explained above, I have not managed to make fail):
import xarray
from numpy.testing._private.utils import run_threaded

# b is shared among all threads
b = xarray.DataArray([4, 5, 6], dims=["x"], coords={"x": ["b", "f", "c"]})

def f():
    # a is thread-local
    a = xarray.Dataset(coords={"x": ["b", "f", "e"]})
    a.coords["x"] = a.coords["x"].astype(object)
    _, b2 = xarray.align(a, b, join="left", fill_value=-1)

run_threaded(f)
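Since run_threaded comes from a private numpy module, a stdlib-only stand-in may be handy for anyone trying to reproduce this. The sketch below is an assumption about what such a harness could look like, not what pytest-run-parallel or numpy actually do: run_in_threads is a hypothetical helper that releases all threads from a barrier at the same time, repeats the call many times per thread, and collects exceptions instead of stopping at the first one.

```python
import threading


def run_in_threads(func, n_threads=8, n_iterations=100):
    """Call func() n_iterations times on each of n_threads threads.

    Hypothetical helper: returns the list of exceptions raised by any call.
    """
    barrier = threading.Barrier(n_threads)
    errors = []
    errors_lock = threading.Lock()

    def worker():
        # Wait until every thread is ready, so that all of them start
        # calling func() at (roughly) the same moment.
        barrier.wait()
        for _ in range(n_iterations):
            try:
                func()
            except Exception as exc:
                # Keep going after a failure; record it for later inspection.
                with errors_lock:
                    errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

In the reproducer above, errors = run_in_threads(f) would take the place of run_threaded(f); a non-empty list would mean that at least one call to xarray.align raised.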
My gut feeling is that there is a brief moment where the input DataArray b is temporarily updated in place. However, I've audited the xarray code and did not spot anything untoward.
From what I understand:
- align calls Aligner.reindex_all,
- which calls Aligner._reindex_one,
- which calls DataArray._reindex_callback on b,
- which calls DataArray._to_temp_dataset -> Dataset._reindex_callback -> DataArray._from_temp_dataset
Notably, the Variable instances in Dataset._reindex_callback are the same objects as those in the shared b object.
Environment
- python-freethreading 3.14.0
- pandas 3.0.0.dev0+2714.gfa5b90a079
- xarray 2025.10.1
- ubuntu-latest github actions CI runners
I'm inclined to blame pandas (https://github.com/pandas-dev/pandas/issues/2728, https://github.com/pydata/xarray/issues/9836) and I suspect there's at least one shallow copy somewhere in that align code path.
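To test the pandas suspicion in isolation, one option is a stress test with no xarray involved. The sketch below is only an assumption about which pandas calls matter: it exercises Index.is_unique (the check that fails in the traceback) and Index.get_indexer on an index shared between threads. stress_shared_index and its parameters are hypothetical names, and a clean run would not prove pandas innocent.

```python
import threading

import pandas as pd


def stress_shared_index(n_threads=8, n_rounds=50):
    """Hammer one shared pandas Index from several threads.

    Hypothetical helper: returns the list of failures seen in any thread.
    """
    failures = []
    failures_lock = threading.Lock()

    for _ in range(n_rounds):
        # A fresh index per round, so nothing is cached from earlier rounds.
        shared = pd.Index(["b", "f", "c"], name="x")
        # Object dtype mirrors the thread-local side of the report.
        target = pd.Index(["b", "f", "e"], dtype=object, name="x")
        barrier = threading.Barrier(n_threads)

        def worker():
            barrier.wait()  # start all threads together
            try:
                if not shared.is_unique:
                    raise AssertionError("is_unique returned False")
                indexer = shared.get_indexer(target).tolist()
                # "b" and "f" are at positions 0 and 1; "e" is missing.
                if indexer != [0, 1, -1]:
                    raise AssertionError(f"unexpected indexer {indexer!r}")
            except Exception as exc:
                with failures_lock:
                    failures.append(exc)

        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return failures
```

If this returns a non-empty list on the free-threaded build, the problem can be reported against pandas directly; if it stays empty, the custom-index experiment on the xarray side becomes the more informative test.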
One way to check would be to write a dummy custom index class with no Pandas involved.
My comments on that issue come from xarray usage. If I remember correctly, the examples there are the functions that xarray calls.