Tom Augspurger
There's just one remaining error, which I'm not able to reproduce locally: ``` dask/dataframe/dask_expr/tests/test_reductions.py::test_value_counts_sort: AssertionError: Series.index are different Series.index values are different (53.33333 %) [left]: Index([1, 2, 3, 5, 4,...
I think this was closed by #12191
> The problem is that we only produce empty channels when we are running in distributed mode and the number of ranks is larger than the number of chunks. If...
OK, I think that leaves us with multiple ranks on a single GPU (doable, but doesn't nicely integrate with the rest of our CI setup) and somehow injecting an empty...
I suspect this falls under the "pandas compatibility note" at the bottom of https://docs.rapids.ai/api/cudf/stable/user_guide/api_docs/api/cudf.merge/ > DataFrames merges in cuDF result in non-deterministic row ordering.
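If row order is the only difference, one way to compare results is to sort both frames on the join key before asserting equality. The sketch below uses pandas as a CPU stand-in for cuDF, and the frame contents and the `normalize` helper are made up for illustration; it is not taken from the report.

```python
import pandas as pd

# Hypothetical inputs, only for illustration.
left = pd.DataFrame({"key": [1, 2, 3], "a": [10, 20, 30]})
right = pd.DataFrame({"key": [3, 1, 2], "b": [0.3, 0.1, 0.2]})

merged = left.merge(right, on="key")

# Simulate a backend that returns the same rows in a different order.
shuffled = merged.sample(frac=1, random_state=0)


def normalize(df, by):
    """Sort on the given columns and drop the old index so row order is canonical."""
    return df.sort_values(by).reset_index(drop=True)


# Passes regardless of the row order each merge happened to produce.
pd.testing.assert_frame_equal(
    normalize(merged, ["key"]),
    normalize(shuffled, ["key"]),
)
```

This only works when the sort columns identify each row uniquely; with duplicate keys, sort on enough columns to break ties.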
Ah, my apologies for not reading closely enough.
Thanks for the report. We could perhaps put https://github.com/dask/dask/blob/3801bedc7c71c83f37e836af71f740974c0434b3/dask/dataframe/dask_expr/_reductions.py#L1177 inside a `with np.errstate`. I'm not sure it'd be worth the performance cost, and I'm not sure whether there are warnings...
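For reference, a minimal sketch of what wrapping a computation in `np.errstate` looks like. The division here is a stand-in, not the operation at the linked line, and which error categories would need silencing there is an assumption.

```python
import warnings

import numpy as np

values = np.array([1.0, 0.0, -1.0])

# Record any warnings so we can check that none escape the errstate block.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Suppress floating-point warnings for divide-by-zero and invalid (0/0)
    # operations only within this block.
    with np.errstate(divide="ignore", invalid="ignore"):
        result = values / 0.0

print(result)
print(len(caught))
```

The context manager restores the previous error-handling state on exit, so code outside the block still warns as before.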
Thanks for the report. Looking through the [changelog](https://github.com/dask/dask/releases/tag/2025.5.0), https://github.com/dask/dask/pull/11945 and https://github.com/dask/dask/pull/11935 strike me as possible candidates, but it's hard to say for sure without bisecting.
Added https://github.com/rapidsai/cudf/issues/15852 as a sub-task, since we'll have a behavior change (from not matching polars to matching polars). We'll at least need a decision there on whether we want to...
We also hit some pickling issues in https://github.com/rapidsai/rapids-dask-dependency/pull/132. Possibly from the multiprocessing mode change (https://docs.python.org/3/whatsnew/3.14.html#multiprocessing), though I didn't investigate things.
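As a hedged illustration of why a start-method change could surface pickling errors: with the `spawn` and `forkserver` start methods, objects sent to a worker process are pickled, so anything that can't be pickled (a lambda, for example) fails where `fork` might have worked. This is only one plausible mechanism and has not been confirmed as the cause in that PR; `can_pickle` is a hypothetical helper.

```python
import multiprocessing as mp
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


# The default start method depends on the platform and Python version.
print("default start method:", mp.get_start_method())

# Built-in functions are pickled by reference and work with any start method.
print("len picklable:", can_pickle(len))

# Lambdas have no importable name, so they can't be pickled and can't be
# sent to workers started with "spawn" or "forkserver".
print("lambda picklable:", can_pickle(lambda x: x * x))
```

Requesting a specific method with `mp.get_context("fork")` is one way to check whether a failure depends on the start method.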