[BUG]: Dtype mismatch between partitioned and non-partitioned aggregation with experimental streaming executor in some aggregations
Describe the bug
With the streaming executor, the snippet below returns a result whose dtype depends on whether the input is split into more than one partition.
Steps/Code to reproduce bug
import polars as pl
df = pl.LazyFrame({"a": [-10, 4, 5, 2, 3, 6, 8, 9, 4, 4, 5, 2, 3, 7, 3, 6, -10, -11]})
q = df.select(pl.col("a").n_unique())
print("cpu :", q.collect().dtypes)
print("gpu-single :", q.collect(engine=pl.GPUEngine(executor="streaming")).dtypes)
print("gpu-partitioned:", q.collect(engine=pl.GPUEngine(executor="streaming", executor_options={"max_rows_per_partition": 9})).dtypes)
Expected behavior
All three runs (CPU, GPU single-partition, GPU multi-partition) should report the same dtypes.
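That expectation can be written as an explicit check. This is a minimal sketch, assuming a hypothetical helper `assert_same_dtypes` (not part of polars or cudf) that takes a mapping from a run label to that run's `.dtypes` list and fails if any run disagrees with the first one:

```python
def assert_same_dtypes(dtypes_by_label):
    """Raise AssertionError if the labelled dtype lists are not all equal.

    Hypothetical helper for illustration; not part of polars or cudf.
    """
    labels = list(dtypes_by_label)
    if not labels:
        return  # nothing to compare

    # Use the first run (e.g. the CPU result) as the reference.
    reference_label = labels[0]
    reference = list(dtypes_by_label[reference_label])

    # Collect every run whose dtypes differ from the reference.
    mismatches = {
        label: list(dtypes_by_label[label])
        for label in labels[1:]
        if list(dtypes_by_label[label]) != reference
    }
    if mismatches:
        raise AssertionError(
            f"dtype mismatch: {reference_label}={reference}, others={mismatches}"
        )
```

With the reproducer above, it could be called as `assert_same_dtypes({"cpu": q.collect().dtypes, "gpu-single": ..., "gpu-partitioned": ...})`, passing the `.dtypes` of each of the three collected results.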
Additional context
This is possibly a duplicate of https://github.com/rapidsai/cudf/issues/15852, but I'm opening a separate issue because I'm surprised to see the streaming executor's output differ depending on whether there's one partition or several.