
[FEA] Support Polars `quantile` expression in `group_by` context.

Open taozuoqiao opened this issue 1 year ago • 1 comment

With the GPU engine, the `quantile` expression is already supported in the LazyFrame namespace and in a `select` context, but it is not supported in a `group_by` context or in the GroupBy namespace:

import polars as pl

lf = pl.LazyFrame(
    {
        "a": [1, 2, 3, 4],
        "b": [1, 1, 1, 1],
    }
)
print('cpu:')
print(lf.quantile(0.5).collect())
print(lf.select(pl.col('a').quantile(0.5)).collect())
print(lf.group_by('b').agg(pl.col('a').quantile(0.5)).collect())
print(lf.group_by('b').quantile(0.5).collect())

print('gpu:')
with pl.Config() as cfg:
    cfg.set_verbose(True)
    print(lf.quantile(0.5).collect(engine='gpu'))
    print(lf.select(pl.col('a').quantile(0.5)).collect(engine='gpu'))
    print(lf.group_by('b').agg(pl.col('a').quantile(0.5)).collect(engine='gpu'))
    print(lf.group_by('b').quantile(0.5).collect(engine='gpu'))

The output is:

cpu:
shape: (1, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ f64 │
╞═════╪═════╡
│ 3.0 ┆ 1.0 │
└─────┴─────┘
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 3.0 │
└─────┘
shape: (1, 2)
┌─────┬─────┐
│ b   ┆ a   │
│ --- ┆ --- │
│ i64 ┆ f64 │
╞═════╪═════╡
│ 1   ┆ 3.0 │
└─────┴─────┘
shape: (1, 2)
┌─────┬─────┐
│ b   ┆ a   │
│ --- ┆ --- │
│ i64 ┆ f64 │
╞═════╪═════╡
│ 1   ┆ 3.0 │
└─────┴─────┘
gpu:
shape: (1, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ f64 │
╞═════╪═════╡
│ 3.0 ┆ 1.0 │
└─────┴─────┘
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 3.0 │
└─────┘
/opt/miniconda3_py310/lib/python3.10/site-packages/polars/lazyframe/frame.py:2053: PerformanceWarning: Query execution with GPU not supported, reason: <class 'ValueError'>: too many values to unpack (expected 1)
  return wrap_df(ldf.collect(callback))
keys/aggregates are not partitionable: running default HASH AGGREGATION
shape: (1, 2)
┌─────┬─────┐
│ b   ┆ a   │
│ --- ┆ --- │
│ i64 ┆ f64 │
╞═════╪═════╡
│ 1   ┆ 3.0 │
└─────┴─────┘
keys/aggregates are not partitionable: running default HASH AGGREGATION
shape: (1, 2)
┌─────┬─────┐
│ b   ┆ a   │
│ --- ┆ --- │
│ i64 ┆ f64 │
╞═════╪═════╡
│ 1   ┆ 3.0 │
└─────┴─────┘

taozuoqiao, Oct 18 '24 09:10

Thanks for reporting! We'll look into this.

vyasr, Oct 18 '24 18:10