enh: allow for `.over()`

Open MarcoGorelli opened this issue 10 months ago • 4 comments

We should allow

nw.col('a').sum().over()

e.g. for data {'a': [1,2,3], 'b': [4,5,6]}, df.select(nw.col('a').sum().over(), 'b') should produce {'a': [6, 6, 6], 'b': [4, 5, 6]}
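
For illustration, here is a minimal pure-Python sketch of the intended semantics: aggregate the whole column, then broadcast the result to every row. It does not use narwhals or Polars, and `sum_over_all` is a hypothetical helper, not part of either library.

```python
def sum_over_all(values):
    """Hypothetical helper: aggregate the whole column, then broadcast.

    Mimics what `sum().over()` with no partition would mean:
    a single window covering every row.
    """
    total = sum(values)
    return [total] * len(values)


data = {"a": [1, 2, 3], "b": [4, 5, 6]}

# 'a' is replaced by its broadcast sum, 'b' passes through unchanged.
result = {"a": sum_over_all(data["a"]), "b": data["b"]}
print(result)  # {'a': [6, 6, 6], 'b': [4, 5, 6]}
```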

MarcoGorelli, Feb 19 '25 14:02

This is not supported in polars either 🤔

import polars as pl

data = {"a": [5, 4, 3, 2, 1]}

pl.DataFrame(data).with_columns(a_max=pl.col("a").max().over())

TypeError: Expr.over() missing 1 required positional argument: 'partition_by'


Unrelated: I just noticed that the nw.Expr.over signature is quite different from polars.Expr.over

FBruzzesi, Feb 24 '25 08:02

thanks - yeah we should align them

and I think we should allow this in Polars too, so that people can write

In [9]: df = pl.DataFrame({'a': [1,1,2], 'b': [4,5,6], 'c': [2, 1, 3]})

In [10]: df.with_columns(d=pl.col('a').cum_sum().over(order_by='c'))
Out[10]:
shape: (3, 4)
┌─────┬─────┬─────┬─────┐
│ a   ┆ b   ┆ c   ┆ d   │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╡
│ 1   ┆ 4   ┆ 2   ┆ 2   │
│ 1   ┆ 5   ┆ 1   ┆ 1   │
│ 2   ┆ 6   ┆ 3   ┆ 4   │
└─────┴─────┴─────┴─────┘

(currently, it requires df.with_columns(d=pl.col('a').cum_sum().over(pl.lit(1), order_by='c')))
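
As a rough sketch of what `cum_sum().over(order_by='c')` computes in the example above, here is a pure-Python version: sort the rows by `c`, take the running sum of `a`, and write each result back to its original row position. `cum_sum_ordered` is a hypothetical helper for illustration only, not Polars or narwhals API.

```python
from itertools import accumulate


def cum_sum_ordered(values, order_by):
    """Hypothetical helper: cumulative sum evaluated in `order_by` order,
    with results returned in the original row order."""
    # Row indices sorted by the ordering column.
    order = sorted(range(len(values)), key=lambda i: order_by[i])
    # Running totals in that sorted order.
    running = accumulate(values[i] for i in order)
    # Scatter each running total back to its original row.
    out = [None] * len(values)
    for i, total in zip(order, running):
        out[i] = total
    return out


a = [1, 1, 2]
c = [2, 1, 3]
d = cum_sum_ordered(a, c)
print(d)  # [2, 1, 4]
```

This matches column `d` in the output above: the row with `c == 1` comes first, so it gets 1, then the row with `c == 2` gets 2, then the row with `c == 3` gets 4.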

MarcoGorelli, Feb 24 '25 09:02

+1 for aligning the signature of Expr.over to recent polars. I was trying to write a basic window function (e.g. count("a") over (partition by "a" order by "b" asc)) and couldn't get a proper order by to work with the existing method.

rwhitten577, Mar 11 '25 18:03

@MarcoGorelli this is currently possible with order-dependent ops, but not with general aggregations.

I am not sure which sweet spot you want to reach 👀

FBruzzesi, Apr 08 '25 15:04

this was rejected in Polars: you have to specify at least one of order_by or partition_by

MarcoGorelli, Jun 24 '25 09:06