
Add `Expr.scatter`

Open wukan1986 opened this issue 1 year ago • 4 comments

Description

Currently `scatter` is only available on Series. It would be more useful if it were also available on expressions.

wukan1986 avatar Dec 17 '23 15:12 wukan1986

scatter takes two arguments: the indices, and the new values. I think you're missing the values from your example

going to cc @orlp on this one

MarcoGorelli avatar Dec 17 '23 16:12 MarcoGorelli

Maybe

pl.full_like(pl.col('A'), fill_value=None).scatter(pl.col('A').arg_not_null(), pl.col('A').drop_nulls())

or

pl.scatter(pl.col('A').arg_not_null(), pl.col('A').drop_nulls())

A fuller example of the proposed usage:

import polars as pl

df = pl.DataFrame(
    {
        "A": [5, None, 3, 2, 1],
        "B": [5, 3, None, 2, 1],
    }
)

df = df.with_columns([pl.lit(None).alias("nulls")])
print(df)

df = df.with_columns([
    pl.col("nulls").scatter(pl.col('A').arg_not_null(), pl.col('A').drop_nulls()).alias("C"),
])

print(df)

wukan1986 avatar Dec 18 '23 00:12 wukan1986

I think allowing arbitrary expressions with scatter is useful, and shouldn't be too difficult to add. I'd like to keep the current signature though, so `Expr.scatter(indices, values)`, where indices and values have equal length (or values has length one and is broadcast).

orlp avatar Dec 18 '23 01:12 orlp
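
For readers unfamiliar with scatter, here is a minimal plain-Python sketch of the semantics described in the comment above: write `values` into a copy of a column at the positions given by `indices`, where the two have equal length or `values` has length one and is broadcast. The helper name `scatter` and the list-based representation are assumptions made for illustration only. This is not the polars implementation, and details such as out-of-range or negative indices may be handled differently there.

```python
from typing import Any, List, Sequence


def scatter(base: Sequence[Any], indices: Sequence[int], values: Sequence[Any]) -> List[Any]:
    """Hypothetical reference model: return a copy of `base` with `values` written at `indices`."""
    indices = list(indices)
    values = list(values)

    # Broadcast a single value across all indices.
    if len(values) == 1:
        values = values * len(indices)

    # Otherwise indices and values must line up one-to-one.
    if len(values) != len(indices):
        raise ValueError("indices and values must have equal length, or values must have length one")

    out = list(base)
    for i, v in zip(indices, values):
        out[i] = v
    return out


# Mirror the example from the thread: start from an all-null column and
# scatter the non-null values of A back to their original positions.
a = [5, None, 3, 2, 1]
nulls = [None] * len(a)
arg_not_null = [i for i, v in enumerate(a) if v is not None]
drop_nulls = [v for v in a if v is not None]

c = scatter(nulls, arg_not_null, drop_nulls)
print(c)  # [5, None, 3, 2, 1]

# Broadcasting a single value to several positions.
print(scatter([0, 0, 0, 0], [1, 3], [9]))  # [0, 9, 0, 9]
```

Under this model, column `C` in the example would end up equal to `A`, since every non-null value is written back to the position it came from.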

@wukan1986 I edited the original post to just focus on the feature request instead of the specific example.

orlp avatar Dec 18 '23 01:12 orlp