enh: improve when-then-otherwise to include chaining

Open aivanoved opened this issue 1 year ago • 4 comments

We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?

No response

Please describe the purpose of the new feature or describe the problem to solve.

As discussed in #588, where a simple when-then-otherwise expression is added, the case for chaining when-then expressions is left to be done at some later point. This will complete and close #47

Suggest a solution if possible.

No response

If you have tried alternatives, please describe them below.

No response

Additional information that may help us understand your needs.

I would be willing to write this code, as I'm looking to complete the open #588 and this is a natural extension.

Initial draft proposal can be found at #669

aivanoved avatar Jul 29 '24 13:07 aivanoved

@dangotbanned in your rewrite of when/then i recall you'd said that this might be an easy follow-up - just flagging it in case it interests you, as there's been another request for it (#2597)

else no worries, i'll get to it before too long

MarcoGorelli avatar May 23 '25 10:05 MarcoGorelli

Ah thanks for the ping @MarcoGorelli!

Yeah so I might need to refresh myself a bit, but IIRC the latest version of all backends would support:

I'd happily review a PR, but probably won't get around to writing one myself for a while

Edit

In (#2572) the Ternary node is on my todo list - so I'll factor chaining in and may be able to provide an outline for how the narwhals side would work

https://github.com/narwhals-dev/narwhals/blob/51fb46e0504f76044033eea0af43d5bfa2439e13/narwhals/_plan/expr.py#L478-L482

There's also (https://github.com/vega/altair/pull/3427) which could be used for inspiration 🙂

But for narwhals I recommend sticking with what polars does with When, Then, ChainedWhen, ChainedThen - instead of the shortcut I took by using only 3x classes
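
To make the four-class shape concrete, here is a minimal pure-Python sketch of how such a builder could hang together. It is not the narwhals or polars implementation: the `Fn` alias, the `Ternary` node and the module-level `when` helper are all hypothetical stand-ins, with plain callables playing the role of expressions. The point is only that `Then` and `ChainedThen` are the states that accept `.otherwise(...)`, while `When` and `ChainedWhen` only accept `.then(...)`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-in for an expression: a callable evaluated on one value.
Fn = Callable[[Any], Any]


@dataclass(frozen=True)
class Ternary:
    """Nested if/else node; `falsy` may itself be another Ternary."""

    predicate: Fn
    truthy: Fn
    falsy: Fn

    def __call__(self, value: Any) -> Any:
        branch = self.truthy if self.predicate(value) else self.falsy
        return branch(value)


@dataclass(frozen=True)
class When:
    condition: Fn

    def then(self, statement: Fn) -> Then:
        return Then(self.condition, statement)


@dataclass(frozen=True)
class Then:
    condition: Fn
    statement: Fn

    def when(self, condition: Fn) -> ChainedWhen:
        # Switching to the chained classes: one more condition than statements.
        return ChainedWhen((self.condition, condition), (self.statement,))

    def otherwise(self, statement: Fn) -> Ternary:
        return Ternary(self.condition, self.statement, statement)


@dataclass(frozen=True)
class ChainedWhen:
    conditions: tuple[Fn, ...]
    statements: tuple[Fn, ...]

    def then(self, statement: Fn) -> ChainedThen:
        return ChainedThen(self.conditions, (*self.statements, statement))


@dataclass(frozen=True)
class ChainedThen:
    conditions: tuple[Fn, ...]
    statements: tuple[Fn, ...]

    def when(self, condition: Fn) -> ChainedWhen:
        return ChainedWhen((*self.conditions, condition), self.statements)

    def otherwise(self, statement: Fn) -> Ternary:
        # Fold last-to-first so the first condition becomes the outermost node.
        result: Fn = statement
        for cond, stmt in zip(reversed(self.conditions), reversed(self.statements)):
            result = Ternary(cond, stmt, result)
        return result


def when(condition: Fn) -> When:
    return When(condition)


label = (
    when(lambda x: x > 10).then(lambda x: "big")
    .when(lambda x: x > 3).then(lambda x: "medium")
    .otherwise(lambda x: "small")
)
print([label(v) for v in (1, 5, 12)])  # ['small', 'medium', 'big']
```

Because the types only expose `.then` after a `.when` and vice versa, an incomplete chain such as `when(...).when(...)` is rejected by a type checker rather than at evaluation time.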

dangotbanned avatar May 23 '25 11:05 dangotbanned

@MarcoGorelli I've just finished the ExprIR version in (https://github.com/narwhals-dev/narwhals/blob/563076d5e22ce9faccfdce5f99ada865bd13f9db/narwhals/_plan/when_then.py)

The majority of it is direct from the rust version - which is pretty clever and simple. In particular, the way ChainedThen.otherwise works made me smile 😄

I'm still expecting the "real" version to have more complexity - but this should be helpful whenever someone takes a shot at it 🙌

dangotbanned avatar May 25 '25 17:05 dangotbanned

After having used pyarrow.compute.case_when in (https://github.com/narwhals-dev/narwhals/pull/2598/commits/76373df554356ac3695eb1177dee18177d4fa6a3), I'm gonna recommend not using it here.

Do something like this instead:

from __future__ import annotations

from typing import TYPE_CHECKING

import pyarrow as pa
import pyarrow.compute as pc

if TYPE_CHECKING:
    from typing import Sequence

    from narwhals._arrow.typing import ArrayOrScalar, ChunkedArrayAny
    from narwhals.typing import NonNestedLiteral


def pyarrow_when_then_otherwise_chained(
    conditions: Sequence[ChunkedArrayAny],
    statements: Sequence[ChunkedArrayAny],
    statement: ArrayOrScalar | NonNestedLiteral = None,
):
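    # Walk the branches last-to-first so the first condition ends up outermost
    # (first match wins).
    whens = reversed(conditions)
    thens = reversed(statements)
    # With no explicit `otherwise`, fall back to an all-null array of the
    # last statement's type.
    otherwise = (
        pa.nulls(len(conditions[-1]), statements[-1].type)
        if statement is None
        else statement
    )
    for when in whens:
        # Each step wraps the result so far as the "else" branch.
        otherwise = pc.if_else(when, next(thens), otherwise)
    return otherwise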

That's closer to both what we have already and the generalized version mentioned in (https://github.com/narwhals-dev/narwhals/issues/668#issuecomment-2907968471).

We'd also be able to do something very similar for pandas versions that predate pd.Series.case_when (added in pandas 2.2).
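
A rough sketch of what that pandas variant might look like, assuming `Series.where` plays the role of `pc.if_else`. The function name and signature below are hypothetical and not part of narwhals; to keep it short, `otherwise` is required here, so the all-null fallback from the pyarrow version is left out.

```python
from __future__ import annotations

from typing import Any, Sequence

import pandas as pd


def pandas_when_then_otherwise_chained(
    conditions: Sequence[pd.Series],
    statements: Sequence[pd.Series],
    otherwise: Any,
) -> pd.Series:
    """Hypothetical helper: fold (condition, statement) pairs last-to-first."""
    if not conditions or len(conditions) != len(statements):
        msg = "expected one statement per condition, and at least one pair"
        raise ValueError(msg)
    result = otherwise
    for when, then in zip(reversed(conditions), reversed(statements)):
        # Keep `then` where `when` is True; elsewhere fall through to the
        # result built from the later branches.
        result = then.where(when, result)
    return result


x = pd.Series([1, 5, 12])
labels = pandas_when_then_otherwise_chained(
    conditions=[x > 10, x > 3],
    statements=[pd.Series(["big"] * 3), pd.Series(["medium"] * 3)],
    otherwise="small",
)
print(labels.tolist())  # ['small', 'medium', 'big']
```

As with the pyarrow version, folding in reverse means the first condition is applied last, so it wins when several conditions match (12 is labelled "big", not "medium").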

dangotbanned avatar May 28 '25 10:05 dangotbanned