enh: improve when-then-otherwise to include chaining
We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
No response
Please describe the purpose of the new feature or describe the problem to solve.
As discussed in #588, where a simple when-then-otherwise expression was added, chaining when-then expressions was left for a later point. This will complete and close #47.
Suggest a solution if possible.
No response
If you have tried alternatives, please describe them below.
No response
Additional information that may help us understand your needs.
I would be willing to write this code, as I'm looking to complete the open #588 and this is a natural extension.
Initial draft proposal can be found at #669
@dangotbanned in your rewrite of when/then i recall you'd said that this might be an easy follow-up - just flagging it in case it interests you, as there's been another request for it (#2597)
else no worries, i'll get to it before too long
Ah thanks for the ping @MarcoGorelli!
Yeah so I might need to refresh myself a bit, but IIRC the latest version of all backends would support:
- pyarrow is pretty hidden (and not annotated) in the pc.compute stubs, but pyarrow.compute.case_when
- pandas added pd.Series.case_when in 2.2.0
- dask requires pandas>=2.2.0, but has a CaseWhen node
- polars natively has pl.when
- ibis.cases - SQL-based backends I'm sure you know 😉
I'd happily review a PR, but probably won't get around to writing one myself for a while
Edit
In (#2572) the Ternary node is on my todo list - so I'll factor chaining in and may be able to provide an outline for how the narwhals side would work
https://github.com/narwhals-dev/narwhals/blob/51fb46e0504f76044033eea0af43d5bfa2439e13/narwhals/_plan/expr.py#L478-L482
There's also (https://github.com/vega/altair/pull/3427) which could be used for inspiration 🙂
But for narwhals I recommend sticking with what polars does with When, Then, ChainedWhen, ChainedThen - instead of the shortcut I took by using only 3x classes
@MarcoGorelli I've just finished the ExprIR version in (https://github.com/narwhals-dev/narwhals/blob/563076d5e22ce9faccfdce5f99ada865bd13f9db/narwhals/_plan/when_then.py)
The majority of it is direct from the rust version - which is pretty clever and simple.
In particular, the way ChainedThen.otherwise works made me smile 😄
I'm still expecting the "real" version to have more complexity - but this should be helpful whenever someone takes a shot at it 🙌
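To make the four-class shape concrete, here is a dependency-free sketch of how When, Then, ChainedWhen and ChainedThen could fit together. It is not the linked ExprIR code nor the polars source: the `Ternary` dataclass and all field names are assumptions for illustration, and conditions and statements are plain Python values rather than expressions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Ternary:
    """Hypothetical node: `truthy` where `predicate` holds, else `falsy`."""

    predicate: Any
    truthy: Any
    falsy: Any


@dataclass(frozen=True)
class When:
    condition: Any

    def then(self, statement: Any) -> Then:
        return Then(self.condition, statement)


@dataclass(frozen=True)
class Then:
    condition: Any
    statement: Any

    def when(self, condition: Any) -> ChainedWhen:
        # A second `when` switches to the chained classes, which hold sequences.
        return ChainedWhen((self.condition, condition), (self.statement,))

    def otherwise(self, statement: Any) -> Ternary:
        return Ternary(self.condition, self.statement, statement)


@dataclass(frozen=True)
class ChainedWhen:
    conditions: tuple[Any, ...]
    statements: tuple[Any, ...]

    def then(self, statement: Any) -> ChainedThen:
        return ChainedThen(self.conditions, (*self.statements, statement))


@dataclass(frozen=True)
class ChainedThen:
    conditions: tuple[Any, ...]
    statements: tuple[Any, ...]

    def when(self, condition: Any) -> ChainedWhen:
        return ChainedWhen((*self.conditions, condition), self.statements)

    def otherwise(self, statement: Any) -> Ternary:
        # Fold from the last branch back to the first, so that the first
        # condition ends up as the outermost Ternary.
        result = statement
        for cond, stmt in zip(reversed(self.conditions), reversed(self.statements)):
            result = Ternary(cond, stmt, result)
        return result


chained = When("a < 3").then("low").when("a < 12").then("mid").otherwise("high")
```

Here `chained` is a `Ternary` on `"a < 3"` whose `falsy` branch is another `Ternary` on `"a < 12"`, so a backend only ever needs to evaluate nested two-way choices.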
After having used pyarrow.compute.case_when in (https://github.com/narwhals-dev/narwhals/pull/2598/commits/76373df554356ac3695eb1177dee18177d4fa6a3), I'm gonna recommend not using it here.
Do something like this instead:
from __future__ import annotations

from typing import TYPE_CHECKING

import pyarrow as pa
import pyarrow.compute as pc

if TYPE_CHECKING:
    from typing import Sequence

    from narwhals._arrow.typing import ArrayOrScalar, ChunkedArrayAny
    from narwhals.typing import NonNestedLiteral


def pyarrow_when_then_otherwise_chained(
    conditions: Sequence[ChunkedArrayAny],
    statements: Sequence[ChunkedArrayAny],
    statement: ArrayOrScalar | NonNestedLiteral = None,
):
    # Walk the branches last-to-first so the first condition ends up outermost.
    whens = reversed(conditions)
    thens = reversed(statements)
    # With no explicit fallback, rows matching no condition become null.
    otherwise = (
        pa.nulls(len(conditions[-1]), statements[-1].type)
        if statement is None
        else statement
    )
    for when in whens:
        otherwise = pc.if_else(when, next(thens), otherwise)
    return otherwise
That's closer to both what we have already and the generalized version mentioned in (https://github.com/narwhals-dev/narwhals/issues/668#issuecomment-2907968471).
We'd also be able to do something very similar for pandas versions from before pd.Series.case_when was introduced (i.e. earlier than 2.2.0).
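A minimal sketch of what that pandas fallback might look like, assuming `Series.where` is used as the two-way choice. The helper name `chain_where` and the sample data are hypothetical, and this is not taken from the narwhals codebase.

```python
import pandas as pd


def chain_where(conditions, statements, otherwise=None):
    """Hypothetical helper: nest Series.where calls, last branch first."""
    result = otherwise
    for when, then in zip(reversed(conditions), reversed(statements)):
        if result is None:
            # No fallback given: let `where` fill unmatched rows as missing.
            result = then.where(when)
        else:
            # Keep `then` where the condition holds, else the inner result.
            result = then.where(when, result)
    return result


values = pd.Series([1, 5, 10, 20])
conditions = [values < 3, values < 12]
statements = [pd.Series(["low"] * 4), pd.Series(["mid"] * 4)]

bucketed = chain_where(conditions, statements, "high")
unmatched_missing = chain_where(conditions, statements)
```

This relies only on `Series.where`, so it does not need `pd.Series.case_when`.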