`Series[list].explode()` should not return `None` for empty lists

Open mcrumiller opened this issue 1 year ago • 5 comments

Checks

  • [X] I have checked that this issue has not already been reported.
  • [X] I have confirmed this bug exists on the latest version of Polars.

Reproducible example

```python
import polars as pl

s1 = pl.Series([[1], [], [2]])
s2 = pl.Series([[1], [None], [2]])

s1.explode().equals(s2.explode())
# True

s1.explode()
# shape: (3,)
# Series: '' [i64]
# [
#         1
#         null  <-- this item should not be present
#         2
# ]
```

Issue description

An empty list does not contain a None value, so explode() should not insert a None.

Expected behavior

shape: (2,)
Series: '' [i64]
[
        1
        2
]

Installed versions

main

mcrumiller avatar Jul 16 '24 12:07 mcrumiller

I'm actually relying on this behaviour in many places. I don't have a strong preference about how this should be, but I would consider this a breaking change.

Object905 avatar Jul 17 '24 18:07 Object905

I believe this was discussed earlier. Initially explode did not insert a null for empty lists, and it was requested that it do so, to make the behavior similar to pandas. I would like to hear what other implementations do first.

ritchie46 avatar Jul 18 '24 09:07 ritchie46

We've discussed this and decided that we want to do a breaking change for this. It makes our explodes more expensive because we have to insert nulls where there previously were none, and in general it's just not correct. The empty list is a perfectly valid object, and exploding it should give zero rows, not a null row.

It was originally introduced for Pandas explode compatibility, but we feel now that that was a mistake.
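To make the two semantics under discussion concrete, here is a minimal pure-Python sketch. It is only a model of the behavior described in this thread, not Polars' implementation, and both function names are hypothetical.

```python
from typing import List, Optional, Sequence


def explode_null_for_empty(rows: Sequence[Sequence[Optional[int]]]) -> List[Optional[int]]:
    """Model of the behavior reported in this issue: an empty list yields one None row."""
    out: List[Optional[int]] = []
    for row in rows:
        if len(row) == 0:
            out.append(None)  # a null is inserted where the input had no element
        else:
            out.extend(row)
    return out


def explode_skip_empty(rows: Sequence[Sequence[Optional[int]]]) -> List[Optional[int]]:
    """Model of the requested behavior: an empty list yields zero rows."""
    out: List[Optional[int]] = []
    for row in rows:
        out.extend(row)  # extending with an empty list adds nothing
    return out


s1 = [[1], [], [2]]
s2 = [[1], [None], [2]]

# Under the first model, the two inputs become indistinguishable after exploding.
print(explode_null_for_empty(s1) == explode_null_for_empty(s2))  # True
# Under the second model, they stay distinct.
print(explode_skip_empty(s1), explode_skip_empty(s2))  # [1, 2] [1, None, 2]
```

The first model has to branch on every row to decide whether to insert a null, which is in line with the extra cost mentioned above.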

orlp avatar Aug 28 '24 14:08 orlp

To ease the migration burden, I would suggest a parameter for this, perhaps even one that is deprecated immediately, so polars-upgrade has something to work with.

Because it's the most scary breaking change in polars yet, at least for me (I use this a lot) 😄

Object905 avatar Aug 28 '24 18:08 Object905

@Object905 You can always get the current behavior of `x.explode()` with `pl.when(x.list.len() == 0).then(pl.lit([None])).otherwise(x).explode()`. In fact, you can start transitioning now because that also has the correct behavior today.

If you want the new behavior today you can use `x.filter(x.list.len() > 0).explode()`.

orlp avatar Aug 28 '24 21:08 orlp

Note that an option to resolve this was added in #25289.

coastalwhite avatar Nov 17 '25 10:11 coastalwhite