polars
`Series[list].explode()` should not return `None` for empty lists
Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
```python
import polars as pl
s1 = pl.Series([[1], [], [2]])
s2 = pl.Series([[1], [None], [2]])
s1.explode().equals(s2.explode())
# True
s1.explode()
# shape: (3,)
# Series: '' [i64]
# [
# 1
# null <-- this item should not be present
# 2
# ]
```
Issue description
An empty list does not contain a None value, so explode() should not insert a None.
Expected behavior
```
shape: (2,)
Series: '' [i64]
[
1
2
]
```
Installed versions
main
I'm actually relying on this behaviour in many places. I don't have a strong preference for how this should work, but I would consider this a breaking change.
I believe this was discussed earlier. Initially explode did not insert a null, and it was requested that it do so, to make the behavior similar to pandas. I would like to hear what other implementations do first.
We've discussed this and decided that we want to do a breaking change for this. It makes our explodes more expensive because we have to insert nulls where there previously were none, and in general it's just not correct. The empty list is a perfectly valid object, and exploding it should give zero rows, not a null row.
It was originally introduced for Pandas explode compatibility, but we feel now that that was a mistake.
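As an analogy only (this is plain Python list flattening, not Polars' implementation), the argument above can be sketched like this: an empty list contributes zero elements, while a list holding `None` contributes exactly one.

```python
from itertools import chain

# Flattening a list of lists: an empty inner list adds nothing to the result.
lists_with_empty = [[1], [], [2]]
flat_empty = list(chain.from_iterable(lists_with_empty))
print(flat_empty)  # [1, 2]

# A list that actually contains None adds one None element.
lists_with_none = [[1], [None], [2]]
flat_none = list(chain.from_iterable(lists_with_none))
print(flat_none)  # [1, None, 2]
```

Under this reading, the two inputs are different and should not explode to equal results.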
To ease the migration burden, I would suggest adding a parameter for this, perhaps even one that is deprecated immediately, so polars-upgrade has something to work with.
Because it's the scariest breaking change in polars yet, at least for me (I use this a lot) 😄
@Object905 You can always get the current behavior of `x.explode()` with `pl.when(x.list.len() == 0).then(pl.lit([None])).otherwise(x).explode()`. In fact, you can start transitioning now, because that expression already produces the same result today.
If you want the new behavior today you can use `x.filter(x.list.len() > 0).explode()`.
Note that an option to resolve this was added in #25289.