
Expand `List` & `Array` to columns

Open Julian-J-S opened this issue 9 months ago • 0 comments

Description

Converting List & Array columns into separate columns is already possible, but it requires a detour through a struct, which feels awkward and is not an ideal API.

IMO, expanding a List/Array directly would make for a better API and user experience.

Example (current way)

import polars as pl

df = pl.DataFrame({"x": [[1, 2, 3], [4, 5, 6]]})
df
# shape: (2, 1)
# ┌───────────┐
# │ x         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2, 3] │
# │ [4, 5, 6] │
# └───────────┘


df.with_columns(
    pl.col("x").list.to_struct(),
).unnest("x")
# shape: (2, 3)
# ┌─────────┬─────────┬─────────┐
# │ field_0 ┆ field_1 ┆ field_2 │
# │ ---     ┆ ---     ┆ ---     │
# │ i64     ┆ i64     ┆ i64     │
# ╞═════════╪═════════╪═════════╡
# │ 1       ┆ 2       ┆ 3       │
# │ 4       ┆ 5       ┆ 6       │
# └─────────┴─────────┴─────────┘

Solution

expr.expand (not possible with current polars design)

  • the ideal solution imo would be an .expand method in the list/arr namespace
  • afaik this is not currently possible in polars, because an expression can only evaluate to a single column
# proposed API, does not exist in polars
df.with_columns(
    pl.col("x").list.expand(),
)

df.expand

  • like df.unnest but for List/Array instead of Struct
df.expand("x")

Julian-J-S, Apr 25 '24 16:04