[Enh]: Add `list.get()` negative indices and out-of-bounds access

Open skritsotalakis opened this issue 4 months ago • 7 comments

We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?

Looking at modern Polars Chapter 2.2 (https://kevinheavey.github.io/modern-polars/method_chaining.html#:~:text=extract_city_name_pl) for #1945, I noticed that narwhals is missing a method similar to polars expr.list.get() (https://docs.pola.rs/api/python/dev/reference/expressions/api/polars.Expr.list.get.html).

Please describe the purpose of the new feature or describe the problem to solve.

This feature allows users to get values by index in a sublist (expr.list).

Suggest a solution if possible.

No response

If you have tried alternatives, please describe them below.

No response

Additional information that may help us understand your needs.

Example with data from modern Polars:


import polars as pl

data = [
    {"OriginCityName": "Raleigh/Durham, NC", "DestCityName": "Chicago, IL"},
    {"OriginCityName": "Seattle, WA", "DestCityName": "Chicago, IL"},
    {"OriginCityName": "Indianapolis, IN", "DestCityName": "Dallas/Fort Worth, TX"},
    {"OriginCityName": "Sarasota/Bradenton, FL", "DestCityName": "Newark, NJ"},
    {"OriginCityName": "Chicago, IL", "DestCityName": "Moline, IL"},
    {"OriginCityName": "Madison, WI", "DestCityName": "Washington, DC"},
    {"OriginCityName": "West Palm Beach/Palm Beach, FL", "DestCityName": "Charlotte, NC"}
]

df_pl = pl.DataFrame(data)
In [26]: df_pl
Out[26]: 
shape: (7, 2)
┌────────────────────────────────┬───────────────────────┐
│ OriginCityName                 ┆ DestCityName          │
│ ---                            ┆ ---                   │
│ str                            ┆ str                   │
╞════════════════════════════════╪═══════════════════════╡
│ Raleigh/Durham, NC             ┆ Chicago, IL           │
│ Seattle, WA                    ┆ Chicago, IL           │
│ Indianapolis, IN               ┆ Dallas/Fort Worth, TX │
│ Sarasota/Bradenton, FL         ┆ Newark, NJ            │
│ Chicago, IL                    ┆ Moline, IL            │
│ Madison, WI                    ┆ Washington, DC        │
│ West Palm Beach/Palm Beach, FL ┆ Charlotte, NC         │
└────────────────────────────────┴───────────────────────┘
cols = ["OriginCityName", "DestCityName"]
df = df_pl.select(pl.col(cols).str.split(",").list.get(0))
In [28]: print(df)
shape: (7, 2)
┌────────────────────────────┬───────────────────┐
│ OriginCityName             ┆ DestCityName      │
│ ---                        ┆ ---               │
│ str                        ┆ str               │
╞════════════════════════════╪═══════════════════╡
│ Raleigh/Durham             ┆ Chicago           │
│ Seattle                    ┆ Chicago           │
│ Indianapolis               ┆ Dallas/Fort Worth │
│ Sarasota/Bradenton         ┆ Newark            │
│ Chicago                    ┆ Moline            │
│ Madison                    ┆ Washington        │
│ West Palm Beach/Palm Beach ┆ Charlotte         │
└────────────────────────────┴───────────────────┘

skritsotalakis avatar Jul 28 '25 11:07 skritsotalakis

nice one - yeah, looks in-scope!

MarcoGorelli avatar Jul 28 '25 12:07 MarcoGorelli

do you want to try to contribute it?

MarcoGorelli avatar Jul 28 '25 12:07 MarcoGorelli

Yes! I can try to do it as a modern Polars side quest :)

skritsotalakis avatar Jul 28 '25 12:07 skritsotalakis

This would be for Series.list.get as well I assume?

dangotbanned avatar Jul 28 '25 14:07 dangotbanned

This would be for Series.list.get as well I assume?

Yes. I will change the title to reflect this.

skritsotalakis avatar Jul 28 '25 15:07 skritsotalakis

Adding as a reminder.

TODO:

  • [ ] Allow negative indexes (see comment)
  • [ ] [Optional] Handle out-of-bounds indexes - Polars even has an option, null_on_oob, to determine whether it should raise or return null
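
To make the two TODO items concrete, here is a minimal pure-Python sketch of the intended semantics. The function name `list_get` and its signature are hypothetical and only illustrate the behaviour discussed here. It is not the narwhals implementation, and the `null_on_oob=False` default is an assumption modelled on the Polars option mentioned above.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def list_get(
    sublists: Sequence[Optional[Sequence[T]]],
    index: int,
    *,
    null_on_oob: bool = False,  # assumed default: raise on out-of-bounds
) -> list[Optional[T]]:
    """Hypothetical reference semantics for getting one item per sublist."""
    out: list[Optional[T]] = []
    for sub in sublists:
        if sub is None:
            # A null sublist yields a null item.
            out.append(None)
            continue
        size = len(sub)
        # A negative index counts from the end of each sublist.
        pos = index + size if index < 0 else index
        if 0 <= pos < size:
            out.append(sub[pos])
        elif null_on_oob:
            out.append(None)
        else:
            raise IndexError(
                f"index {index} is out of bounds for sublist of length {size}"
            )
    return out


cities = [["Seattle", " WA"], ["Chicago", " IL"], []]
last_items = list_get(cities[:2], -1)
first_or_null = list_get(cities, 0, null_on_oob=True)
```

The key point is that a negative index has to be resolved per row, because each sublist can have a different length.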

FBruzzesi avatar Aug 14 '25 11:08 FBruzzesi

[ ] Allow negative indexes

I think a step towards this could be factoring out the pyarrow handling for this on ArrowSeries

As long as you have the size, it is do-able
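
One possible reading of "as long as you have the size", sketched in pure Python rather than pyarrow: Arrow stores a list column as a flat values buffer plus offsets, so each sublist's size is the difference between consecutive offsets. The helper name `flat_positions` is hypothetical, and how ArrowSeries would actually be refactored is an assumption.

```python
from typing import Optional


def flat_positions(offsets: list[int], index: int) -> list[Optional[int]]:
    """Map a per-sublist index to positions in the flat values buffer.

    `offsets` has one more entry than there are sublists; sublist i spans
    values[offsets[i]:offsets[i + 1]]. Out-of-bounds rows map to None.
    """
    positions: list[Optional[int]] = []
    for start, stop in zip(offsets, offsets[1:]):
        size = stop - start
        # Normalise a negative index using this sublist's size.
        pos = index + size if index < 0 else index
        positions.append(start + pos if 0 <= pos < size else None)
    return positions


# Four sublists: 2 items, 2 items, empty, 3 items.
values = ["Raleigh/Durham", " NC", "Seattle", " WA", "a", "b", "c"]
offsets = [0, 2, 4, 4, 7]
last = [None if p is None else values[p] for p in flat_positions(offsets, -1)]
```

With the positions computed this way, the final lookup is a single gather over the flat values, and out-of-bounds rows come back as null, which corresponds to the `null_on_oob=True` behaviour.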

dangotbanned avatar Aug 14 '25 11:08 dangotbanned