`next()` on GroupBy raises `AttributeError` object has no attribute `_current_index`

Open cmdlineluser opened this issue 2 years ago • 3 comments

Checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

next(pl.DataFrame().group_by(1))
# AttributeError: 'GroupBy' object has no attribute '_current_index'

Log output

No response

Issue description

I'm not sure whether next() is intended to work here. If it isn't, it seems like it should raise a TypeError instead.

Expected behavior

"Work", or raise a TypeError?

next([])
# TypeError: 'list' object is not an iterator

Installed versions

--------Version info---------
Polars:               0.19.19
Index type:           UInt32
Platform:             macOS-13.6.1-arm64-arm-64bit
Python:               3.11.6 (main, Nov  2 2023, 04:39:40) [Clang 14.0.0 (clang-1400.0.29.202)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          <not installed>
connectorx:           <not installed>
deltalake:            <not installed>
fsspec:               2023.6.0
gevent:               <not installed>
matplotlib:           <not installed>
numpy:                1.26.2
openpyxl:             <not installed>
pandas:               2.0.3
pyarrow:              12.0.1
pydantic:             <not installed>
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>

cmdlineluser, Dec 02 '23 21:12

https://github.com/pola-rs/polars/blob/main/py-polars/polars/dataframe/group_by.py

"just" needs this treatment

deanm0000, Jan 10 '24 14:01

Yeah, the machinery is there I think.

_current_index is created inside __iter__

https://github.com/pola-rs/polars/blob/e9a95b74cee01f533b90bdf72ac8a021d0d3fcc3/py-polars/polars/dataframe/group_by.py#L114

So it works if you manually call iter()

df = pl.DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})

next(iter(df.group_by("a")))
# (1,
#  shape: (2, 2)
#  ┌─────┬─────┐
#  │ a   ┆ b   │
#  │ --- ┆ --- │
#  │ i64 ┆ i64 │
#  ╞═════╪═════╡
#  │ 1   ┆ 3   │
#  │ 1   ┆ 4   │
#  └─────┴─────┘)

cmdlineluser, Jan 10 '24 14:01

I think it needs to be created in __init__; otherwise __next__ needs to check whether it exists and create it if it doesn't.
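
A minimal sketch of the first option, using a toy stand-in class rather than the actual Polars code. The name ToyGroupBy and the list-of-tuples group storage are assumptions made for illustration; only the _current_index attribute name comes from the traceback above.

```python
class ToyGroupBy:
    """Toy stand-in for a group-by object (hypothetical, not the Polars class)."""

    def __init__(self, groups):
        # Assumption: groups are stored as a list of (key, rows) tuples.
        self._groups = list(groups)
        # Create the cursor up front so __next__ never hits a missing attribute.
        self._current_index = 0

    def __iter__(self):
        # Restart from the first group every time iteration begins.
        self._current_index = 0
        return self

    def __next__(self):
        if self._current_index >= len(self._groups):
            raise StopIteration
        item = self._groups[self._current_index]
        self._current_index += 1
        return item


groups = ToyGroupBy([("a", [1, 2]), ("b", [3])])
first = next(groups)  # works without calling iter() first
```

With this layout, next() on an empty object raises StopIteration instead of AttributeError.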

deanm0000, Jan 10 '24 15:01

I can try to fix it

s-bidowaniec, May 04 '24 18:05

This should raise a TypeError.

stinodego, May 26 '24 17:05
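
One way to get the TypeError behavior, sketched with a toy class and not taken from the Polars source: make the object iterable only, so that __iter__ returns a separate iterator and the class defines no __next__. The name IterableOnlyGroupBy and its storage are hypothetical.

```python
class IterableOnlyGroupBy:
    """Toy iterable-only object (hypothetical, not the Polars class)."""

    def __init__(self, groups):
        # Assumption: groups are stored as a list of (key, rows) tuples.
        self._groups = list(groups)

    def __iter__(self):
        # Hand out a fresh iterator; the object itself keeps no cursor state.
        return iter(self._groups)


groups = IterableOnlyGroupBy([("a", [1, 2]), ("b", [3])])
first = next(iter(groups))  # explicit iter() still works

try:
    next(groups)  # no __next__ defined, so Python rejects this
except TypeError as exc:
    message = str(exc)
```

This matches how next([]) behaves for a list: the built-in refuses any object that does not define __next__.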