BUG: `groupby()` and `max()` lead to inconsistencies with Pandas

Open asddfl opened this issue 1 month ago • 0 comments

Modin version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest released version of Modin.

  • [x] I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

import os

# MODIN_ENGINE must be set before modin.pandas is imported.
os.environ["MODIN_ENGINE"] = "ray"

import modin.pandas as md
import pandas as pd

pd_t1 = pd.DataFrame(
    {
        'c0': [0],
        'c1': [1]
    }
)
md_t1 = md.DataFrame(
    {
        'c0': [0],
        'c1': [1]
    }
)

print("Pandas:")
result = pd_t1.assign().groupby(['c0', 'c1'])['c0'].max()
print(result)

print("Modin ray:")
result = md_t1.assign().groupby(['c0', 'c1'])['c0'].max()
print(result)

Output:

Pandas:
c0  c1
0   1     0
Name: c0, dtype: int64
Modin ray:
Series([], Name: (0, 1), dtype: float64)

Issue Description

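Grouping by ['c0', 'c1'], selecting 'c0' (one of the grouping keys), and calling max() gives different results in Modin and pandas. pandas returns a one-row Series named c0 with dtype int64, indexed by (c0, c1). Modin returns an empty Series named (0, 1) with dtype float64. The same discrepancy can occur with both the Ray and Dask engines.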

Expected Behavior

Modin should return the same result as pandas.
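To make the mismatch easier to check, here is a minimal sketch that compares the properties that differ in the output above (name, dtype, length, index names). The helper names `grouped_max`, `summarize`, and `diff_against_pandas` are my own and are not part of Modin or pandas. The sketch only relies on attributes that both libraries expose on a Series.

```python
import pandas as pd


def grouped_max(df):
    """Run the operation from the report on a pandas-compatible DataFrame."""
    return df.groupby(["c0", "c1"])["c0"].max()


def summarize(series):
    """Collect the properties that differ between the two reported outputs."""
    return {
        "name": series.name,
        "dtype": str(series.dtype),
        "length": len(series),
        "index_names": list(series.index.names),
    }


def diff_against_pandas(frame, data):
    """Return {property: (pandas_value, other_value)} for every mismatch.

    `frame` is a DataFrame from another backend built from the same `data`.
    """
    expected = summarize(grouped_max(pd.DataFrame(data)))
    actual = summarize(grouped_max(frame))
    return {
        key: (expected[key], actual[key])
        for key in expected
        if expected[key] != actual[key]
    }


data = {"c0": [0], "c1": [1]}

# Reference summary taken from pandas itself.
expected_summary = summarize(grouped_max(pd.DataFrame(data)))
print(expected_summary)

# Sanity check: pandas compared with itself reports no differences.
self_diff = diff_against_pandas(pd.DataFrame(data), data)
print(self_diff)
```

With Modin installed, passing `md.DataFrame(data)` as `frame` should return an empty dict once the behavior matches pandas. Based on the output above, I would expect it to currently list `name`, `dtype`, `length`, and probably `index_names`.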

Error Logs


Installed Versions

INSTALLED VERSIONS

commit : bce3707443d525cdca5c50f9c5e65f2f82fcc882
python : 3.10.19
python-bits : 64
OS : Linux
OS-release : 6.14.0-35-generic
Version : #35~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Oct 14 13:55:17 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

Modin dependencies

modin : 0.37.1
ray : 2.51.1
dask : 2025.11.0
distributed : 2025.11.0

pandas dependencies

pandas : 2.3.3
numpy : 1.26.4
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 25.3
Cython : None
sphinx : None
IPython : 8.27.0
adbc-driver-postgresql : None
adbc-driver-sqlite : None
bs4 : 4.14.2
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2025.10.0
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : None
matplotlib : 3.10.7
numba : 0.61.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 22.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.3
sqlalchemy : 2.0.44
tables : None
tabulate : 0.9.0
xarray : 2025.6.1
xlrd : None
xlsxwriter : None
zstandard : 0.25.0
tzdata : 2025.2
qtpy : None
pyqt5 : None

asddfl · Dec 02 '25 16:12