Native support for `polars` dataframes in imbalanced-learn

Open kumar-abhishek opened this issue 1 year ago • 6 comments

Polars is a high-performance DataFrame library for Python, celebrated for its fast data processing capabilities and efficient, concise syntax. Its multi-threaded query engine and strong integration with the Python ecosystem make it an outstanding choice for managing large datasets. Polars has been gaining popularity as a fast and memory-efficient alternative to pandas, especially for big data applications.

While several libraries such as scikit-learn and seaborn have added support for Polars DataFrames, I am not sure whether imbalanced-learn can take Polars DataFrames directly or whether users need to convert them to pandas first (e.g., polars_df.to_pandas()) before applying the sampling methods.

I do see that imblearn depends on polars and that some APIs, like set_output, accept polars as a parameter, but it's unclear whether the imblearn APIs can work directly with polars dataframes.

kumar-abhishek avatar Sep 06 '24 21:09 kumar-abhishek

Native support for polars would be really great!

jamblejoe avatar Dec 12 '24 09:12 jamblejoe

Maybe someone could point out the necessary steps to get at least rudimentary functionality with polars dataframes. I would be willing to investigate implementing those. Thanks!

jamblejoe avatar Dec 12 '24 09:12 jamblejoe

It is on my todo list for 2025

chkoar avatar Dec 12 '24 09:12 chkoar

I second this, native support for polars would be great

niccolopetti avatar Dec 17 '24 01:12 niccolopetti

I'd encourage you to consider https://github.com/narwhals-dev/narwhals

Zethson avatar Jun 24 '25 10:06 Zethson

In case someone else is looking to use a custom sampler: FunctionSampler(..., validate=False) allows any data type, as described in its docs. This allows polars inputs, but obviously won't validate them as it does pandas/numpy inputs.

foster999 avatar Nov 19 '25 16:11 foster999