Make clean_names() compatible with polars and geopandas dataframes

Open 3SMMZRjWgS opened this issue 1 year ago • 6 comments

Brief Description

I would like to propose that the clean_names() method work with polars and geopandas dataframes. I work in the climate domain and come from an R background. Polars has become a default choice for working with very large tabular climate datasets thanks to its implicit parallelism and efficient memory management. geopandas is the go-to library for working with GIS data. It would be great to see clean_names() be compatible with both.

Example API

In polars, it would look like: polars.clean_names()

In geopandas, it would look like: gdf.clean_names()

3SMMZRjWgS avatar Mar 05 '24 07:03 3SMMZRjWgS

@ericmjl @pyjanitor-devs/core-devs thoughts? +1 if we can extend pyjanitor to cover polars

samukweku avatar Mar 17 '24 20:03 samukweku

We should do it! Although the only thing I'm not sure of is how to extend Polars. Is there a guide?

Will admit I haven't had the time to do this because of a lack of need to use polars, but I know the dataframe user base will want polars integration for sure. If we make it work for clean_names, it should just work for the rest of the functions!

ericmjl avatar Mar 17 '24 21:03 ericmjl

We can start incrementally. @3SMMZRjWgS would you like to submit a PR that extends clean_names to polars and geopandas?

samukweku avatar Mar 17 '24 22:03 samukweku

Polars page on API extensions: https://docs.pola.rs/py-polars/html/reference/api.html

samukweku avatar Apr 06 '24 10:04 samukweku

@3SMMZRjWgS I implemented make_clean_names, which can be used within existing polars functions; have a look at PR #1351. @ericmjl I chose to use functions instead of creating a method on the polars namespace. Polars' chaining options are already pretty extensive, so users can just pass the function to the appropriate existing polars function, and there should be nothing new for polars users to learn. Thoughts?
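
To illustrate the function-based approach, here is a minimal sketch. The `clean_label` helper is hypothetical and is not the `make_clean_names` implementation from the PR. The polars part assumes a recent polars release in which `DataFrame.rename` accepts a callable, and it is skipped if polars is not installed.

```python
import re


def clean_label(name: str) -> str:
    """Hypothetical helper: lowercase a column label and snake_case it."""
    name = name.strip().lower()
    # Collapse every run of characters outside [0-9a-z] into one underscore.
    name = re.sub(r"[^0-9a-z]+", "_", name)
    return name.strip("_")


columns = ["Station ID", " Mean Temp (°C) ", "precip-mm"]
cleaned = [clean_label(c) for c in columns]
print(cleaned)

try:
    import polars as pl
except ImportError:  # polars is optional for this sketch
    pl = None

if pl is not None:
    df = pl.DataFrame({c: [1] for c in columns})
    # Assumption: this polars version lets rename() take a function
    # that maps each old column name to a new one.
    df = df.rename(clean_label)
    print(df.columns)
```

The point of this design is that the cleaning logic stays a plain function, so it plugs into methods polars users already know. The namespace-method alternative would instead register a class through `pl.api.register_dataframe_namespace`, which is described on the API extensions page linked above.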

samukweku avatar Apr 20 '24 11:04 samukweku

Thank you so much for making it happen, @samukweku and @ericmjl! A function would do 👍. Looking forward to this being pushed to conda-forge.

3SMMZRjWgS avatar Apr 21 '24 19:04 3SMMZRjWgS