pyjanitor
Make clean_names() compatible with polars and geopandas dataframes
Brief Description
I would like to propose that the clean_names() method work with polars and geopandas dataframes. I work in the climate domain and come from R. Polars has become a default choice for working with very large tabular climate datasets thanks to its implicit parallelism and efficient memory management, and geopandas is the go-to for working with GIS data. It would be great to see clean_names() be compatible with both.
Example API
In polars, it would look like:
polars.clean_names()
In geopandas, it would look like:
gdf.clean_names()
@ericmjl @pyjanitor-devs/core-devs thoughts? +1 from me if we can extend pyjanitor to cover polars.
We should do it! Although the only thing I'm not sure of is how to extend Polars. Is there a guide?
Will admit I haven't had the time to do this because of a lack of need to use polars, but I know the dataframe user base will want polars integration for sure. If we make it work for clean_names, it should just work for the rest of the functions!
We can start incrementally. @3SMMZRjWgS would you like to submit a PR that extends clean_names to polars and geopandas?
Polars page on API extensions: https://docs.pola.rs/py-polars/html/reference/api.html
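For reference, here is a rough sketch of what the namespace route from that page could look like. It uses `pl.api.register_dataframe_namespace`, which is the registration hook documented there. The namespace name `names_sketch` and the `snake_case` helper are made up for illustration and are not pyjanitor's actual implementation. The helper is ASCII-only, and polars is imported optionally so the snippet still runs without it.

```python
import re

try:
    import polars as pl
except ImportError:  # polars is optional for this sketch
    pl = None


def snake_case(name: str) -> str:
    """Hypothetical helper: lowercase, then collapse runs of
    non-alphanumeric ASCII characters into a single underscore."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


if pl is not None:

    # "names_sketch" is an assumed namespace name, chosen to avoid
    # clashing with anything a real library might register.
    @pl.api.register_dataframe_namespace("names_sketch")
    class NamesSketch:
        def __init__(self, df: "pl.DataFrame") -> None:
            self._df = df

        def clean_names(self) -> "pl.DataFrame":
            # Build an old-name -> new-name mapping and reuse DataFrame.rename.
            return self._df.rename({c: snake_case(c) for c in self._df.columns})

    df = pl.DataFrame({"First Name": ["a"], "Total Amount ($)": [1.0]})
    print(df.names_sketch.clean_names().columns)
```

With this route the call lives under a namespace (`df.names_sketch.clean_names()`), not directly on the dataframe, which is one reason a plain function may feel more natural to polars users.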
@3SMMZRjWgS I implemented make_clean_names, which can be used within existing polars functions. Have a look at PR #1351. @ericmjl I chose to use functions instead of creating a method on the polars namespace. This way users can pass the function to existing polars functions (the polars chaining options are already pretty extensive), so there should be nothing new for polars users to learn: just plug the function into the appropriate polars function and voila. Thoughts?
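To illustrate the function-based idea in general terms, here is a minimal sketch. `clean_name_mapping` is a hypothetical stand-in, not the `make_clean_names` from the PR, whose signature and cleaning rules may differ. It only shows how a plain function can feed an existing polars method (`DataFrame.rename` with a mapping). polars is imported optionally so the snippet also runs without it.

```python
import re
from typing import Dict, Iterable


def clean_name_mapping(columns: Iterable[str]) -> Dict[str, str]:
    """Hypothetical stand-in: map each column name to a snake_case version."""
    mapping = {}
    for col in columns:
        # Lowercase, then collapse runs of non-alphanumeric ASCII into "_".
        cleaned = re.sub(r"[^0-9a-z]+", "_", col.strip().lower()).strip("_")
        mapping[col] = cleaned
    return mapping


mapping = clean_name_mapping(["Station ID", "Max Temp (C)"])
print(mapping)

try:
    import polars as pl
except ImportError:  # polars is optional for this sketch
    pl = None

if pl is not None:
    df = pl.DataFrame({"Station ID": [1], "Max Temp (C)": [31.5]})
    # No new method to learn: the mapping goes straight into DataFrame.rename.
    cleaned = df.rename(clean_name_mapping(df.columns))
    print(cleaned.columns)
```

Because the helper returns an ordinary dict, the same mapping should also work with pandas or geopandas via `df.rename(columns=mapping)`.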
Thank you so much for making it happen @samukweku and @ericmjl! A function would do 👍. Looking forward to it being pushed to conda-forge.