buckaroo Add column grouping

Add column grouping

Open paddymul opened this issue 1 year ago • 1 comments

I have tried a couple of times to add column-ordering functionality that orders columns by "interestingness", putting interesting columns to the left. A simple measure of "interestingness": the most boring column is one that contains only NaNs, and the second most boring is one that holds a single constant value.
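A minimal sketch of that ranking, written in plain Python over a dict of column lists so it stays dependency-free; the same idea should carry over to pandas or polars. The names `is_missing`, `boringness`, and `order_by_interestingness` are hypothetical, not part of buckaroo. It also assumes that a column with one distinct value plus some missing entries counts as constant.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def boringness(values):
    """Score a column: 0 = varies, 1 = constant, 2 = entirely missing.

    Assumption: missing entries are ignored when checking for a constant,
    so [1, NaN, 1] is scored as constant.
    """
    present = [v for v in values if not is_missing(v)]
    if not present:
        return 2
    if len(set(present)) == 1:
        return 1
    return 0


def order_by_interestingness(table):
    """Return column names, most interesting first.

    sorted() is stable, so columns with equal scores keep their
    original left-to-right order.
    """
    return sorted(table, key=lambda name: boringness(table[name]))
```

Because the sort is stable, this only pushes boring columns to the right and does not otherwise reshuffle the table.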

I really want to detect columns that always vary together. For example, Citibike data has 4 columns for the start station: start station id, start station name, start station latitude, and start station longitude. Every row with start station id = 2002 will have the exact same start station latitude. Given this, I only want to show one of the 4 columns (preferably start station name) to the left and rank the other 3 columns as boring. I would also like to highlight the related columns to convey this to users.
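One possible way to find such groups, sketched in plain Python: label every row by the order in which its value first appeared in the column, then group columns whose label sequences are identical. This is a stricter test than the one described above, since it requires a one-to-one correspondence between columns (two stations sharing a latitude would split the group). The function names and the sample rows are made up for illustration, and values are assumed to be hashable with no NaNs.

```python
from collections import defaultdict


def partition_signature(values):
    """Label each row by the order in which its value first appeared.

    Two columns get the same signature exactly when their values are in
    one-to-one correspondence across all rows.
    """
    first_seen = {}
    return tuple(first_seen.setdefault(v, len(first_seen)) for v in values)


def group_columns(table):
    """Group column names that always vary together."""
    groups = defaultdict(list)
    for name, values in table.items():
        groups[partition_signature(values)].append(name)
    return list(groups.values())


# Made-up rows shaped like the Citibike example.
trips = {
    "start station id": [2002, 3001, 2002, 4005],
    "start station name": ["A St", "B St", "A St", "C St"],
    "start station latitude": [40.1, 40.2, 40.1, 40.3],
    "tripduration": [300, 300, 450, 600],
}

for group in group_columns(trips):
    print(group)
```

Within each group, one column (say, the name) could be kept on the left and the rest ranked as boring. How to pick that representative is left out of the sketch.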

pasted from the polars discord.

Oct 27 '23 12:10 paddymul

dsds from abstractqqq does some of this. It is tied to polars. https://github.com/abstractqqq/dsds

[Screenshots: dsds-col-grouping-1, dsds-col-grouping2]

Oct 27 '23 12:10 paddymul
