great-tables icon indicating copy to clipboard operation
great-tables copied to clipboard

Value based styling for pivoted tables?

Open igorcalabria opened this issue 1 year ago • 6 comments

Question

What would you like to know?

Hi, not sure if I've missed something but is there an easy way to apply conditional styles (https://posit-dev.github.io/great-tables/get-started/basic-styling.html) to pivoted tables (ie. Dates as columns)?

I know it's possible to pass a polars column selector in the location body, but I haven´t found a proper way to select rows per column automatically.

Something like this

 tab_style(
        style=style.text(color="white"),
        locations=loc.body(
            columns=cs.numeric(),
            rows=pl.col("each-column-from-the-selector").eq(0)
        )
    )

The alternative is applying styles on individual columns but that's a bit cumbersome when columns are dynamic

igorcalabria avatar Jul 01 '24 12:07 igorcalabria

@igorcalabria If I understand correctly, your idea should work as expected.

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

(
    GT(pl.from_pandas(gtcars).head())
    .tab_style(
        style=style.text(color="red"),
        locations=loc.body(
            columns=cs.numeric(),
            rows=pl.col("year").eq(2017.0)
        )
    )
)

image

jrycw avatar Jul 01 '24 18:07 jrycw

Oh, I think I understand what you're looking for now. Maybe something like this:

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head()
gt = GT(df_mini)


for column in df_mini.select(cs.numeric()).columns:
    gt = gt.tab_style(
        style=style.text(color="red"),
        locations=loc.body(columns=column, rows=pl.col(column).gt(600)),
    )

gt

image

jrycw avatar Jul 01 '24 19:07 jrycw

Yeah that's pretty much it. A closer example is something like this

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head().pivot(on="year", index=["trim", "mfr"], values="hp")
gt = GT(df_mini,rowname_col="trim", groupname_col="mfr")
for dt in df_mini.select(cs.numeric()).columns:
    gt = gt.tab_style(
        style=style.text(color="red"),
        locations=loc.body(columns=dt, rows=pl.col(dt).gt(600))
    )
gt

image

My question was if there's a more idiomatic way of doing it. I feel like this kind of pivots on dates are super common (there's one in the examples page).

From a user's point of view, I think the most natural API would be to simply style stuff based based on cell values and not the entire series:

    gt = gt.tab_style(
        style=style.text(color="red"),
        locations=loc.body(columns=cs.numeric(), rows=pl.col("cell_value").gt(600))
    )

I understand that this may be super awkward to implement across different dataframe implementations and that pl.col("cell-value") doesn't make much sense from polars point of view so It can get confusing too

EDIT:

Maybe

    gt = gt.tab_style(
        style=style.text(color="red"),
       # if rows is a lambda expr, call it for each column that matches the selector
        locations=loc.body(columns=cs.numeric(), rows=lambda col: pl.col(col).gt(600))
    )

igorcalabria avatar Jul 01 '24 19:07 igorcalabria

Thanks for your feedback. Personally, I'm comfortable calling multiple tab_style() to achieve the desired result, as it seems to be the most idiomatic approach I'm aware of.

jrycw avatar Jul 02 '24 00:07 jrycw

Hey thanks for flagging this, and for the useful examples. @rich-iannone and I have talked before about potentially adding a loc.body(mask=...) argument, that could handle the multiple cols selected -> those cols transformed situation.

Something like this...

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head().pivot(on="year", index=["trim", "mfr"], values="hp")
gt = (
    GT(df_mini,rowname_col="trim", groupname_col="mfr")
    .tab_style(
        style=style.text(color="red"),
        locations=loc.body(mask = cs.numeric().gt(600))
    )
)

Would this cover your usecase?

machow avatar Jul 02 '24 12:07 machow

Yes, that's exactly what I needed. It's also way better than my suggestion, makes a lot of sense to use masks for styling and it fits pretty well with polars(and probably other dataframe) APIs

igorcalabria avatar Jul 02 '24 13:07 igorcalabria

FWIW I just answered a stack overflow question asking this and landed on a sliiiightly simpler current solution than what is presented here. Given that style=style.text(color="red") does not change within the for loop, a list comprehension passed to locations simplifies things a tad. It also avoids repeatedly overriding the gt variable

import polars as pl
from great_tables import GT, loc, style
from great_tables.data import gtcars
from polars import selectors as cs

df_mini = pl.from_pandas(gtcars).head().pivot(on="year", index=["trim", "mfr"], values="hp")

(
    GT(df_mini, rowname_col="trim", groupname_col="mfr")
    .tab_style(
        style=style.text(color="red"),
        locations=[
            loc.body(columns=dt, rows=pl.col(dt).gt(600))
            for dt in df_mini.select(cs.numeric()).columns
        ],
    )
)

Image

There is another stack overflow question asking the same thing as well, so some demand from users does look be there. The loc.body(mask=...) solution does look like it would cover both of these SO questions (and is intuitive to me too).

henryharbeck avatar Dec 14 '24 11:12 henryharbeck

@henryharbeck, thanks for mentioning the workaround; it's quite clever.

jrycw avatar Dec 15 '24 06:12 jrycw

@igorcalabria, thanks for bringing this feature request to us, and apologies for the delay in implementing it. The feature has just been merged, and I personally look forward to its release!

jrycw avatar Jan 23 '25 02:01 jrycw