gt
Indicating rows by number is inconsistent on grouped tables
Prework
- [x] Read and agree to the code of conduct and contributing guidelines.
- [x] If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
- [x] For any problems you identify, post a minimal reproducible example so the maintainer can troubleshoot. A reproducible example is:
- [x] Runnable: post enough R code and data so any onlooker can create the error on their own computer.
- [x] Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
- [x] Readable: format your code according to the tidyverse style guide.
Question
What would you like to know?
In a traditional gt table without groups, the literal row number is equivalent to the row "ID". However, when groupname_col is used in gt(), the row IDs still follow the order of the input data frame, even though the rows are now displayed "out of order".
This makes referring to specific rows by number inconsistent between the data as displayed in the table and the data as passed to gt().
I have written one function that "corrects" for this nuance in tab_style_by_grp. Part of the complexity is that the groups are created in the order they are first observed in the data, so unless the grouping column is passed as an already ordered factor, the group order can be unpredictable, on top of the ordering of the rows themselves.
It would be great if there were either a helper function to "correct" for this behavior, or if the row numbering were always consistent with the final output representation rather than the input data frame.
I've included an example below of the inconsistent row number/id.
library(gt)
OkabeIto <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442",
              "#0072B2", "#D55E00", "#CC79A7", "#999999")
scales::show_col(OkabeIto)

row_sty <- function(tab, row){
  OkabeIto <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442",
                "#0072B2", "#D55E00", "#CC79A7", "#999999")
  tab %>%
    tab_style(
      cell_fill(color = OkabeIto[row]),
      locations = cells_body(rows = row)
    )
}
df <- head(mtcars) %>%
  dplyr::mutate(row_id = dplyr::row_number(), .before = 1)
tab1 <- gt(df) %>%
  row_sty(1) %>%
  row_sty(3) %>%
  row_sty(5) %>%
  gtExtras::gtsave_extra("ordered.png", selector = "table")
magick::image_read("ordered.png")

tab2 <- gt(df, groupname_col = "cyl") %>%
  row_sty(1) %>%
  row_sty(3) %>%
  row_sty(5) %>%
  gtExtras::gtsave_extra("scrambled.png", selector = "table")
magick::image_read("scrambled.png")

Created on 2022-05-31 by the reprex package (v2.0.1)
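The mismatch in the grouped table can also be worked out without rendering anything. The sketch below uses base R only and assumes, as described above, that groups are displayed in order of first appearance and that rows keep their input order within each group. The names `display_order` and `displayed_pos` are made up for illustration and are not part of gt.

```r
# Base-R sketch; assumes groups are shown in order of first appearance
# and rows keep their input order within a group (an assumption, not gt's API).
df <- head(mtcars)
grp <- df$cyl

# Input row numbers, listed in the order they would appear in the grouped table.
# order() leaves ties in their original order, so within-group order is preserved.
display_order <- order(match(grp, unique(grp)))

# Rows targeted by row_sty(1), row_sty(3), row_sty(5) refer to the input data...
styled_input_rows <- c(1, 3, 5)

# ...so the positions where the fills would show up in the grouped table are:
displayed_pos <- match(styled_input_rows, display_order)

display_order
displayed_pos
```

Under those assumptions, the third and fifth input rows end up at the bottom of the grouped table, because they belong to groups that are first seen later in the data.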
Reproducible example
- [x] For any problems you identify, post a minimal reproducible example so the maintainer can troubleshoot. A reproducible example is:
- [x] Runnable: post enough R code and data so any onlooker can create the error on their own computer.
- [x] Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
- [x] Readable: format your code according to the tidyverse style guide.
I have written an initial function that "corrects" the index for grouped tables.
get_row_index <- function(gt_object){
  # get various components
  gt_boxhead_df <- gt:::dt_boxhead_get(gt_object)
  groups_var_nm <- gt:::dt_boxhead_get_var_by_type(gt_object, "row_group")
  # use the function argument here, not the global `base_tab`
  gt_internal_df <- gt:::dt_data_get(gt_object)
  ## if grouped then get both the stub and internal df index
  if(length(groups_var_nm) >= 1){
    # ordered levels of the row groups
    gt_row_grps <- gt:::dt_row_groups_get(gt_object)
    # pull the ordered row numbers
    grp_vec_ord <- gt:::dt_stub_df_get(gt_object) %>%
      dplyr::mutate(group_id = factor(group_id, levels = gt_row_grps)) %>%
      dplyr::arrange(group_id) %>%
      dplyr::pull(rownum_i)
    # get the actual row id of the data for gt to target
    row_ids <- gt_internal_df %>%
      dplyr::mutate(row_id = dplyr::row_number()) %>%
      dplyr::slice(grp_vec_ord) %>%
      dplyr::pull(row_id)
  } else {
    row_ids <- seq_len(nrow(gt_internal_df))
  }
  # could be applied at user level or used in other internal fns
  row_ids
}
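For comparison, here is a rough sketch of the same idea that works on the input data frame instead of gt's unexported internals (the `gt:::` calls above). `display_row_index()` is a hypothetical helper, not part of gt or gtExtras. It assumes that multiple grouping columns are combined into a single group per row and that groups are displayed in order of first appearance, which may not match every gt version or factor input.

```r
# Hypothetical helper (not part of gt or gtExtras). Returns input row numbers
# in the order they are assumed to be displayed in a grouped table.
display_row_index <- function(data, group_cols) {
  # ungrouped table: displayed order equals input order
  if (length(group_cols) == 0) {
    return(seq_len(nrow(data)))
  }
  # one key per row, combining all grouping columns
  key_parts <- unname(as.list(data[group_cols]))
  key <- do.call(paste, c(key_parts, sep = "\r"))
  # groups in order of first appearance; ties keep their input order
  order(match(key, unique(key)))
}

# small made-up data set for illustration
df <- data.frame(
  cyl = c(6, 4, 6, 8, 4),
  am  = c(1, 1, 0, 1, 1),
  mpg = c(21.0, 22.8, 21.4, 18.7, 24.4)
)

idx <- display_row_index(df, c("cyl", "am"))
idx
df[idx, ]
```

With an index like this, the n-th displayed row would be targeted by passing `idx[n]` to `cells_body(rows = ...)`, rather than `n` itself.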
Example in use
library(dplyr)
library(gt)
set.seed(37)
df <- mtcars |>
  group_by(cyl, am) |>
  slice_sample(n = 2) |>
  ungroup() |>
  slice(sample(1:12))
base_tab <- df |>
  gt(groupname_col = c("cyl", "am"))
# underlying basic data
tab_data <- base_tab[["_data"]]
# get the full dataframe, properly indexed
index_df <- gtExtras::gt_index(base_tab, 1, as_vector = FALSE)
# get just the row index as a vector
index_vec <- get_row_index(base_tab)
# data is right but order is wrong
all_equal(tab_data, index_df, ignore_row_order = FALSE)
#> [1] "Same row values, but different order"
# data is equivalent
all_equal(tab_data |> slice(index_vec), index_df, ignore_row_order = FALSE)
#> [1] TRUE
Thank you so much for working this out! I do believe some work should go into helping to resolve cells in groups and working with transformed indices.