Aggregation of column summary feature requests
Prework
- [x] Read and abide by gt's code of conduct and contributing guidelines.
- [x] Search for duplicates among the existing issues (both open and closed).
Duplicates or related issues
- #382 - add `summary_column()` to match `summary_rows()`
- #632 - specify subset of columns to summarize ACROSS column in a `rowwise()` fashion
- #690 - when calculating a summary of a specific column, reference another column
- #952 - reference multiple columns in a summary - potentially generate a new column
Proposal
Provide an equivalent to `gt::summary_rows()` but for columns (i.e., a `dplyr::rowwise()` operation). This is "possible" today, but not at as low a level as it could be: it relies on a somewhat clunky `gtExtras::gt_duplicate_column()` call to create a duplicate column and modify it in place, rather than a gt-native approach.
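For reference, here is a minimal sketch of the row-wise operation a column summary boils down to, done on the data before it reaches gt. It assumes only dplyr is installed; the column name `row_max` is hypothetical and chosen just for this illustration.

```r
library(dplyr, warn.conflicts = FALSE)

df <- tibble(
  group = c(rep("A", 3), rep("B", 2)),
  a = 1:5,
  b = 5:1,
  c = seq(0.1, 0.5, length.out = 5)
)

# Row-wise maximum across columns a, b and c.
# `row_max` is a hypothetical name used only in this sketch.
with_summary <- df %>%
  rowwise() %>%
  mutate(row_max = max(c_across(a:c))) %>%
  ungroup()

with_summary$row_max
```

The resulting data frame could then be passed to `gt(groupname_col = "group")`, but the summary has to be computed up front, which is the step this proposal would move into gt itself.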
Components working:
- grouped/ungrouped data column `rowwise()` operations
- Basic support for any summarizing function by `name` or `"name"`
- control placement of output column
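Accepting the summarizing function either as a bare name or as a string falls out of base R's `do.call()`, which the function definition in this issue uses. A small base R sketch, where `row_values` is a hypothetical stand-in for one row of the selected columns:

```r
# One row of values, standing in for the selected columns of a single row.
row_values <- c(1, 5, 0.1)

# `fn` supplied as a function object ...
by_object <- do.call(max, list(row_values))

# ... or as a character string naming the function.
by_string <- do.call("max", list(row_values))

# Both forms dispatch to the same function.
identical(by_object, by_string)
```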
Components that will still need to be hashed out:
- Optionally affect summary row outputs (currently ignored)
- `summary_rows()` to optionally reference column summaries
- Convert to proper `gt` internals rather than the `gtExtras::gt_duplicate_column()` hack
- Should it be added to the last column of the table with visual separation, like the `summary_rows()` double line?
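On the visual separation question, one possible approach with gt's existing styling functions is sketched below. This is an assumption about how it might look, not a proposed final design: the table and the `sum_col` column are hypothetical placeholders, and the border weight is an arbitrary choice.

```r
library(gt)

# Hypothetical data with a precomputed summary column named `sum_col`.
tbl <- data.frame(a = 1:3, b = 3:1, sum_col = c(4L, 4L, 4L))

# Draw a double border on the left edge of the summary column's body cells,
# loosely mirroring the double line that separates summary rows.
styled <- tab_style(
  gt(tbl),
  style = cell_borders(sides = "left", style = "double", weight = px(3)),
  locations = cells_body(columns = sum_col)
)
```

A gt-native implementation would presumably apply such a separator automatically and extend it to the column label as well.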
gt_sum_column() function definition
library(gt)
library(gtExtras)
library(dplyr, warn.conflicts = FALSE)

gt_sum_column <- function(gt_object, columns, fn = sum, name = "sum_col",
                          after = dplyr::last_col()) {
  # Row-wise summary helper (defined here but not called below)
  summary_fn <- function(all_df, sum_type) {
    all_df %>%
      rowwise() %>%
      mutate(sum_col = do.call(sum_type, list(c_across({{ columns }})))) %>%
      ungroup() %>%
      pull(sum_col)
  }

  # Resolve the column selection to column names (uses a gt internal)
  res_col_names <- gt:::resolve_cols_c(
    expr = {{ columns }},
    data = gt_object
  )

  # Duplicate the first selected column as a placeholder for the summary
  gt_object <- gtExtras::gt_duplicate_column(
    gt_object,
    column = res_col_names[1],
    after = {{ after }},
    dupe_name = name
  )

  # Overwrite the placeholder with the row-wise summary;
  # `fn` may be a function or the name of a function as a string
  gt_object[["_data"]] <-
    gt_object[["_data"]] %>%
    dplyr::rowwise() %>%
    dplyr::mutate({{ name }} := do.call(fn, list(dplyr::c_across({{ columns }})))) %>%
    dplyr::ungroup()

  gt_object
}
base_gt <- dplyr::tibble(
  group = c(rep("A", 3), rep("B", 2)),
  a = 1:5,
  b = 5:1,
  c = seq(0.1, 0.5, length.out = 5)
) %>%
  gt(groupname_col = "group")

base_gt %>%
  gt_sum_column(c(a:c), fn = max, after = c)

base_gt %>%
  gt_sum_column(c(a:c), fn = min, after = c)

base_gt %>%
  gt_sum_column(c(a:c), fn = "mean", after = c)

base_gt %>%
  gt_sum_column(c(a:c), fn = "sum", after = c)
Created on 2022-06-17 by the reprex package (v2.0.1)