gt icon indicating copy to clipboard operation
gt copied to clipboard

tab_spanner_delim() issue with updated method.

Open hathawayj opened this issue 2 years ago • 2 comments

Prework

  • [x] Read and agree to the code of conduct and contributing guidelines.
  • [x] If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • [x] Post a minimal reproducible example so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    • [x] Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • [x] Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • [x] Readable: format your code according to the tidyverse style guide.

Description

The tab_spanner_delim() change from the February code introduced an error in creating the nested spans using column labels with three separations. I know get an error, shown below, when tab_spanner_delim() executes tab_spanner() in the i,j loop of the spanner_matrix.

Error: The spanner `id` provided (`"MITS.Age"`) is not unique:
* The `id` must be unique across existing spanners
* Provide a unique ID value for this spanner

Reproducible example

  • [x] Post a minimal reproducible example so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    • [x] Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • [x] Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • [x] Readable: format your code according to the tidyverse style guide.
# Setup
library(tidyverse)
library(gt)

percent_digits <- 1

dat_wide <- dat_side <- structure(list(location_name = c("S5: C3, C4, C5", "S6: C1", 
"S7: C6, C7"), year_period = c("2017-2020", "2017-2020", "2017-2020"
), dss = c("(DSS)", "(DSS)", "(DSS)"), `Age.MITS.P-value` = c(0.001, 
0.91954080812264, 0.001), `Sex.MITS.P-value` = c(0.3066971066355, 
0.42106770737801, 0.567973191903632), `Education.MITS.P-value` = c(0.001, 
0.749149327041164, 0.001), `Season.MITS.P-value` = c(0.384513263115074, 
0.491676742761848, 0.0218508574481999), `Location.MITS.P-value` = c(0.001, 
0.001, 0.001), `VA CoD.MITS.P-value` = c(0.001, 0.00435893883701637, 
0.0571001694408351), `Age.Lower respiratory infections.P-value` = c(0.0438072177710105, 
1, 0.70261787250106), `Sex.Lower respiratory infections.P-value` = c(0.607219131310109, 
1, 0.832536994826216), `Education.Lower respiratory infections.P-value` = c(1, 
1, 0.297205882049848), `Season.Lower respiratory infections.P-value` = c(1, 
1, 0.529844740263814), `Location.Lower respiratory infections.P-value` = c(0.0352794743786813, 
1, 0.0852972003753499), `VA CoD.Lower respiratory infections.P-value` = c(1, 
1, 0.0155970592961164), Age.MITS.Missing = c(0, 0, 0), Sex.MITS.Missing = c(34.5654993514916, 
1.63370593293207, 0.481000481000481), Education.MITS.Missing = c(25.0972762645914, 
51.2467755803955, 53.4872534872535), Season.MITS.Missing = c(0.12970168612192, 
0.171969045571797, 0), Location.MITS.Missing = c(10.8949416342412, 
0.859845227858985, 18.1337181337181), `VA CoD.MITS.Missing` = c(32.2997416020672, 
13.3, 11.3868613138686), `Age.Lower respiratory infections.Missing` = c(0, 
0, 0), `Sex.Lower respiratory infections.Missing` = c(0, 1.20481927710843, 
0.808080808080808), `Education.Lower respiratory infections.Missing` = c(65.4545454545455, 
57.8313253012048, 61.8181818181818), `Season.Lower respiratory infections.Missing` = c(0, 
0, 0), `Location.Lower respiratory infections.Missing` = c(0, 
0, 0), `VA CoD.Lower respiratory infections.Missing` = c(8.48484848484848, 
4.81927710843374, 11.1111111111111)), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))


# Old function from https://github.com/rstudio/gt/blob/8a306326cd63de71c7d887dc3706fc0ec1c553c9/R/tab_create_modify.R
# 8a30632
tab_spanner_delim_old <- function(data,
                              delim,
                              columns = everything(),
                              gather = TRUE,
                              split = c("last", "first")) {

  # Perform input object validation
  gt:::stop_if_not_gt(data = data)

  split <- match.arg(split)

  # Get all of the columns in the dataset
  all_cols <- gt:::dt_boxhead_get_vars(data = data)

  # Get the columns supplied in `columns` as a character vector
  columns <-
    gt:::resolve_cols_c(
      expr = {{ columns }},
      data = data
    )

  if (!is.null(columns)) {
    colnames <- base::intersect(all_cols, columns)
  } else {
    colnames <- all_cols
  }

  if (length(colnames) == 0) {
    return(data)
  }

  colnames_has_delim <- grepl(pattern = delim, x = colnames, fixed = TRUE)

  if (any(colnames_has_delim)) {

    colnames_with_delim <- colnames[colnames_has_delim]

    # Perform regexec match where the delimiter is either declared
    # to be the 'first' instance or the 'last' instance
    regexec_m <-
      regexec(
        paste0(
          "^(.*",
          ifelse(split == "first", "?", ""),
          ")\\Q", delim, "\\E(.*)$"
        ),
        colnames_with_delim
      )

    split_colnames <-
      lapply(regmatches(colnames_with_delim, regexec_m), FUN = `[`, 2:3)

    spanners <- vapply(split_colnames, FUN.VALUE = character(1), `[[`, 1)

    spanner_var_list <- split(colnames_with_delim, spanners)

    for (label in names(spanner_var_list)) {

      data <-
        gt::tab_spanner(
          data = data,
          label = label,
          columns = spanner_var_list[[label]],
          gather = gather
        )
    }

    new_labels <-
      lapply(split_colnames, `[[`, -1) %>%
      vapply(paste0, FUN.VALUE = character(1), collapse = delim)

    for (i in seq_along(split_colnames)) {

      new_labels_i <- new_labels[i]
      var_i <- colnames_with_delim[i]

      data <-
        gt:::dt_boxhead_edit(
          data = data,
          var = var_i,
          column_label = new_labels_i
        )
    }
  }

  data
}


# Build the base table
gt_table <- dat_wide %>%
    gt::gt() %>%
    gt::fmt_number(
      columns = dplyr::contains("value"),
      decimals = 3,
      use_seps = FALSE
    ) %>%
    gt::fmt_number(
      columns = dplyr::contains("Missing"),
      decimals = percent_digits,
      use_seps = FALSE
    ) %>%
    # This doesn't work when we join the columns
    # need to figure this part out.
    gt::text_transform(
      locations = gt::cells_body(columns = dplyr::contains("value")),
      fn = function(x) {
        ifelse(x <= .001, "<0.001", x)
      }
    ) %>%
    gt::cols_merge(
      columns = c("location_name", "dss", "year_period"),
      pattern = "<b>{1}</b> <span style='color: gray;'>{2}</span><br><em>{3}</em>"
    )

# old function works as expected
gt_table %>%
    tab_spanner_delim_old(
      delim = ".",
      split = "first"
    )

# new delim function errors
gt_table %>%
    tab_spanner_delim(
      delim = ".",
      split = "first"
    )

# Can execute but not right with split = "last"
gt_table %>%
    tab_spanner_delim(
      delim = ".",
      split = "last"
    )

Expected result

No error. The below screen shot shows the first part of the output using the old function.

Screen Shot 2022-08-09 at 12 27 50 PM

Session info

R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.5

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gt_0.6.0        forcats_0.5.1   stringr_1.4.0   dplyr_1.0.9     purrr_0.3.4     readr_2.1.2     tidyr_1.2.0     tibble_3.1.8    ggplot2_3.3.6   tidyverse_1.3.2

loaded via a namespace (and not attached):
 [1] pillar_1.8.0        compiler_4.1.1      cellranger_1.1.0    dbplyr_2.2.1        tools_4.1.1         digest_0.6.29       checkmate_2.1.0     lubridate_1.8.0    
 [9] jsonlite_1.8.0      googledrive_2.0.0   lifecycle_1.0.1     gargle_1.2.0        gtable_0.3.0        pkgconfig_2.0.3     rlang_1.0.4         reprex_2.0.1       
[17] DBI_1.1.3           cli_3.3.0           haven_2.5.0         fastmap_1.1.0       xml2_1.3.3          withr_2.5.0         httr_1.4.3          sass_0.4.2         
[25] generics_0.1.3      vctrs_0.4.1         fs_1.5.2            hms_1.1.1           googlesheets4_1.0.0 grid_4.1.1          tidyselect_1.1.2    glue_1.6.2         
[33] R6_2.5.1            fansi_1.0.3         readxl_1.4.0        tzdb_0.3.0          modelr_0.1.8        magrittr_2.0.3      htmltools_0.5.3     backports_1.4.1    
[41] scales_1.2.0        ellipsis_0.3.2      rvest_1.0.2         assertthat_0.2.1    colorspace_2.0-3    utf8_1.2.2          stringi_1.7.8       munsell_0.5.0      
[49] broom_1.0.0         crayon_1.5.1   

hathawayj avatar Aug 09 '22 18:08 hathawayj

Thanks for filing this issue and supplying reproducible code! Also, sorry about the problem you're facing. I will work on a fix for this v. soon.

rich-iannone avatar Aug 09 '22 20:08 rich-iannone

I appreciate the quick response. I love the gt grammar.

hathawayj avatar Aug 09 '22 21:08 hathawayj