
Feature Request: tbl_svysummary() support with replicate weight survey design

Open rbcavanaugh opened this issue 2 years ago • 5 comments


tbl_svysummary() returns an error with a replicate-weight survey design. I am not very familiar with how the survey package interfaces with gtsummary; both packages are new to me. Apologies if I've missed something in the documentation. I'm working with a new-to-me script that supposedly ran at one point and no longer does.

Notes: I can get tbl_svysummary() to work on the same dataset without replicate weights (using a standard survey object), and tbl_summary() works on the raw data. The reprex is adapted from the example here: https://r-survey.r-forge.r-project.org/survey/html/svrepdesign.html.

library(survey)
#> Loading required package: grid
#> Loading required package: Matrix
#> Loading required package: survival
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart
library(gtsummary)

data(scd)
# use BRR replicate weights from Levy and Lemeshow
repweights<-2*cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
                    c(0,1,0,1,1,0))

scdrep<-svrepdesign(data=scd, type="BRR", repweights=repweights, combined.weights=FALSE)
#> Warning in svrepdesign.default(data = scd, type = "BRR", repweights =
#> repweights, : No sampling weights provided: equal probability assumed

scdrep |> 
    tbl_svysummary(
        by = ambulance,
        include = c(
            arrests
        )
    )
#> Warning in svrVar(repmeans, scale, rscales, mse = design$mse, coef = rval): 1
#> replicates gave NA results and were discarded.
#> Error in `mutate()`:
#> ! Problem while computing `df_stats = pmap(...)`.
#> Caused by error in `pmap()`:
#> ℹ In index: 1.
#> Caused by error in `set_names()`:
#> ! The size of `nm` (4) must be compatible with the size of `x` (3).

#> Backtrace:
#>      ▆
#>   1. ├─gtsummary::tbl_svysummary(scdrep, by = ambulance, include = c(arrests))
#>   2. │ └─gtsummary:::generate_metadata(...)
#>   3. │   └─meta_data %>% ...
#>   4. ├─dplyr::mutate(...)
#>   5. ├─dplyr:::mutate.data.frame(...)
#>   6. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
#>   7. │   ├─base::withCallingHandlers(...)
#>   8. │   └─mask$eval_all_mutate(quo)
#>   9. ├─purrr::pmap(...)
#>  10. │ └─purrr:::pmap_("list", .l, .f, ..., .progress = .progress)
#>  11. │   ├─purrr:::with_indexed_errors(...)
#>  12. │   │ └─base::withCallingHandlers(...)
#>  13. │   └─gtsummary (local) .f(...)
#>  14. │     └─gtsummary (local) df_stats_function(...)
#>  15. │       └─gtsummary:::summarize_categorical_survey(...)
#>  16. │         └─... %>% ...
#>  17. ├─rlang::set_names(., c("by", "variable_levels", "p", "p.std.error"))
#>  18. └─rlang::abort(message = message)

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31)
#>  os       macOS Monterey 12.6
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2023-01-09
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package       * version date (UTC) lib source
#>  assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.2.0)
#>  broom.helpers   1.11.0  2023-01-06 [1] CRAN (R 4.2.2)
#>  cli             3.5.0   2022-12-20 [1] CRAN (R 4.2.0)
#>  colorspace      2.0-3   2022-02-21 [1] CRAN (R 4.2.0)
#>  DBI             1.1.3   2022-06-18 [1] CRAN (R 4.2.0)
#>  digest          0.6.31  2022-12-11 [1] CRAN (R 4.2.0)
#>  dplyr           1.0.10  2022-09-01 [1] CRAN (R 4.2.0)
#>  ellipsis        0.3.2   2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate        0.19    2022-12-13 [1] CRAN (R 4.2.0)
#>  fansi           1.0.3   2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap         1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
#>  forcats         0.5.2   2022-08-19 [1] CRAN (R 4.2.0)
#>  fs              1.5.2   2021-12-08 [1] CRAN (R 4.2.0)
#>  generics        0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2         3.4.0   2022-11-04 [1] CRAN (R 4.2.0)
#>  glue            1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  gt              0.8.0   2022-11-16 [1] CRAN (R 4.2.0)
#>  gtable          0.3.1   2022-09-01 [1] CRAN (R 4.2.0)
#>  gtsummary     * 1.6.3   2022-12-06 [1] CRAN (R 4.2.0)
#>  highr           0.10    2022-12-22 [1] CRAN (R 4.2.0)
#>  htmltools       0.5.4   2022-12-07 [1] CRAN (R 4.2.0)
#>  knitr           1.41    2022-11-18 [1] CRAN (R 4.2.0)
#>  lattice         0.20-45 2021-09-22 [1] CRAN (R 4.2.2)
#>  lifecycle       1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
#>  magrittr        2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  Matrix        * 1.5-1   2022-09-13 [1] CRAN (R 4.2.2)
#>  mitools         2.4     2019-04-26 [1] CRAN (R 4.2.0)
#>  munsell         0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
#>  pillar          1.8.1   2022-08-19 [1] CRAN (R 4.2.0)
#>  pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr           1.0.0   2022-12-20 [1] CRAN (R 4.2.0)
#>  R6              2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
#>  reprex          2.0.2   2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang           1.0.6   2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown       2.18    2022-11-09 [1] CRAN (R 4.2.0)
#>  rstudioapi      0.14    2022-08-22 [1] CRAN (R 4.2.0)
#>  scales          1.2.1   2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo     1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi         1.7.8   2022-07-11 [1] CRAN (R 4.2.0)
#>  stringr         1.5.0   2022-12-02 [1] CRAN (R 4.2.0)
#>  survey        * 4.1-1   2021-07-19 [1] CRAN (R 4.2.0)
#>  survival      * 3.4-0   2022-08-09 [1] CRAN (R 4.2.2)
#>  tibble          3.1.8   2022-07-22 [1] CRAN (R 4.2.0)
#>  tidyr           1.2.1   2022-09-08 [1] CRAN (R 4.2.0)
#>  tidyselect      1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
#>  utf8            1.2.2   2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs           0.5.1   2022-11-16 [1] CRAN (R 4.2.0)
#>  withr           2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun            0.36    2022-12-21 [1] CRAN (R 4.2.0)
#>  yaml            2.3.6   2022-10-18 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Created on 2023-01-09 with reprex v2.0.2

rbcavanaugh avatar Jan 09 '23 17:01 rbcavanaugh

Looks like this is because, for svrepdesign() objects, the standard-error columns are not named with a period after se, so the following code never matches the condition for "p.std.error". That leaves too few columns for set_names() after the pivot_wider() call.

mutate(
          stat = if_else(
            str_starts(.data$name, paste0("se.", variable)) | str_starts(.data$name, paste0("se.`", variable, "`")),
            "p.std.error",
            "p"
          ),
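
To make the mismatch concrete, here is a minimal sketch that applies the same prefix test to two sets of column names. Both vectors are hypothetical and written by hand for illustration (they are not captured from a real svyby() call), and label_stat() is a made-up helper name, not a gtsummary function.

```r
library(stringr)

variable <- "arrests"

# Hypothetical column names, for illustration only (not real survey output)
standard_names  <- c("arrests1", "arrests2", "se.arrests1", "se.arrests2")
replicate_names <- c("arrests1", "arrests2", "se1", "se2")

# Same test as the excerpt above: a name starting with "se.<variable>"
# is labelled as a standard error, everything else as a proportion
label_stat <- function(nms, variable) {
  is_se <- str_starts(nms, paste0("se.", variable)) |
    str_starts(nms, paste0("se.`", variable, "`"))
  ifelse(is_se, "p.std.error", "p")
}

label_stat(standard_names, variable)
#> [1] "p"           "p"           "p.std.error" "p.std.error"

label_stat(replicate_names, variable)
#> [1] "p" "p" "p" "p"
```

With the second set of names nothing is ever labelled "p.std.error", so a later pivot on that label would produce one column fewer than set_names() expects, which is consistent with the size mismatch (4 vs 3) in the error message.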

rbcavanaugh avatar Jan 10 '23 16:01 rbcavanaugh

That is great investigatory work!

Are you interested in submitting a pull request? @larmarange are you open to an update that would handle svrepdesign() objects as well?

ddsjoberg avatar Jan 10 '23 20:01 ddsjoberg

I'm not sure I understand the pieces going into the code well enough, but I will take a look. I also realized that with svrepdesign() the standard-error columns come back as something like se1 instead of se.variableName, so there's an additional step of re-attaching the variable name as well.
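
A rough sketch of what that re-attaching step might look like, in base R. The helper normalize_se_names() is hypothetical (it is not part of gtsummary or survey), the input names are hand-written, and it assumes that the k-th "se<k>" column corresponds to the k-th estimate column for the variable; that assumption would need checking against real output before relying on it.

```r
# Hypothetical helper, not part of gtsummary or survey
normalize_se_names <- function(nms, variable) {
  # Estimate columns for this variable, e.g. "arrests1", "arrests2"
  est <- nms[startsWith(nms, variable)]
  # Positions of columns named "se1", "se2", ...
  se_idx <- grep("^se[0-9]+$", nms)
  # Assumption: the k-th "se<k>" column belongs to the k-th estimate column
  k <- as.integer(sub("^se", "", nms[se_idx]))
  nms[se_idx] <- paste0("se.", est[k])
  nms
}

normalize_se_names(
  c("ambulance", "arrests1", "arrests2", "se1", "se2"),
  variable = "arrests"
)
#> [1] "ambulance"   "arrests1"    "arrests2"    "se.arrests1" "se.arrests2"
```

After renaming, the columns follow the se.<variable><level> pattern that the existing prefix check looks for.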

rbcavanaugh avatar Jan 10 '23 20:01 rbcavanaugh

> Are you interested in submitting a pull request? @larmarange are you open to an update that would handle svrepdesign() objects as well?

No problem. 😊

larmarange avatar Jan 11 '23 06:01 larmarange

Great, I'll leave this open until someone is inspired to submit a pull request 😝

ddsjoberg avatar Jan 12 '23 22:01 ddsjoberg

Hi @larmarange and @rbcavanaugh ,

I've been reviewing some of the lingering issues and wanted to give an update. To make this work, we'd need a version of each of the functions listed here that works with replicate-weight designs: https://insightsengineering.github.io/cardx/main/reference/index.html#-survey-package

I think this would be a big undertaking and no one has jumped on it yet. I am going to go ahead and close this issue. But if anyone in the future wants to make the effort, please reach out before getting started.

ddsjoberg avatar Jul 11 '24 16:07 ddsjoberg

I do agree. Thanks

larmarange avatar Jul 11 '24 20:07 larmarange