gtsummary
Feature request: tbl_svysummary reports p.std.error in percentage instead of proportion
Currently, tbl_svysummary() offers several statistics for categorical variables, including these two:
- p: percentage
- p.std.error: standard error of the sample proportion computed with [survey::svymean()]
In a table, it makes more sense for these two to be on the same scale, so I think p.std.error should be the "standard error of the sample percentage".
I tried multiplying by 100 inside the glue string, but that does not work:
library(gtsummary)
tbl_svysummary_ex1 <-
  survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) %>%
  tbl_svysummary(
    include = c(Class),
    statistic = list(all_categorical() ~ "{p} ({p.std.error})")
  )
tbl_svysummary_ex2 <-
  survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) %>%
  tbl_svysummary(
    include = c(Class),
    statistic = list(all_categorical() ~ "{p} ({p.std.error*100})")
  )
#> Error in `mutate()`:
#> ℹ In argument: `tbl_stats = pmap(...)`.
#> Caused by error in `pmap()`:
#> ℹ In index: 1.
#> Caused by error in `value[[3L]]()`:
#> ! There was an error assembling the summary statistics for 'Class'
#> with summary type 'categorical'.
#>
#> There are 2 common sources for this error.
#> 1. You have requested summary statistics meant for continuous
#> variables for a variable being as summarized as categorical.
#> To change the summary type to continuous, add the argument
#> `type = list(Class ~ 'continuous')`
#> 2. One of the functions or statistics from the `statistic=` argument is not valid.
#> Backtrace:
#> ▆
#> 1. ├─... %>% ...
#> 2. ├─gtsummary::tbl_svysummary(...)
#> 3. │ └─... %>% ...
#> 4. ├─dplyr::select(., "variable", "var_type", "var_label", everything())
#> 5. ├─tidyr::unnest(., "tbl_stats")
#> 6. ├─dplyr::select(., var_type = "summary_type", "var_label", "tbl_stats")
#> 7. ├─dplyr::mutate(...)
#> 8. ├─dplyr:::mutate.data.frame(...)
#> 9. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#> 10. │ ├─base::withCallingHandlers(...)
#> 11. │ └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#> 12. │ └─mask$eval_all_mutate(quo)
#> 13. │ └─dplyr (local) eval()
#> 14. ├─purrr::pmap(...)
#> 15. │ └─purrr:::pmap_("list", .l, .f, ..., .progress = .progress)
#> 16. │ ├─purrr:::with_indexed_errors(...)
#> 17. │ │ └─base::withCallingHandlers(...)
#> 18. │ ├─purrr:::call_with_cleanup(...)
#> 19. │ └─gtsummary (local) .f(...)
#> 20. │ └─gtsummary:::df_stats_to_tbl(...)
#> 21. │ └─base::tryCatch(...)
#> 22. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#> 23. │ └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#> 24. │ └─value[[3L]](cond)
#> 25. │ └─base::stop(...)
#> 26. └─base::.handleSimpleError(...)
#> 27. └─purrr (local) h(simpleError(msg, call))
#> 28. └─cli::cli_abort(...)
#> 29. └─rlang::abort(...)
Created on 2023-07-16 with reprex v2.0.2
Session info
sessionInfo()
#> R version 4.3.1 (2023-06-16 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 11 x64 (build 22621)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: America/New_York
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] gtsummary_1.7.2
#>
#> loaded via a namespace (and not attached):
#> [1] Matrix_1.5-4.1 dplyr_1.1.2 compiler_4.3.1
#> [4] reprex_2.0.2 tidyselect_1.2.0 xml2_1.3.4
#> [7] stringr_1.5.0 survey_4.2-1 tidyr_1.3.0
#> [10] splines_4.3.1 broom.helpers_1.13.0 yaml_2.3.7
#> [13] fastmap_1.1.1 lattice_0.21-8 R6_2.5.1
#> [16] generics_0.1.3 knitr_1.42 forcats_1.0.0
#> [19] tibble_3.2.1 DBI_1.1.3 R.cache_0.16.0
#> [22] pillar_1.9.0 R.utils_2.12.2 rlang_1.1.1
#> [25] utf8_1.2.3 stringi_1.7.12 xfun_0.39
#> [28] fs_1.6.2 cli_3.6.1 withr_2.5.0
#> [31] magrittr_2.0.3 grid_4.3.1 digest_0.6.31
#> [34] rstudioapi_0.14 lifecycle_1.0.3 R.methodsS3_1.8.2
#> [37] R.oo_1.25.0 vctrs_0.6.2 evaluate_0.21
#> [40] glue_1.6.2 styler_1.10.1 mitools_2.4
#> [43] survival_3.5-5 gt_0.9.0 fansi_1.0.4
#> [46] rmarkdown_2.21 purrr_1.0.1 tools_4.3.1
#> [49] pkgconfig_2.0.3 htmltools_0.5.5
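The scale mismatch described in this post can be checked outside of gtsummary. The following is a minimal sketch, assuming the survey package is installed; it reuses the design from the reprex and calls survey::svymean() directly. The object names (titanic_df, design, result) are illustrative and not part of either package.

```r
library(survey)

# Same design as in the reprex: Titanic cell counts used as weights.
titanic_df <- as.data.frame(Titanic)
design <- svydesign(~1, data = titanic_df, weights = ~Freq)

# svymean() on a factor returns one proportion per level with its standard error.
est <- svymean(~Class, design)
prop <- coef(est)    # proportions, on the 0-1 scale
se_prop <- SE(est)   # standard errors, also on the 0-1 scale

# Multiplying both by 100 puts the estimate and its standard error
# on the percentage scale.
result <- data.frame(
  level = levels(titanic_df$Class),
  percent = 100 * as.numeric(prop),
  percent_se = 100 * as.numeric(se_prop)
)
print(result)
```

This only shows where the two numbers come from; it does not change how tbl_svysummary() formats them.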
Can you link to published examples using this suggestion? Thanks
On Sun, Jul 16, 2023, 9:37 AM Stephanie Zimmer @.***> wrote: [quoted copy of the original post omitted]
Both are computed as proportions. Maybe we should update the documentation to avoid any confusion.
By default, p is styled with style_percent(), which is why it is displayed as a percentage.
When you customize the displayed statistics, you should also update digits with the appropriate formatter.
Regards
On Sun, Jul 16, 2023 at 19:45, Daniel Sjoberg @.***> wrote: [quoted copy of the earlier messages omitted]
-- Joseph Larmarange
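To make the point about formatters concrete, here is a minimal sketch of how the two gtsummary styling functions treat a value on the proportion scale. The numbers are illustrative values chosen to resemble the first row of the Titanic example, and the variable names are hypothetical.

```r
library(gtsummary)

# Illustrative values, both on the proportion (0-1) scale.
p <- 0.1477
p_std_error <- 0.0943

# style_percent() multiplies by 100 before rounding.
pct_label <- style_percent(p)

# style_number() keeps the original scale unless `scale` is supplied.
se_raw_label <- style_number(p_std_error, digits = 2)
se_pct_label <- style_number(p_std_error, scale = 100, digits = 2)

c(pct_label, se_raw_label, se_pct_label)
```

Passing a formatter with scale = 100 through the digits argument is therefore one way to show the standard error on the same scale as p.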
Hi @szimmer,
Re-reading this, I think I misread it the first time. If you'd like to change the formatting of the percent standard error, you can use the digits argument to change the scaling and rounding. Example below!
library(gtsummary)
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) %>%
  tbl_svysummary(
    include = c(Class),
    statistic = list(all_categorical() ~ "{p} ({p.std.error})"),
    # one formatter per statistic: 0 digits for p, then p.std.error rescaled by 100
    digits = all_categorical() ~ list(0, \(x) style_number(x, scale = 100, digits = 2))
  ) %>%
  as_kable() # convert to kable to display on GH
| Characteristic | N = 2,201 |
|---|---|
| Class | |
| 1st | 15 (9.43) |
| 2nd | 13 (8.63) |
| 3rd | 32 (17.01) |
| Crew | 40 (21.27) |
Created on 2023-10-08 with reprex v2.0.2
Hi @szimmer @larmarange,
I agree with @larmarange that a bit more documentation would be the way to go. When we start scaling variances and standard errors, we need to be more careful, and I would prefer to leave that to the user. This is how they are currently documented.
If you'd like, please submit a pull request with the proposed update. Thanks!
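As a small illustration of why scaling needs care: multiplying an estimate by a constant multiplies its standard deviation (and standard error) by that constant, but multiplies its variance by the constant squared. The sketch below uses made-up numbers in base R; p_hat is a hypothetical vector of proportion estimates, not output from gtsummary or survey.

```r
# Hypothetical proportion estimates, used only to illustrate scaling.
p_hat <- c(0.12, 0.15, 0.17, 0.14, 0.16)

sd_prop <- sd(p_hat)         # spread on the proportion (0-1) scale
sd_pct <- sd(100 * p_hat)    # spread on the percentage (0-100) scale

# Standard deviations scale linearly with the constant ...
sd_scales_linearly <- isTRUE(all.equal(sd_pct, 100 * sd_prop))

# ... while variances scale with its square.
var_scales_quadratically <- isTRUE(all.equal(var(100 * p_hat), 100^2 * var(p_hat)))

c(sd_scales_linearly, var_scales_quadratically)
```

So a formatter with scale = 100 is appropriate for a standard error, but the same factor would be wrong for a variance.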