report icon indicating copy to clipboard operation
report copied to clipboard

report(stan_glm) produces output twice

Open fschaffner opened this issue 2 years ago • 7 comments

Hi, thanks for creating and maintaining the package!

Reporting stan_glm() models currently produces the same output twice. See this example from your README:

library(report)
library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.3
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())

model <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0)
report(model)
#> We fitted a Bayesian linear model (estimated using MCMC sampling with 4 chains
#> of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt
#> (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean =
#> 0.00, SD = 8.43) distributions. The model's explanatory power is substantial
#> (R2 = 0.81, 95% CI [0.70, 0.89], adj. R2 = 0.79). The model's intercept,
#> corresponding to qsec = 0 and wt = 0, is at 19.74 (95% CI [9.11, 30.04]).
#> Within this model:
#> 
#>   - The effect of qsec (Median = 0.93, 95% CI [0.41, 1.45]) has a 99.98%
#> probability of being positive (> 0), 99.10% of being significant (> 0.30), and
#> 0.22% of being large (> 1.81). The estimation successfully converged (Rhat =
#> 1.000) and the indices are reliable (ESS = 3644)
#>   - The effect of wt (Median = -5.05, 95% CI [-6.06, -4.01]) has a 100.00%
#> probability of being negative (< 0), 100.00% of being significant (< -0.30),
#> and 100.00% of being large (< -1.81). The estimation successfully converged
#> (Rhat = 0.999) and the indices are reliable (ESS = 4002)
#> 
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.30| and |1.81| (corresponding respectively to 0.05 and 0.30 of the
#> outcome's SD). Convergence and stability of the Bayesian sampling has been
#> assessed using R-hat, which should be below 1.01 (Vehtari et al., 2019), and
#> Effective Sample Size (ESS), which should be greater than 1000 (Burkner, 2017).
#> and We fitted a Bayesian linear model (estimated using MCMC sampling with 4
#> chains of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt
#> (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean =
#> 0.00, SD = 15.40) distributions. The model's explanatory power is substantial
#> (R2 = 0.81, 95% CI [0.70, 0.89], adj. R2 = 0.79). The model's intercept,
#> corresponding to qsec = 0 and wt = 0, is at 19.74 (95% CI [9.11, 30.04]).
#> Within this model:
#> 
#>   - The effect of qsec (Median = 0.93, 95% CI [0.41, 1.45]) has a 99.98%
#> probability of being positive (> 0), 99.10% of being significant (> 0.30), and
#> 0.22% of being large (> 1.81). The estimation successfully converged (Rhat =
#> 1.000) and the indices are reliable (ESS = 3644)
#>   - The effect of wt (Median = -5.05, 95% CI [-6.06, -4.01]) has a 100.00%
#> probability of being negative (< 0), 100.00% of being significant (< -0.30),
#> and 100.00% of being large (< -1.81). The estimation successfully converged
#> (Rhat = 0.999) and the indices are reliable (ESS = 4002)
#> 
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.30| and |1.81| (corresponding respectively to 0.05 and 0.30 of the
#> outcome's SD). Convergence and stability of the Bayesian sampling has been
#> assessed using R-hat, which should be below 1.01 (Vehtari et al., 2019), and
#> Effective Sample Size (ESS), which should be greater than 1000 (Burkner, 2017).

Created on 2022-09-09 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Big Sur ... 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Zurich
#>  date     2022-09-09
#>  pandoc   2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  assertthat     0.2.1      2019-03-21 [1] CRAN (R 4.2.0)
#>  base64enc      0.1-3      2015-07-28 [1] CRAN (R 4.2.0)
#>  bayesplot      1.9.0      2022-03-10 [1] CRAN (R 4.2.0)
#>  bayestestR     0.12.1     2022-05-02 [1] CRAN (R 4.2.0)
#>  boot           1.3-28     2021-05-03 [1] CRAN (R 4.2.1)
#>  callr          3.7.2      2022-08-22 [1] CRAN (R 4.2.0)
#>  cli            3.4.0      2022-09-08 [1] CRAN (R 4.2.1)
#>  coda           0.19-4     2020-09-30 [1] CRAN (R 4.2.0)
#>  codetools      0.2-18     2020-11-04 [1] CRAN (R 4.2.1)
#>  colorspace     2.0-3      2022-02-21 [1] CRAN (R 4.2.0)
#>  colourpicker   1.1.1      2021-10-04 [1] CRAN (R 4.2.0)
#>  crayon         1.5.1      2022-03-26 [1] CRAN (R 4.2.0)
#>  crosstalk      1.2.0      2021-11-04 [1] CRAN (R 4.2.0)
#>  datawizard     0.5.1      2022-08-17 [1] CRAN (R 4.2.1)
#>  DBI            1.1.3      2022-06-18 [1] CRAN (R 4.2.0)
#>  digest         0.6.29     2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr          1.0.10     2022-09-01 [1] CRAN (R 4.2.0)
#>  DT             0.24       2022-08-09 [1] CRAN (R 4.2.0)
#>  dygraphs       1.1.1.6    2018-07-11 [1] CRAN (R 4.2.0)
#>  effectsize     0.7.0.5    2022-08-10 [1] CRAN (R 4.2.0)
#>  ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
#>  emmeans        1.8.1-1    2022-09-08 [1] CRAN (R 4.2.1)
#>  estimability   1.4.1      2022-08-05 [1] CRAN (R 4.2.0)
#>  evaluate       0.16       2022-08-09 [1] CRAN (R 4.2.0)
#>  fansi          1.0.3      2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap        1.1.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  fs             1.5.2      2021-12-08 [1] CRAN (R 4.2.0)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2        3.3.6      2022-05-03 [1] CRAN (R 4.2.0)
#>  ggridges       0.5.3      2021-01-08 [1] CRAN (R 4.2.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  gridExtra      2.3        2017-09-09 [1] CRAN (R 4.2.0)
#>  gtable         0.3.1      2022-09-01 [1] CRAN (R 4.2.0)
#>  gtools         3.9.3      2022-07-11 [1] CRAN (R 4.2.0)
#>  highr          0.9        2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools      0.5.3      2022-07-18 [1] CRAN (R 4.2.0)
#>  htmlwidgets    1.5.4      2021-09-08 [1] CRAN (R 4.2.0)
#>  httpuv         1.6.6      2022-09-08 [1] CRAN (R 4.2.1)
#>  igraph         1.3.4      2022-07-19 [1] CRAN (R 4.2.0)
#>  inline         0.3.19     2021-05-31 [1] CRAN (R 4.2.0)
#>  insight        0.18.2.5   2022-08-15 [1] https://easystats.r-universe.dev (R 4.2.1)
#>  knitr          1.40       2022-08-24 [1] CRAN (R 4.2.0)
#>  later          1.3.0      2021-08-18 [1] CRAN (R 4.2.0)
#>  lattice        0.20-45    2021-09-22 [1] CRAN (R 4.2.1)
#>  lifecycle      1.0.2      2022-09-09 [1] CRAN (R 4.2.1)
#>  lme4           1.1-30     2022-07-08 [1] CRAN (R 4.2.0)
#>  loo            2.5.1      2022-03-24 [1] CRAN (R 4.2.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  markdown       1.1        2019-08-07 [1] CRAN (R 4.2.0)
#>  MASS           7.3-58.1   2022-08-03 [1] CRAN (R 4.2.0)
#>  Matrix         1.4-1      2022-03-23 [1] CRAN (R 4.2.1)
#>  matrixStats    0.62.0     2022-04-19 [1] CRAN (R 4.2.0)
#>  mime           0.12       2021-09-28 [1] CRAN (R 4.2.0)
#>  miniUI         0.1.1.1    2018-05-18 [1] CRAN (R 4.2.1)
#>  minqa          1.2.4      2014-10-09 [1] CRAN (R 4.2.0)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  mvtnorm        1.1-3      2021-10-08 [1] CRAN (R 4.2.0)
#>  nlme           3.1-159    2022-08-09 [1] CRAN (R 4.2.0)
#>  nloptr         2.0.3      2022-05-26 [1] CRAN (R 4.2.0)
#>  parameters     0.18.2     2022-08-10 [1] CRAN (R 4.2.0)
#>  performance    0.9.2      2022-08-10 [1] CRAN (R 4.2.0)
#>  pillar         1.8.1      2022-08-19 [1] CRAN (R 4.2.1)
#>  pkgbuild       1.3.1      2021-12-20 [1] CRAN (R 4.2.0)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  plyr           1.8.7      2022-03-24 [1] CRAN (R 4.2.0)
#>  prettyunits    1.1.1      2020-01-24 [1] CRAN (R 4.2.0)
#>  processx       3.7.0      2022-07-07 [1] CRAN (R 4.2.0)
#>  promises       1.2.0.1    2021-02-11 [1] CRAN (R 4.2.0)
#>  ps             1.7.1      2022-06-18 [1] CRAN (R 4.2.0)
#>  purrr          0.3.4      2020-04-17 [1] CRAN (R 4.2.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils        2.12.0     2022-06-28 [1] CRAN (R 4.2.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp         * 1.0.9      2022-07-08 [1] CRAN (R 4.2.0)
#>  RcppParallel   5.1.5      2022-01-05 [1] CRAN (R 4.2.0)
#>  report       * 0.5.5.1    2022-09-09 [1] Github (easystats/report@ae6ad14)
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.2.0)
#>  reshape2       1.4.4      2020-04-09 [1] CRAN (R 4.2.0)
#>  rlang          1.0.5      2022-08-31 [1] CRAN (R 4.2.0)
#>  rmarkdown      2.16       2022-08-24 [1] CRAN (R 4.2.0)
#>  rstan          2.21.7     2022-09-08 [1] CRAN (R 4.2.1)
#>  rstanarm     * 2.21.3     2022-04-09 [1] CRAN (R 4.2.0)
#>  rstantools     2.2.0      2022-04-08 [1] CRAN (R 4.2.0)
#>  rstudioapi     0.14       2022-08-22 [1] CRAN (R 4.2.0)
#>  scales         1.2.1      2022-08-20 [1] CRAN (R 4.2.1)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
#>  shiny          1.7.2      2022-07-19 [1] CRAN (R 4.2.0)
#>  shinyjs        2.1.0      2021-12-23 [1] CRAN (R 4.2.0)
#>  shinystan      2.6.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  shinythemes    1.2.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  StanHeaders    2.21.0-7   2020-12-17 [1] CRAN (R 4.2.0)
#>  stringi        1.7.8      2022-07-11 [1] CRAN (R 4.2.0)
#>  stringr        1.4.1      2022-08-20 [1] CRAN (R 4.2.0)
#>  styler         1.7.0.9001 2022-08-07 [1] Github (r-lib/styler@a3b69e4)
#>  survival       3.4-0      2022-08-09 [1] CRAN (R 4.2.0)
#>  threejs        0.3.3      2020-01-21 [1] CRAN (R 4.2.0)
#>  tibble         3.1.8      2022-07-22 [1] CRAN (R 4.2.0)
#>  tidyselect     1.1.2      2022-02-21 [1] CRAN (R 4.2.0)
#>  utf8           1.2.2      2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs          0.4.1      2022-04-13 [1] CRAN (R 4.2.0)
#>  withr          2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun           0.32       2022-08-10 [1] CRAN (R 4.2.0)
#>  xtable         1.8-4      2019-04-21 [1] CRAN (R 4.2.0)
#>  xts            0.12.1     2020-09-09 [1] CRAN (R 4.2.0)
#>  yaml           2.3.5      2022-02-21 [1] CRAN (R 4.2.0)
#>  zoo            1.8-10     2022-04-15 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

fschaffner avatar Sep 09 '22 13:09 fschaffner

Thanks for the report.

Can you please try with the development version of the package from GitHub and see if the issue is still observed?

I think this has been resolved already by @etiennebacher.

IndrajeetPatil avatar Sep 09 '22 13:09 IndrajeetPatil

I think this has been resolved already by @etiennebacher.

Nope, I can reproduce. The problem actually comes from report_model:

library(report)
library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.3
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())

model <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0)
report_model(model)
#> Bayesian linear model (estimated using MCMC sampling with 4 chains of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean = 0.00, SD = 8.43) distributions
#> Bayesian linear model (estimated using MCMC sampling with 4 chains of 2000 iterations and a warmup of 1000) to predict mpg with qsec and wt (formula: mpg ~ qsec + wt). Priors over parameters were set as normal (mean = 0.00, SD = 15.40) distributions

Created on 2022-09-09 with reprex v2.0.2

etiennebacher avatar Sep 09 '22 14:09 etiennebacher

I see. I thought this was similar to the one fixed in https://github.com/easystats/report/pull/280.

IndrajeetPatil avatar Sep 09 '22 14:09 IndrajeetPatil

Yes, I did try it with the development version and the bug persists. I think @etiennebacher fixed it for glm() models only.

fschaffner avatar Sep 09 '22 14:09 fschaffner

Update: it comes from report_priors(model)

etiennebacher avatar Sep 09 '22 14:09 etiennebacher

These two lines produce two different values for Prior_Scale, which then duplicate each text. I don't know what the text output should be so I can't fix it, but someone else probably can: https://github.com/easystats/report/blob/ae6ad14897083fd242a145457d70186f58b7870f/R/report.stanreg.R#L62-L63

library(report)
library(rstanarm)
#> Loading required package: Rcpp
#> This is rstanarm version 2.21.3
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#>   options(mc.cores = parallel::detectCores())

model <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0)
params <- bayestestR::describe_prior(model)
params <- params[params$Parameter != "(Intercept)", ]
params
#>   Parameter Prior_Distribution Prior_Location Prior_Scale
#> 2      qsec             normal              0    8.431924
#> 3        wt             normal              0   15.399106

Created on 2022-09-09 with reprex v2.0.2

etiennebacher avatar Sep 09 '22 14:09 etiennebacher

I'm also experiencing the same problem with report producing output 3x. Any help will be greatly appreciated.

Thanks, T.

onwutochs avatar Sep 13 '22 23:09 onwutochs

Yes, I was checking to see if there were any Issues open about this. I've got 10 duplicated summaries for one brms model with only 1 fixed effect and two random effects. This is unexpected as previously the report() worked properly and gave a single summary.

edit This behaviour is the same with the version from GitHub main branch as well.

adamhsparks avatar Oct 13 '22 23:10 adamhsparks

I am experiencing the same issue with a simple lm() model. The report() and report_model() functions produce output 3 times.

aroberts394 avatar Oct 26 '22 03:10 aroberts394

Thanks for pinpointing this issue @etiennebacher ! I was hoping to be able to unblock this issue for the report paper since our demo does duplicate the issue. But I've blocked at the same place as you. As I have no experience with Bayesian models, I'm also not sure which line should be kept from:

> params[params$Parameter != "(Intercept)", ]
  Parameter Prior_Distribution Prior_Location Prior_Scale
2      qsec             normal              0    8.431924
3        wt             normal              0   15.399106

It seems this leads to having two models in report_model, with the only difference being the SD.

> report_model(x)
[...] Priors over parameters were set as normal (mean = 0.00, SD = 8.43) distributions
[...] Priors over parameters were set as normal (mean = 0.00, SD = 15.40) distributions

So it's not a TOTAL duplicate of the report output, more like it's trying to report two separate (but almost identical) models (one for each predictor/IV). If you check the original output from report() carefully, it matches the text above.

So I think the way about fixing this issue is to clarify and decide first whether it should report two models, or only one. If one, which one of the two lines above should be kept? If two, then perhaps we can just make sure they are clearly split in separate paragraphs?

Since this is Bayesian stuff, I'm afraid we'll have to ask @DominiqueMakowski for a bit of advice here... 😬

rempsyc avatar Mar 05 '23 22:03 rempsyc

What I don't understand is how does the Prior Scale value translates to standard deviation (SD) in the text? And right now report_text.stanreg is defined as report_text.lm. Do we need to have report_text.lm have its own report.text method to account for this situation? Or can it be resolved more simply?

The critical sentence is

Priors over parameters were set as normal (mean = 0.00, SD = 8.43) distributions.

So a Bayes question would be: normally when priors over parameters are set as normal, is that set for each predictor?

If so should we have the output only once, but report the mean and SD of ALL predictors? Something like:

Priors over parameters were set as normal (qsec mean = 0.00, SD = 8.43; wt mean = 0.00, SD = 15.40) distributions.

rempsyc avatar Mar 05 '23 22:03 rempsyc

If going for the last option, it can be solved with a simple collapse by adding this on line 83: values <- paste0(values, collapse = "; "). Result:

devtools::load_all("c:/github/report")
library(rstanarm)
x <- stan_glm(mpg ~ qsec + wt, data = mtcars, refresh = 0, iter = 500)
report(x)

#> We fitted a Bayesian linear model (estimated using MCMC sampling with 4 chains
#> of 500 iterations and a warmup of 250) to predict mpg with qsec and wt
#> (formula: mpg ~ qsec + wt). Priors over parameters were all set as normal (mean
#> = 0.00, SD = 8.43; mean = 0.00, SD = 15.40) distributions. The model's
#> explanatory power is substantial (R2 = 0.81, 95% CI [0.69, 0.88], adj. R2 =
#> 0.79). The model's intercept, corresponding to qsec = 0 and wt = 0, is at 19.56
#> (95% CI [8.34, 31.33]). Within this model:
#> 
#>   - The effect of qsec (Median = 0.94, 95% CI [0.35, 1.52]) has a 99.90%
#> probability of being positive (> 0), 98.20% of being significant (> 0.30), and
#> 0.20% of being large (> 1.81). The estimation successfully converged (Rhat =
#> 1.002) but the indices are unreliable (ESS = 867)
#>   - The effect of wt (Median = -5.04, 95% CI [-6.01, -4.11]) has a 100.00%
#> probability of being negative (< 0), 100.00% of being significant (< -0.30),
#> and 100.00% of being large (< -1.81). The estimation successfully converged
#> (Rhat = 1.004) but the indices are unreliable (ESS = 992)
#> 
#> Following the Sequential Effect eXistence and sIgnificance Testing (SEXIT)
#> framework, we report the median of the posterior distribution and its 95% CI
#> (Highest Density Interval), along the probability of direction (pd), the
#> probability of significance and the probability of being large. The thresholds
#> beyond which the effect is considered as significant (i.e., non-negligible) and
#> large are |0.30| and |1.81| (corresponding respectively to 0.05 and 0.30 of the
#> outcome's SD). Convergence and stability of the Bayesian sampling has been
#> assessed using R-hat, which should be below 1.01 (Vehtari et al., 2019), and
#> Effective Sample Size (ESS), which should be greater than 1000 (Burkner, 2017).

Created on 2023-03-05 with reprex v2.0.2

Would that be an acceptable solution for now? If so I can push a PR and we can always revisit later when we get confirmation from Dom.

(though I don't really understand why the means would be 0 there but the SDs so huge)

rempsyc avatar Mar 05 '23 22:03 rempsyc