Missing counts in ppc_bars_yrep_data()

Open famuvie opened this issue 4 years ago • 0 comments

Hi,

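I think I've found a bug that produces incorrect posterior predictive counts for discrete outcomes when a value does not occur in every replicate: replicates with zero occurrences of that value are dropped instead of being counted as 0. A minimal reproducible example follows.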

Thanks for such an excellent package. I hope this helps. ƒacu.-

library(bayesplot)
#> This is bayesplot version 1.8.0
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting
library(tidyverse)

y <- c(1, 1, rep(0, 3))
yrep <- matrix(c(y, rep(0, 9*5)), ncol = 5, byrow = TRUE)

ppc_bars(y, yrep)


## In the plot, the yrep median count at value 1 is 2,
## whereas it should be 0:
n_ones_rep <- rowSums(yrep)
(median(n_ones_rep))
#> [1] 0

## The issue arises within bayesplot:::ppc_bars_yrep_data(),
## which does something like this:
yrep_tidy <- 
  yrep %>% 
  as_tibble(
    .name_repair = ~paste(
      "y", seq.int(length(.)),
      sep = "_"
    )
  ) %>% 
  rowid_to_column(var = "rep") %>% 
  pivot_longer(
    cols = -rep,
    names_to = "obs",
    names_prefix = "y_",
    values_to = "value"
  )

yrep_tidy %>% 
  count(value, rep) %>%
  group_by(value) %>%
  summarise(mn = median(n))
#> # A tibble: 2 x 2
#>   value    mn
#>   <dbl> <dbl>
#> 1     0     5
#> 2     1     2

## whereas it should fill in the missing value/rep
## combinations with a count of 0 before summarising:
yrep_tidy %>% 
  count(value, rep) %>%
  complete(value, rep, fill = list(n = 0)) %>% 
  group_by(value) %>%
  summarise(mn = median(n))
#> # A tibble: 2 x 2
#>   value    mn
#>   <dbl> <dbl>
#> 1     0     5
#> 2     1     0

Created on 2021-04-12 by the reprex package (v0.3.0.9001)
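
As a cross-check that does not depend on the tidyverse, here is a minimal base R sketch of the same zero-filled count. It is only an illustration, not how bayesplot implements anything internally; the names `vals`, `counts` and `med` are mine. A two-way `table()` over a factor with fixed levels keeps the value/replicate combinations that never occur, so they enter the median as 0.

```r
y <- c(1, 1, rep(0, 3))
yrep <- matrix(c(y, rep(0, 9 * 5)), ncol = 5, byrow = TRUE)

# All outcome values seen in either y or yrep
vals <- sort(unique(c(y, yrep)))

# Cross-tabulate value by replicate (row of yrep).
# Using a factor with fixed levels keeps empty cells as 0.
counts <- table(
  value = factor(yrep, levels = vals),
  rep   = row(yrep)
)

# Median count per value across replicates
med <- apply(counts, 1, median)
med
```

With the example data, value 1 occurs twice in the first replicate and never in the other nine, so its median count comes out as 0, in agreement with the `complete()` version above.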

famuvie, Apr 12 '21 12:04