point_interval is inefficient compared to summarise

Open lukaseamus opened this issue 1 month ago • 0 comments

I routinely run my models with 8 chains and 1e4 iterations per chain. For continuous predictions I then compute a predictive distribution for each predictor value in a sequence of 100–200 values. This is often necessary to represent nonlinear predictions correctly.

I would like to use the point_interval functions to summarise my predictions, but have run into performance issues. Once the number of groups exceeds a certain limit, point_interval crashes R. This is not the case for a summarise call that uses the interval functions and yields exactly the same output.

For example,

prediction %>%
  group_by(group1, group2, group3) %>%
  median_qi(var1, var2, var3, .width = c(.5, .8, .9))

fails where

prediction %>%
  group_by(group1, group2, group3) %>%
  summarise(
    across(
      c(var1, var2, var3),
      list(
        median = median,
        lower_0.5 = ~ qi(.x, .width = .5)[1],
        upper_0.5 = ~ qi(.x, .width = .5)[2],
        lower_0.8 = ~ qi(.x, .width = .8)[1],
        upper_0.8 = ~ qi(.x, .width = .8)[2],
        lower_0.9 = ~ qi(.x, .width = .9)[1],
        upper_0.9 = ~ qi(.x, .width = .9)[2]
      ),
      .names = "{.col}.{.fn}"
    )
  ) %>%
  ungroup() %>%
  rename(var1 = var1.median, var2 = var2.median, var3 = var3.median) %>%
  pivot_longer(cols = contains("lower") | contains("upper")) %>%
  separate(col = name, into = c("name", ".width"), sep = "_(?=[^_]*$)") %>%
  pivot_wider(names_from = name, values_from = value)

doesn't.
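
The comparison can be sketched on simulated data. The data frame below is a hypothetical stand-in for prediction: the group sizes, the number of draws, the draw column, and the rnorm values are all assumptions, not my actual model output, and the sizes are kept small so the sketch runs quickly. The summarise route is reduced to a single width for brevity. Scaling n_values and n_draws up should approach the setup described above.

```r
library(dplyr)
library(tidyr)
library(ggdist)

set.seed(1)

# Hypothetical sizes (assumptions): increase these to approach
# 8 chains x 1e4 iterations over a sequence of 100-200 predictor values.
n_values <- 50   # length of the predictor sequence (group1)
n_draws  <- 200  # draws per group

# Simulated stand-in for `prediction`
prediction <- expand_grid(
  group1 = seq_len(n_values),
  group2 = c("a", "b", "c"),
  group3 = c("x", "y"),
  draw   = seq_len(n_draws)
) %>%
  mutate(var1 = rnorm(n()), var2 = rnorm(n()), var3 = rnorm(n()))

# point_interval route: one row per group and interval width
t_pi <- system.time(
  res_pi <- prediction %>%
    group_by(group1, group2, group3) %>%
    median_qi(var1, var2, var3, .width = c(.5, .8, .9))
)

# summarise route, reduced to one width to keep the sketch short
t_sum <- system.time(
  res_sum <- prediction %>%
    group_by(group1, group2, group3) %>%
    summarise(
      across(
        c(var1, var2, var3),
        list(
          median = median,
          lower  = ~ qi(.x, .width = .5)[1],
          upper  = ~ qi(.x, .width = .5)[2]
        ),
        .names = "{.col}.{.fn}"
      ),
      .groups = "drop"
    )
)

# Elapsed seconds for each route
c(point_interval = t_pi[["elapsed"]], summarise = t_sum[["elapsed"]])
```

Timings and memory use will depend on the machine and on the chosen sizes, so this only shows how the two routes can be compared, not what the result will be.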

I really appreciate the succinctness of the point_interval functions, but find myself having to resort to my own summarise functions. Perhaps you could rethink this family of functions to make them as efficient as summarise?

lukaseamus, Dec 07 '25 07:12