stat_slab/stat_eye: limits argument ignored
In practice, many distributions are far from normal and highly skewed. A couple of years ago I started a discussion on plotting such "difficult" distributions in ggplot2. Often the tail is not of interest, and if the entire distribution is plotted the mode is obscured, so we want to zoom in on the region of interest.
In ggplot2::geom_density there are two options: (1) setting coordinate limits and increasing n above the default of 512, or (2) setting scale limits with oob = scales::oob_keep. The second solution is far more elegant because it is computationally less intensive while still using all samples underlying the distribution. Option 1 works for all density functions I know except geom_violin, which has no n argument, but it is computationally intensive for long tails and in some cases not workable. Option 2 does not work for any ggdist function.
In ggridges::geom_density_ridges option 2 is possible by setting the from and to arguments to the same limits as the scale limits. I assumed the limits argument in ggdist::stat_slab and ggdist::stat_eye would do the same, but it is ignored. This was already noted by @teunbrand in the linked discussion.
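To make the effect of from and to concrete, here is a minimal sketch using stats::density() directly, on the assumption that it is a fair stand-in for what the plotting layers evaluate internally. The sample uses the same gamma parameters as the first group in this issue's example; the variable names are only illustrative.

```r
# Illustrative sketch with base R only (no plotting packages needed).
set.seed(1)
x <- rgamma(1e5, shape = 1^2 / 3^2, rate = 1 / 3^2)

# Default: the 512 evaluation points are spread over the full data range,
# so only a handful of them land in the region of interest [0, 1].
d_full <- density(x, n = 512)

# Restricted: all 512 evaluation points lie in [0, 1], while every sample
# still contributes to the estimate (the bandwidth is unchanged).
d_zoom <- density(x, n = 512, from = 0, to = 1)

n_full <- sum(d_full$x >= 0 & d_full$x <= 1)  # grid points inside [0, 1]
n_zoom <- length(d_zoom$x)                    # all of them, by construction
```

Comparing n_full with n_zoom shows why option 1 needs a much larger n to reach the same resolution near the mode.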
Here is an MRE:
library(tibble)
library(dplyr)    # provides %>%
library(ggplot2)
library(ggridges)
library(ggdist)

tibble(
  x = c(rgamma(1e5, shape = 1^2 / 3^2, rate = 1 / 3^2),
        rgamma(1e5, shape = 2^2 / 3^2, rate = 2 / 3^2)),
  group = rep(c("a", "b"), each = 1e5)
) %>%
  ggplot() +
  # Keep one density layer at a time; comment out the others as needed
  geom_density(aes(x = x, colour = group)) +
  geom_violin(aes(x = x, y = group)) +
  geom_density_ridges(aes(x = x, y = group),
                      from = 0, to = 1) +
  stat_slab(aes(x = x, y = group), height = 2,
            limits = c(0, 1)) +
  stat_eye(aes(x = x, y = group),
           limits = c(0, 1), point_interval = NULL) +
  # Scale limits zoom in; oob_keep retains out-of-range samples
  scale_x_continuous(limits = c(0, 1),
                     oob = scales::oob_keep) +
  theme_minimal()
I love ggdist and would like to fully transition away from ggridges and native ggplot2 density functions, but this issue is stopping me from doing so.
I have also opened an SO discussion. Various other users have noticed this issue, so it appears to be a popular feature request.
Yeah, good point. This will require adjusting density_bounded() and density_unbounded() to use limits to set the evaluation points.
Note to self: density_bounded()'s reflection method will have to be refactored a bit to make this work. It will need to be a bit more clever than it currently is when bounds is not fully contained within limits. Possibly the most sensible implementation will involve running the estimator separately for the (up to 2) additional regions outside the bounds, since I don't think stats::density() accepts a non-contiguous set of evaluation points. Transforming to a contiguous set and then cutting later just brings us back to the same problem of needing a large number of evaluation points.
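For readers unfamiliar with the reflection method, here is a rough sketch of the simple case only: a single lower bound, with the evaluation range lying entirely inside the bounds. The helper reflect_density_lower() is hypothetical and is not ggdist's density_bounded(); it only illustrates how from and to could set the evaluation points while all samples are kept.

```r
# Hypothetical helper for illustration; not part of ggdist.
# Reflection estimator for data bounded below at `lower`, evaluated on
# n points in [from, to]. Assumes lower <= from < to.
reflect_density_lower <- function(x, lower = 0, from = lower,
                                  to = max(x), n = 512) {
  stopifnot(from >= lower, to > from)
  # Choose the bandwidth from the original sample, not the reflected one
  bw <- stats::bw.nrd0(x)
  # Mirror every sample across the bound
  x_reflected <- c(x, 2 * lower - x)
  d <- stats::density(x_reflected, bw = bw, from = from, to = to, n = n)
  # The reflected sample has twice as many points, so each kernel carries
  # half the weight; doubling restores unit mass on [lower, Inf)
  list(x = d$x, y = 2 * d$y, bw = bw)
}

set.seed(1)
x <- rexp(1e4)
d <- reflect_density_lower(x, lower = 0, from = 0,
                           to = max(x) + 3 * stats::bw.nrd0(x), n = 2048)

# Trapezoid rule: the estimate should integrate to roughly 1 over its range
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
```

Calling the same helper with from = 0 and to = 1 would place all evaluation points in the zoomed region. The harder case described in the note, where the evaluation range extends beyond the bounds, is not handled by this sketch.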