distributional::dist_inflated doesn't work with stat_slabinterval
Two issues:
- distributional::dist_inflated doesn't work with stat_slabinterval
- The halfeye/slab plot does not show a mixture of continuous and discrete distributions correctly.
To reproduce:
library(distributional)
library(ggdist)
library(ggplot2)
library(tibble)

# two zero-inflated distributions; they should be equivalent
d1 = dist_mixture(dist_normal(0, 0), dist_normal(0, 1), weights = c(0.5, 0.5))
d2 = dist_inflated(dist_normal(0, 1), 0.5, x = 0)
df = tibble(
  name = c("d1", "d2"),
  dist = c(d1, d2)
)
# this works, but the inflated 0 is not properly shown
ggplot(df[1, ], aes(y = name)) +
  stat_interval(aes(dist = dist)) +
  stat_halfeye(aes(dist = dist))
# this errors
ggplot(df[2, ], aes(y = name)) +
  stat_interval(aes(dist = dist)) +
  stat_halfeye(aes(dist = dist))
#>
#>: Computation failed in `stat_slabinterval()`:
#> the condition has length > 1
(Setting aside the dist_inflated error for a moment; it is simply a bug and needs to be fixed.)
I think the fundamental issue with density plots in either case here is that the density is infinite at 0, because the point mass there has no finite density. Consequently it is hard to plot as a single slab function. Some other solutions might be:
- Plotting it as a histogram. Currently stat_histinterval won't work here because it defaults to densities for analytical distributions unless they are determined to be discrete. It could be adjusted so that it can generate histograms as needed, or at least for discrete/continuous mixtures. Picking the binwidth will be a bit of an issue, possibly best left to users, as the spike can be made arbitrarily tall with an arbitrarily small binwidth...
- Plotting the continuous and discrete portions of the distribution separately. This is probably a more accurate depiction of the distribution, but will require some finagling. Not sure of the best way; maybe a helper function to get the discrete and continuous portions of a mixture and allow those to be plotted on their own.
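To illustrate the binwidth concern in the histogram option, here is a minimal base R sketch for the example distribution (weight 0.5 on a point mass at 0, weight 0.5 on a standard normal). The function `spike_height()` is a hypothetical helper written for this illustration, not part of ggdist or distributional. It computes the density-scaled height of a histogram bin centered on 0.

```r
# Hypothetical helper (not from ggdist or distributional): the density-scaled
# height of a histogram bin centered on 0, for a mixture that puts weight `w`
# on a point mass at 0 and weight `1 - w` on a standard normal.
spike_height <- function(binwidth, w = 0.5) {
  # probability the continuous part contributes to the bin around 0
  continuous_mass <- (1 - w) * (pnorm(binwidth / 2) - pnorm(-binwidth / 2))
  # total bin probability divided by bin width gives the bar height
  (w + continuous_mass) / binwidth
}

binwidths <- c(0.5, 0.1, 0.01)
heights <- spike_height(binwidths)
print(data.frame(binwidth = binwidths, height = heights))

# For comparison: the peak of the continuous part alone, which stays fixed
continuous_peak <- 0.5 * dnorm(0)
print(continuous_peak)
```

The bar height grows roughly like `w / binwidth` while the continuous part peaks at about 0.2, so a narrow bin makes the spike dominate the plot and flattens everything else.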
Probably either solution is useful in some cases, so a good "meta" solution might be to implement both of the above and, when faced with discrete/continuous mixtures, raise a warning or error that points folks in the direction of either solution.
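For the second option (plotting the portions separately), a rough sketch of what such a helper might return is below. `split_zero_inflated()` is a hypothetical name and not an existing function in either package. It also assumes the continuous part is a normal distribution and takes the weight directly, rather than extracting the components from a distribution object.

```r
# Hypothetical helper: split a "point mass + normal" mixture into a data frame
# for the continuous part (a weighted density on a grid) and a data frame for
# the discrete part (a location and its probability).
split_zero_inflated <- function(w, mean = 0, sd = 1, at = 0, n = 201) {
  grid <- seq(mean - 4 * sd, mean + 4 * sd, length.out = n)
  list(
    # density of the continuous component, scaled by its mixture weight
    continuous = data.frame(x = grid, density = (1 - w) * dnorm(grid, mean, sd)),
    # the point mass carries probability w at a single location
    discrete = data.frame(x = at, prob = w)
  )
}

parts <- split_zero_inflated(w = 0.5)
head(parts$continuous)
parts$discrete
```

The continuous piece could then be drawn as a slab or area and the discrete piece as a point or spike. One is a density and the other a probability, so they do not share a natural vertical scale; that is part of the finagling.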
Yeah. I was thinking that for a mixture of a point mass with a continuous distribution, maybe we need to sample from it first.
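A minimal sketch of that sampling workaround, as something a user could do today rather than a proposed fix inside ggdist: draw from the mixture with `generate()` from distributional and hand the draws to `stat_histinterval()` as ordinary sample data. The number of draws and the seed are arbitrary choices for illustration.

```r
library(distributional)
library(ggdist)
library(ggplot2)

set.seed(1234)  # arbitrary seed, only for reproducible draws

# Same mixture as d1 in the original report: point mass at 0 plus N(0, 1).
d1 <- dist_mixture(dist_normal(0, 0), dist_normal(0, 1), weights = c(0.5, 0.5))

# generate() returns a list with one vector of draws per distribution.
draws <- generate(d1, 5000)[[1]]

# Plot the draws rather than the analytical distribution, so the stat
# treats them as sample data and bins them into a histogram.
p <- ggplot(data.frame(name = "d1", x = draws), aes(x = x, y = name)) +
  stat_histinterval()
print(p)
```

About half of the draws are exactly 0, so the spike shows up as one tall bin. How tall it looks still depends on the bin width the stat chooses.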