feature request: align dot bins

Open steveharoz opened this issue 4 years ago • 6 comments

When there are multiple dot histograms shown, misalignment can really stand out. It'd help to have a bit more control over the bin placement.

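library(dplyr)    # %>%, mutate(), n()
library(ggplot2)

# Two groups (A, B), each drawn on both sides, 30 random values per combination
expand.grid(
  side = c("top", "bottom"),
  y = LETTERS[1:2],
  reps = 1:30,
  stringsAsFactors = FALSE
) %>%
  mutate(x = rnorm(n())) %>%
  ggplot() +
  aes(x = x, y = y, side = side, fill = y) +
  ggdist::stat_dots(binwidth = 0.2)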

[image: resulting plot]

geom_histogram() has a boundary parameter that sets the boundary of one bin, and the rest follow.

It'd be helpful to be able to use the same parameter to have more control over the binning: stat_dots(binwidth=0.2, boundary=0)
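To make the idea concrete, here is a minimal base R sketch of boundary-anchored binning. `bin_centers()` is a hypothetical helper written for this illustration, not a function from ggplot2 or ggdist, and it uses left-closed bins, so values that fall exactly on an edge may be handled differently than in `geom_histogram()`.

```r
# Hypothetical helper (not part of ggplot2 or ggdist): assign each value to a
# fixed-width bin on a grid anchored so that `boundary` is a bin edge.
bin_centers <- function(x, binwidth, boundary = 0) {
  # Index of the bin containing each value, counted from the boundary
  idx <- floor((x - boundary) / binwidth)
  # Center of that bin
  boundary + (idx + 0.5) * binwidth
}

set.seed(1)
a <- rnorm(30)
b <- rnorm(30)

# Both groups are snapped to the same grid of centers (..., -0.1, 0.1, 0.3, ...)
centers_a <- bin_centers(a, binwidth = 0.2, boundary = 0)
centers_b <- bin_centers(b, binwidth = 0.2, boundary = 0)
```

Because the grid depends only on `binwidth` and `boundary`, not on the data, dot columns from separate groups would share x positions.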

steveharoz, Sep 25 '21 11:09

Yeah, something like this would be useful. I'm not sure of the best way to do it: the Wilkinson algorithm does not produce regularly spaced bins, so just setting a boundary would not make all the bins line up. I'd either have to determine bin positions globally by applying the Wilkinson algorithm to the combined data (enh) or allow histogram binning (more likely). So maybe a layout = "histogram" combined with something like the center and boundary parameters. I'd need to see if those can be made suitably generic, which brings back the question of what to do with the binning params for stat_histinterval; I'd want to keep those in sync with this before I start adding parameters willy-nilly.
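A rough illustration of why a boundary alone would not align Wilkinson-style bins: the sketch below is a simplified greedy sweep written for this thread, not ggdist's actual implementation, and `wilkinson_centers()` is a hypothetical name. Each bin starts at the smallest value not yet binned, so the bin positions follow the data rather than a fixed grid.

```r
# Simplified Wilkinson-style sweep (an illustration, not ggdist's code):
# start a bin at the smallest unbinned value, take everything within
# `binwidth` of it, and place the dot column at the midpoint of that range.
wilkinson_centers <- function(x, binwidth) {
  x <- sort(x)
  centers <- numeric(0)
  i <- 1
  while (i <= length(x)) {
    in_bin <- x >= x[i] & x < x[i] + binwidth
    centers <- c(centers, mean(range(x[in_bin])))
    # Continue with the first value after this bin
    i <- max(which(in_bin)) + 1
  }
  centers
}

group_a <- c(0.00, 0.05, 0.30)
group_b <- c(0.10, 0.15, 0.40)

# Same binwidth, but the centers differ between groups because each
# group's bins start at its own smallest values.
wilkinson_centers(group_a, binwidth = 0.2)  # centers near 0.025 and 0.3
wilkinson_centers(group_b, binwidth = 0.2)  # centers near 0.125 and 0.4
```

Running the sweep once on the combined data, or switching to fixed-width histogram bins, are the two ways of getting a shared set of positions mentioned in the comment above.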

mjskay, Sep 25 '21 13:09

Thinking about this more: since all methods except layout = "swarm" use a binning algorithm under the hood, it probably makes the most sense to allow that to be set separately from layout. So something like bin = "wilkinson" or bin = "histogram", where center and boundary can be arguments of the histogram method (or maybe just make them something that gets passed down to the binning methods from the top level...).

mjskay, Jul 15 '23 00:07

While I see how layout and bin might be different at the implementation level, I suspect someone would not want to use bin = "histogram" with layout = "hex". If they're mutually exclusive, it might be easiest to have it in one argument.

Regularly spaced binning seems like an alternative option to these layouts: [image: layouts]

steveharoz, Jul 15 '23 04:07

Hmm, in the hex case it would make it a regular hex grid instead of an irregular one, which might be desirable in some cases? I dunno.

I guess my other feeling is that there are some other alternative ways to do the binning for the "bin" layout, and if I eventually implement some of those I'm not sure I want to keep adding new layouts. They really are another part of the algorithm that can be swapped out.

mjskay, Jul 15 '23 05:07

That makes sense. You probably have a better picture of all the possible layout*bin combinations than I do.

steveharoz, Jul 15 '23 05:07

Yeah, some of them are almost certainly not useful, and I get the user-facing argument for just making a different option on layout... but in the end I think the potential for other binning algorithms might convince me to make it a separate argument.

mjskay, Jul 15 '23 05:07