bayesplot
Expose functions that return data used for plotting
Suggested by @tjmahr. Currently the ggplot objects returned by bayesplot can be modified using many of the ggplot2 functions but some things are hard or impossible to change once the object has been created. Providing access to some of the functions that prepare the data would give users much more flexibility.
These functions will have names ending in _data. Several have already been added in version 1.4.0 and more will be added in future releases.
I've received a few questions about why we would expose the _data functions if the data is already stored in the ggplot object, so I'm adding some comments here to clarify a bit (and so I can point people here when they ask that question).
Here are two of the reasons for adding the _data functions:
- While it's true that the data is stored inside the ggplot object, only in simple cases is it convenient to retrieve the data from the ggplot object.
For example:
library("ggplot2")
g <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
In this case g$data will be identical to the mtcars data frame. However, now consider the following example:
library("dplyr")
d1 <- mtcars %>% filter(wt > 3)
d2 <- mtcars %>% filter(wt <= 3)
g <- ggplot(d1, aes(wt, mpg)) +
  geom_point(color = "purple") +
  geom_point(data = d2, color = "green")
Now g$data is identical only to d1, and to get d2 we'd need to do something like g$layers[[2]]$data. In this simple example we could have just used mtcars and set aes(color = wt > 3), but for more complicated plots it can be useful to give the user access to the data used for plotting in a convenient form rather than requiring a scavenger hunt inside the ggplot object.
- In addition to being exposed to the user, the _data functions will be used inside the bayesplot plotting functions. Currently everything that the _data functions will do exists inside the bayesplot plotting functions, so creating the _data functions is just a way of separating the data prep from the plotting, which will let us test these two parts separately. Even though there will technically be more functions to test, it doesn't actually add to the maintenance burden (if anything it probably reduces it).
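To make the separation of data prep from plotting concrete, here is a minimal sketch using the mtcars example from above. The names scatter_data and scatter_plot are hypothetical and chosen only for illustration; they are not bayesplot functions, and the real _data functions may be structured differently.

```r
library("ggplot2")

# Hypothetical data-prep helper (NOT part of bayesplot): returns the
# data frame that the plotting function below will draw.
scatter_data <- function(df, cutoff = 3) {
  data.frame(
    wt = df$wt,
    mpg = df$mpg,
    heavy = df$wt > cutoff  # flag used for coloring the points
  )
}

# Hypothetical plotting function: it only draws what the helper prepared.
scatter_plot <- function(df, cutoff = 3) {
  d <- scatter_data(df, cutoff)
  ggplot(d, aes(wt, mpg, color = heavy)) + geom_point()
}

d <- scatter_data(mtcars)  # users get the plotting data directly
g <- scatter_plot(mtcars)  # the plot is built from that same data

# The data prep can be tested without building or inspecting a plot.
stopifnot(
  nrow(d) == nrow(mtcars),
  sum(d$heavy) == sum(mtcars$wt > 3)
)
```

With this layout a user who wants a different plot can call the helper and build their own ggplot from its result, and the package tests can check the helper's output without touching any plotting code.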