kableExtra icon indicating copy to clipboard operation
kableExtra copied to clipboard

General spec_plotfun function

Open haozhu233 opened this issue 4 years ago • 4 comments

@r2evans made a great suggestion that we should create a general spec_plot function that allows users to feed in customized plotting functions. Here is the issue to track the progress and manage the discussions on this topic.

haozhu233 avatar Aug 31 '20 03:08 haozhu233

Sorry, I probably made the PR a bit more detailed than it needed to be. I'll copy the examples from there so that we can discuss it here.

Using the spec_plot as defined in #505 and this code:

mpg_list <- split(mtcars$mpg, mtcars$cyl)
inline_plot <- data.frame(cyl = c(4, 6, 8), mpg_box = "", mpg_hist = "",
                          spec_line = "", small_lines = "", small_lines_gg = "")
inline_plot %>% 
  kbl(booktabs = T) %>%
  kable_paper(full_width = F) %>%
  column_spec(2, image = spec_boxplot(mpg_list)) %>%
  column_spec(3, image = spec_hist(mpg_list)) %>%
  column_spec(4, image = spec_line(mpg_list, same_lim = FALSE)) %>%
  column_spec(5, image = spec_plot(small_lines(), mpg_list)) %>%
  column_spec(6, image = spec_plot(small_lines_gg(), mpg_list))

we can generate

image

image

and take it to a bit more complicated with:

set.seed(42)
x <- cumsum(rnorm(20))

somefunc <- function(z) {
  dat <- data.frame(
    x = seq_along(z),
    y = z,
    mu = zoo::rollapply(z, 9, mean, partial = TRUE),
    sigma = 12*zoo::rollapply(z, 9, sd, partial = TRUE)
  )
  graphics::par(mar = c(0, 0, 0.2, 0), lwd=0.5)
  plot(NA, type = "n", xaxt = "n", yaxt = "n", ann = FALSE, frame.plot = FALSE,
       xlim = c(1, length(z)), ylim = range(z) + c(-1,1)*max(dat$sigma))
  polygon(c(dat$x, rev(dat$x), dat$x[1]),
          c(dat$mu + dat$sigma, rev(dat$mu - dat$sigma), dat$mu[1] + dat$sigma[1]),
          col = "gray80", lwd = 0.1)
  lines(mu~x, data=dat, col = "red", lwd = 0.2)
}
  
data.frame(a="quux", b="") %>%
  kbl(booktabs = T) %>%
  kable_paper(full_width = F) %>%
  column_spec(2, image = spec_plot(somefunc, list(x)))

image

r2evans avatar Aug 31 '20 22:08 r2evans

Some concerns I have, looking for your comments:

  1. the technique that allows spec_plot to deal with both base graphics and ggplot2 graphics is that if there's a return object from the UDF that inherits "ggplot", then we discard the already-attempted grDevices::svg-saved image (or png), because ggplot2 performs inconsistently (at least with svg). To get it to work, I get the ggplot2::ggsave function with

    ggsave <- get("ggsave", envir=as.environment("package:ggplot2"))
    

    so that we don't necessarily trigger a CRAN-complaint about not Importing or Depends on ggplot2. The reason I don't go so far as to requireNamespace("ggplot2") is that if we are collecting a graphic of class "ggplot", I find it highly unlikely that we need to try hard to load the namespace and fail if not found. While I can contrive of an example where a system returns a class "ggplot"-object without have ggplot2-package available, it is ... obscure.

  2. I added ggplot2 to Suggests:, since this package will be able to take advantage of it if present. Your other option is to add ggplot2 to Imports: and we can access ggsave with the standard double-colon notation without CRAN complaining ... but I was not assuming that you wanted to add that otherwise-optional package as an installation prerequisite.

  3. I knowingly run the inefficiency of calling svg(); ...; dev.off() and then overwriting with ggsave, since we don't know until the plot function is complete whether we can infer it is ggplot-like. Since this is not likely to surprise the user, we could certainly require spec_plot(func, mpg_list, isggplot=TRUE) or similar, and remove the double-tap on ::svg(). I don't think it's a problem, frankly, other than a little extra processing.

  4. I think I am handling ggsave calls correctly. I am not proficient at all of the units= and such that it infers/requires, so I found a combination that works. Oddly, I found I must use device=grDevices::svg but cannot use device=grDevices::png, failing for width/height problems. There might be a better way to handle this in the if (inherits(thisplot, "ggplot")) block.

  5. I added the use of on.exit(dev.off(curdev)) (also in spec_line) because during testing, I often interrupted things and had a stale graphics device sitting around. Calling dev.off(curdev) multiple times (on the same dev) is harmless and repeats are a no-op. For consistency, it might be good to add this to spec_boxplot and spec_hist as well.

And some discussion points on a separate vignette on UDFs:

  • "Static" functions are pretty straight-forward, as in the somefunc above;

  • "Flexible" functions are really just functions that return functions ... as in my small_lines example. It allows one to have a straight-forward UDF that is extensible. small_lines() returns a function (with formals x,y), and it utilizes all of the defaults. One can override certain defaults, ala

    ... %>%
      column_spec(2, image = spec_plot(small_lines(              ), mpg_list)) %>%
      column_spec(3, image = spec_plot(small_lines(col = "green4"), mpg_list))
    
  • besides example functions in a vignette (I think a separate one might be in order), do you think we should add arguments to spec_plot that starts with a compliant plot(...) (margins set, etc) and allows the UDF to just add lines, polygons, etc? Similarly, ggplot-analog with theme already appended? I fear over-complexity of spec_plot and generally recommend against it, especially since I believe that your spec_{hist,boxplot,line} functions are really meant to be straight-forward/simple and safe.

  • another alternative I considered (and discarded temporarily for simplicity/time) was using expressions instead of functions, perhaps similar to rlang, perhaps just unevaluated (substituted) calls with placeholders for the actual values. I don't know that this adds any capability over the "simple" and "flexible" functions above, and it does add a little complexity ... I'd think the only would be that it would change from these

    ... %>%
      column_spec(2, image = spec_plot(somefunc, mpg_list)) %>%
      column_spec(3, image = spec_plot(function(x) somefunc(x, col="green4"), mpg_list))
    

    to these

    ... %>%
      column_spec(2, image = spec_plot(somefunc(.x), mpg_list)) %>%
      column_spec(3, image = spec_plot(somefunc(.x, col="green4"), mpg_list))
    

    (Perhaps a little more readable.) The risk is that if you have plots requiring multiple list arguments, it then because purrr-like, ala

    ... %>%
      column_spec(2, image = spec_plot(somefunc(.x, .y, .z), list1, list2, list3)) %>%
      column_spec(3, image = spec_plot(somefunc(..1, ..2, ..3, col="green4"), list1, list2, list3))
    

r2evans avatar Aug 31 '20 23:08 r2evans

I tried running the earlier somefunc code (but using pipe |>) and got Error in min(x, na.rm = na.rm) : invalid 'type' (list) of argument. Is this general "somefunc" approach implemented?

harrelfe avatar Mar 31 '23 22:03 harrelfe

@harrelfe I don't think so, unfortunately. I think all (most, at least) of my use-cases are currently covered by the other spec_* functions, so I slowed development on it and moved on. What's your use-case where you'd prefer to use a UDF for plotting?

r2evans avatar Apr 04 '23 20:04 r2evans