rmarkdown df_print and cached chunks

Open · netique opened this issue 3 years ago · 1 comment

Hi,

this gave me a few head-scratching moments. When an html_document has already been knitted with its results cached (I mean knitr::opts_chunk$set(cache = TRUE)), and I then decide to show paged tables using

output:
  html_document:
    df_print: paged

in the YAML header, the result keeps rendering as verbatim text output (forgive me the {shiny} lingo).

Now I regard this as obvious, but this is in fact already the second time I have had to solve this "issue". I believe it could be hard for {knitr} and {rmarkdown} to apply df_print to cached output, since the print methods and classes involved are baked into the cached output itself, but maybe it is worth documenting this behavior or raising a friendly warning. What do you think?
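
A minimal sketch of a document that should show the behavior described above, assuming a fresh cache. The title and the use of mtcars are illustrative and not taken from the original report. Knit it once as is, then add df_print: paged under html_document and knit again: the cached chunk is not re-run, so the table should still appear as verbatim text.

---
title: "cache-repro"
output: html_document
---

```{r setup, include=FALSE}
# Global caching, as described in the report (generally not recommended)
knitr::opts_chunk$set(cache = TRUE)
```

```{r}
# Printed with whatever print method was active when the cache was written
head(mtcars)
```

Deleting the cache directory created next to the document, or turning caching off for this chunk, should make the new df_print setting take effect.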

Session info

R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000), RStudio 2021.9.1.372

Locale:
  LC_COLLATE=Czech_Czechia.1250  LC_CTYPE=Czech_Czechia.1250    LC_MONETARY=Czech_Czechia.1250
  LC_NUMERIC=C                   LC_TIME=Czech_Czechia.1250    

Package version:
  base64enc_0.1.3 digest_0.6.29   evaluate_0.14   fastmap_1.1.0   glue_1.6.0     
  graphics_4.1.2  grDevices_4.1.2 highr_0.9       htmltools_0.5.2 jquerylib_0.1.4
  jsonlite_1.7.2  knitr_1.37      magrittr_2.0.1  methods_4.1.2   rlang_0.4.12   
  rmarkdown_2.11  stats_4.1.2     stringi_1.7.6   stringr_1.4.0   tinytex_0.36   
  tools_4.1.2     utils_4.1.2     xfun_0.29       yaml_2.2.1     

Pandoc version: 2.14.0.3

Checklist

When filing a bug report, please check the boxes below to confirm that you have provided us with the information we need. Have you:

[x] formatted your issue so it is easier for us to read?
[x] included a minimal, self-contained, and reproducible example?
[x] pasted the output from xfun::session_info('rmarkdown') in your issue?
[x] upgraded all your packages to their latest versions (including your versions of R, the RStudio IDE, and relevant R packages)?
[x] installed and tested your bug with the development version of the rmarkdown package using remotes::install_github("rstudio/rmarkdown")?

Feb 15 '22 20:02 netique

Thanks for the suggestion.

We have some documentation and general advice in the R Markdown Cookbook: https://bookdown.org/yihui/rmarkdown-cookbook/cache.html

Among them:

The most appropriate use case of caching is to save and reload R objects that take too long to compute in a code chunk, and the code does not have any side effects, such as changing global R options via options() (such changes will not be cached). If a code chunk has side effects, we recommend that you do not cache it.

We do not recommend that you set the chunk option cache = TRUE globally in a document. Caching can be fairly tricky. Instead, we recommend that you enable caching only on individual code chunks that are surely time-consuming and do not have side effects.

Following this documentation, an Rmd that processes data and prints a table should look like this:

---
title: "test"
output:
  html_document:
    df_print: paged
---

```{r, message=FALSE, warning=FALSE}
library(dplyr)
```

Let's get the droids' names and their homeworlds

```{r data, cache = TRUE}
# Expensive data processing lives in the cached chunk; nothing is printed here
droids <- starwars %>%
  filter(species == "Droid") %>%
  select(name, homeworld) %>%
  distinct()
```

```{r}
droids
```

This means the table rendering / printing should not happen in a cached chunk. That way the printing method (which is, in a sense, a side effect) is applied correctly.

We could document this specifically for df_print, but the same holds for any external configuration (here, changing df_print in the YAML) that is supposed to affect the output of a cached chunk. Caching means the chunk is not re-run and its result is loaded from the cache; changing an external configuration won't invalidate the cache unless it is explicitly passed in the cache.extra option.
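
As a rough sketch of the cache.extra route (an illustration, not an officially documented recipe for df_print): if a printing chunk really has to be cached, the output settings from the YAML header can be fed into the cache key. This assumes rmarkdown::metadata holds the parsed YAML header while the document is rendered with rmarkdown::render(); the chunk label droids-table is hypothetical.

```{r droids-table, cache = TRUE, cache.extra = rmarkdown::metadata$output}
# cache.extra is hashed together with the other chunk options, so editing
# anything under `output:` (e.g. df_print) changes the hash and the chunk
# is re-run instead of being loaded from the cache.
droids
```

Keeping the printing chunk uncached, as in the example above, remains the simpler option; cache.extra is mainly useful when printing cannot be separated from the expensive computation.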

Anyway, I just wanted to clarify. I'll mark this as a doc improvement. Thanks for the suggestion!

Feb 16 '22 10:02 cderv
