Errors re-rendering quarto html with cached chunks and cmdstanr.

Open danmyles opened this issue 1 year ago • 3 comments

Describe the bug

Quarto fails to re-render to html following trivial changes to code chunks that use cmdstanr objects.

processing file: 01_data-exclusions.qmd
  |............................................... |  98% [model_1_postsum]    
Quitting from lines  at lines 616-637 [model_1_postsum] (01_data-exclusions.qmd)
Error in `read_cmdstan_csv()`:
! Assertion on 'files' failed: File does not exist: '/var/folders/ms/m90tgryx74362fqn7d65tl7w0000gn/T/Rtmp5yoGqf/model_8cda69ac3b06c9b214d4a12ede85f4f8-202407191622-1-7cf94c.csv'.
Backtrace:
 1. fit01$draws(variables = c("a_bar", "sigma", "p"), format = "df")
 2. private$read_csv_(...)
 3. cmdstanr::read_cmdstan_csv(...)
 4. cmdstanr:::assert_file_exists(files, access = "r", extension = "csv")
 5. checkmate::makeAssertion(files, res, .var.name, add)
 6. checkmate:::mstop(...)
                                                                                                                
Execution halted

To Reproduce

Turn on Quarto cache in the header:

...
execute:
   cache: true
...

Read in a stan model from file using the chunk options:

```{cmdstan file = "path/to/stan/model.stan", output.var = 'm01'}
```

Sample:

```{r}
fit01 <- m01$sample( [...] )
```

Write some code to summarise the model output or whatever.

Render to html.

Make a small change to anything that calls the model fit object fit01. In my case it will happen if I just add an empty line after an unedited chunk.

Try to render to html again.

Error occurs.

Expected behavior

The document renders after small changes without the need to clear the cache, refit all models, etc.

Operating system

macOS 14.0

CmdStanR version number

0.7.1

Additional context

None

danmyles avatar Jul 19 '24 06:07 danmyles

This is because cmdstanr writes the sampling outputs to the R session's temporary directory by default. When you re-render, the temporary directory may have changed, so the sampling results are no longer there.
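To see why the path in the error message goes stale, here is a minimal base R sketch. It assumes each render runs in a fresh R process, and the helper name `session_tmp` is made up for illustration. It starts two separate R processes and asks each one for its temporary directory:

```r
# Path to the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Hypothetical helper: start a fresh R process and return its tempdir().
session_tmp <- function() {
  system2(rscript, c("--vanilla", "-e", shQuote("cat(tempdir())")), stdout = TRUE)
}

first <- session_tmp()
second <- session_tmp()

# Each process gets its own "Rtmp..." directory, so the two paths differ.
print(c(first, second))
print(first != second)
```

R also normally deletes the session temporary directory when the session ends, so CSV files written there are not available to a later render.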

If you want to reuse the fit object across multiple runs then I'd suggest using the output_dir argument of $sample() so that the sampling results are always in an expected location:

fit01 <- m01$sample( [...], output_dir = "path/to/stanmodels")

Also, you should update your cmdstanr package. The current version is 0.8.1.

andrjohns avatar Jul 19 '24 07:07 andrjohns

@jgabry any chance you're familiar with quarto and caching? I wonder if there's a specific change we need to make to enable CmdStanFit objects to be cached with the draws (maybe some R6 trickery?)

andrjohns avatar Jul 19 '24 07:07 andrjohns

> @jgabry any chance you're familiar with quarto and caching? I wonder if there's a specific change we need to make to enable CmdStanFit objects to be cached with the draws (maybe some R6 trickery?)

So the issue is that the draws aren't in memory when the object is cached, right?

I think quarto relies on knitr to do the caching, and if I'm looking in the right place in knitr's source code it looks like knitr just uses base R's save function:

https://github.com/yihui/knitr/blob/fb1f4231d2df7b156e314686f39fc040ce807513/R/cache.R#L30

That's unfortunate because I was hoping we could just define a custom method for whatever function knitr was using to do the caching. But save isn't a generic function with methods so I don't think that's possible. Internally save calls functions like gzfile, xzfile, etc., and I don't think we can define custom methods for those either.
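A quick base R check is consistent with this. The sketch below uses `utils::isS3stdGeneric()`, which reports whether a function's body starts with a `UseMethod()` call. The wrapper `is_generic` is a made-up name for illustration:

```r
# Hypothetical wrapper: TRUE only for standard S3 generics, i.e. functions
# whose first expression is a call to UseMethod().
is_generic <- function(f) isTRUE(unname(utils::isS3stdGeneric(f)))

# print() dispatches on the class of its argument, so methods can be added.
print_is_generic <- is_generic(print)

# save() does its work directly, so there is no method to override.
save_is_generic <- is_generic(save)

print(c(print = print_is_generic, save = save_is_generic))
```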

Maybe there's some other approach that I'm overlooking!
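One possible workaround, sketched here and not confirmed in this thread, is to skip the chunk cache for the fit and persist it with cmdstanr's `$save_object()` method, which reads the results into memory before calling `saveRDS()`. The helper name `load_or_fit` and the `.rds` path are assumptions made up for illustration:

```r
# Hypothetical helper (not part of cmdstanr): reuse a saved fit if one exists,
# otherwise sample and save the fit with its results read into memory.
load_or_fit <- function(model, rds_path, ...) {
  if (file.exists(rds_path)) {
    return(readRDS(rds_path))
  }
  fit <- model$sample(...)
  # $save_object() loads the CSV contents into the fit before saving, so the
  # saved object no longer depends on the temporary CSV files.
  fit$save_object(file = rds_path)
  fit
}

# Usage in the sampling chunk (the path is a placeholder):
# fit01 <- load_or_fit(m01, "path/to/fit01.rds", [...])
```

Note that the saved file has to be deleted by hand whenever the model or data change, since nothing here detects a stale fit.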

jgabry avatar Jul 19 '24 18:07 jgabry