knitr icon indicating copy to clipboard operation
knitr copied to clipboard

catch error=TRUE chunks produced by hook_purl()

Open bastistician opened this issue 10 months ago • 14 comments

When a chunk is evaluated under error=TRUE, knitr catches errors as the code is declared to fail. The hook_purl()-produced R script, however, does not currently account for that.

Here is a minimal example (inspired by tests/testit/knit-handlers.Rmd):

setwd(tempdir())
writeLines(c('```{r test, error=TRUE}', '1 + ""', '```'), "knit-error.Rmd")
library(knitr)
knit_hooks$set(purl = hook_purl)
knit("knit-error.Rmd")  # does *not* fail

Using current knitr 1.46, this produces the following R script (knit-error.R):

## ----test, error=TRUE-----------------------------------------------------------------------------
1 + ""

Running that script fails, of course:

xfun::Rscript("knit-error.R")
#> Error in 1 + "" : non-numeric argument to binary operator
#> Execution halted

I think knitr could take advantage of the new R option "catch.script.errors" introduced in R 4.4.0. For example, for the above test chunk (with a local error=TRUE setting), I could imagine hook_purl() producing

## ----test, error = TRUE----------------------------------------------------------------------------
options(catch.script.errors = TRUE)

1 + ""

options(catch.script.errors = FALSE)

This would enable testing R scripts from vignettes that are shipped with packages (in inst/doc) even if they use error=TRUE chunks.

On a related note, eval=FALSE code chunks don't have this problem as hook_purl() already takes care of commenting such code: https://github.com/yihui/knitr/blob/44a7bee566afff3eb4c8b13fd6fa64487e12981f/R/hooks-extra.R#L150

bastistician avatar Apr 17 '24 11:04 bastistician

This bug makes all packages that have a vignette with an {r error=TRUE} chunk fail when running R CMD check with for example --no-build-vignettes (which invokes the purl functionality).

The problem is covered up by the fact that checking packages with _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_=TRUE or --as-cran avoids purling, and therefore is not observed on CRAN or most CI configurations.

jeroen avatar Apr 21 '24 13:04 jeroen

Sorry for the trouble. This change was intentional (see changelog), and CRAN made the request for change, after they changed _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ to true in R.

It can be difficult to deal with these purl() problems case by case (eval=FALSE, error=TRUE, ...), since there can be many cases. For this particular case of error=TRUE, I think it's relatively simple for me to deal with, so I'll probably just do it (if we don't want to rely on the option catch.script.error in R 4.4.0, perhaps we can wrap the code in try()?).

BTW, you can also set the chunk option purl=FALSE to exclude specific code chunks from the purl() result, if the only goal is to avoid R CMD check errors. I understand that you may want to deliberately show the problematic code in the output script, though.

yihui avatar Apr 22 '24 13:04 yihui

Correction: _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ has been changed to true in R-devel (and will appear in R 4.4.0). Previously I thought the change was only for CRAN and I was wrong.

For R < 4.4.0, this env var has to be set (or use --as-cran for R CMD check).

yihui avatar Apr 23 '24 21:04 yihui

To clarify, I see this issue more as a wishlist item than a bug in knitr.

A simpler solution for error=TRUE chunks might be to comment the code in the generated R script just like for eval=FALSE. Still I'm not 100% convinced of my proposal. Not least because vanilla knitr actually defaults to error=TRUE, but it wouldn't make sense to comment everything in the R script "just in case". So this needs to differentiate between a locally set error=TRUE and the global error=TRUE, which makes me feel uncomfortable (I don't know if there is a precedent for such behaviour in knitr).

BTW, you can also set the chunk option purl=FALSE to exclude specific code chunks from the purl() result, if the only goal is to avoid R CMD check errors.

Yes, knitr has long offered several options for the vignette author to deal with these problems on a case-by-case basis. I've used purl=FALSE several times. Personally, I like that installed packages (in their "doc" folders) ship an R script for each vignette, and of course I'd like a comprehensive check to ensure this R script runs without errors.

bastistician avatar Apr 23 '24 21:04 bastistician

So this needs to differentiate between a locally set error=TRUE and the global error=TRUE

Yes, that's the problem. The differentiation is only possible for tangle_block(): https://github.com/yihui/knitr/blob/f1788b81ffbe187be54a21fce32395fe0c77f2e5/R/block.R#L586-L587

but not possible for hook_purl() (which is used by the vignette engine to purl code): https://github.com/yihui/knitr/blob/f1788b81ffbe187be54a21fce32395fe0c77f2e5/R/hooks-extra.R#L135

so I'm afraid that purl=FALSE is the only possible way to exclude error=TRUE chunks...

yihui avatar May 24 '24 02:05 yihui

hi, data.table is also having this issue https://github.com/Rdatatable/data.table/pull/6220 we want to display error messages in rendered vignettes, but we are getting errors from CRAN checks. is the solution to use error=TRUE, purl=TRUE ? Thanks!

tdhock avatar Jul 07 '24 12:07 tdhock

i meant error=TRUE, purl=FALSE ?

tdhock avatar Jul 07 '24 12:07 tdhock

I am getting new ERRORs on CRAN now too from a chunk with error=TRUE, but only for r-oldrel-macos. Do you know if CRAN is still adapting things? Or should I do something about it? That chunk has been there for many years without issue.

Enchufa2 avatar Jul 07 '24 15:07 Enchufa2

@tdhock Yes, for R < 4.4.0, you need to use purl=FALSE to exclude code chunks that are known to throw errors, or set the env var _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_=true, or use R CMD check --as-cran.

@Enchufa2 In case I didn't make it clear enough above, R sets _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_=true since v4.4.0. For older versions of R, it's still false, which means R scripts tangled from vignettes are still checked, and error chunks will throw errors.

yihui avatar Jul 08 '24 15:07 yihui

@yihui There's something I don't quite understand or I see contradicting messages here. Please, correct me if I'm wrong, but:

  • According to this commit, _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ has always been true for CRAN checks.
  • Now it's always true for R >= 4.4.0, but false for prior versions unless explicitly defined or --as-cran is set.

Therefore, with the most recent version of knitr,

  • We should see no failures with R >= 4.4.0.
  • We should see no failures with R < 4.4.0 if --as-cran is set.
  • We should see no failures with R < 4.4.0 on CRAN, because CRAN is supposed to check --as-cran.

Yet I see failures on CRAN for r-oldrel-macos, which I don't understand.

Enchufa2 avatar Jul 08 '24 16:07 Enchufa2

@Enchufa2 Your understanding is mostly correct, but I don't think CRAN uses --as-cran. If this flag is used, you will see it in the check log like this:

* ...
* using session charset: UTF-8
* using option ‘--as-cran’
* ...

but it's absent in CRAN's check log, e.g., https://www.r-project.org/nosvn/R.check/r-oldrel-macos-arm64/simmer-00check.html

yihui avatar Jul 08 '24 18:07 yihui

CRAN has recently changed the value of _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ in their checks (either set or removed, I don't remember) but this may have changed the behavior for certain versions.

jeroen avatar Jul 08 '24 19:07 jeroen

It happens also for other package: https://cran.r-project.org/web/checks/check_results_clock.html Only the MacOs runner, which are using R 4.3.3 and not R 4.4. So if they removed the _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ env var, and it is using R default, it will be false for this R 4.3.3 version.

cderv avatar Jul 23 '24 14:07 cderv

I have contacted Simon to set _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_=true for R-oldrel on CRAN's macOS machines (which I just met in person last week in New Zealand :). Hopefully these check errors should start to disappear on CRAN in the next few days.

For package authors who do not use the --as-cran flag for R CMD check on Github Action, you still need to set this environment variable for R < 4.4.0 (or use the chunk option purl = FALSE for error = TRUE chunks).

yihui avatar Sep 18 '24 04:09 yihui