Allow memoising output, warnings, etc.?
The future package saves output and warnings from code executed within futures and "replays" them when the future's value is requested. Note the order in which the messages and warnings appear in these examples:
library(future)
x <- future(message("Hello from the future!"))
message("Greetings from the present!")
#> Greetings from the present!
value(x)
#> Hello from the future!
invent_skynet <- function() {
  message("Inventing Skynet...")
  f <- future(warning("Killer robots from the future!"))
  message("Finished inventing Skynet.")
  return(f)
}
value(invent_skynet())
#> Inventing Skynet...
#> Finished inventing Skynet.
#> Warning in eval(quote(warning("Killer robots from the future!")), new.env()):
#> Killer robots from the future!
Created on 2021-04-26 by the reprex package (v2.0.0)
In theory, it should be equally possible to capture the standard output and standard error of each call to a memoised function and replay them each time the function is called with the same arguments. This could be useful for cases where a memoised function issues a warning, but that warning is not noticed on the first run. It would be nice if subsequent memoised calls to the same function could also produce the warning, to increase the chances that the user notices it.
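To make the idea concrete, here is a minimal base R sketch of capturing messages and warnings from one call and replaying them later. The helper names `capture_conditions()` and `replay_conditions()` are hypothetical and not part of memoise or future. The sketch only covers conditions; text printed directly with cat() or print() would need separate handling, for example with capture.output().

```r
# Hypothetical helper: evaluate `expr`, recording messages and warnings
# instead of letting them reach the console.
capture_conditions <- function(expr) {
  conditions <- list()
  value <- withCallingHandlers(
    expr,
    message = function(cnd) {
      conditions[[length(conditions) + 1L]] <<- cnd
      invokeRestart("muffleMessage")
    },
    warning = function(cnd) {
      conditions[[length(conditions) + 1L]] <<- cnd
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, conditions = conditions)
}

# Hypothetical helper: re-signal recorded conditions in their original order.
replay_conditions <- function(conditions) {
  for (cnd in conditions) {
    if (inherits(cnd, "warning")) warning(cnd) else message(cnd)
  }
  invisible(NULL)
}

# The warning is recorded silently on the first evaluation...
res <- capture_conditions(mean("abc"))
# ...and can be re-issued whenever the stored value is returned again.
replay_conditions(res$conditions)
```

A memoising wrapper could store `res$conditions` next to `res$value` in its cache and call `replay_conditions()` on every cache hit.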
Now that I think about it, I suppose the reprex package also encapsulates the same functionality, which could possibly be reused for this purpose.
I'm not convinced that's a good idea. Although I can see that it could be useful in some cases, I can also think of cases where re-throwing a warning would be problematic and misleading.
In cases where (A) the presence of a warning and (B) the content of the warning both depend entirely on the input values, it makes sense. For example, it would make sense for a memoized version of mean() to re-throw this warning:
mean("abc")
#> [1] NA
#> Warning message:
#> In mean.default("abc") : argument is not numeric or logical: returning NA
However, if either (A) the presence of a warning or (B) the content of the warning does not depend entirely on the input values, then re-throwing the warning would be misleading:
download_text <- function(url) {
  outfile <- tempfile()
  download.file(url, outfile)
  on.exit(file.remove(file.path(outfile, "foo")))
  invisible(readLines(outfile))
}
x <- download_text("https://www.r-project.org/")
#> trying URL 'https://www.r-project.org/'
#> Content type 'text/html' length 6328 bytes
#> ==================================================
#> downloaded 6328 bytes
#>
#> Warning message:
#> In file.remove(file.path(outfile, "foo")) :
#> cannot remove file '/var/folders/vd/0_g4hj6d7kq_fw5gd_r0ml5w0000gn/T//Rtmp0xsxOb/file7ae45783664a/foo', reason 'Not a directory'
In this case, the function throws a warning that points to a specific path on disk. (The warning is due to a bug in the original function, but that's not really important to illustrate this point.) If the warning were re-thrown on a later, memoised call, it would report a temporary path from an earlier call and so would be providing incorrect information.
Other things could also affect (A) or (B), such as the state of a global variable, or a remote data source that's being accessed. Unfortunately, there's no way to know whether a warning is "pure" (as in the mean() example) or not.
I agree it probably doesn't make sense as a default, but perhaps as an option? "Pure" warnings like the mean example are exactly the kind I'm interested in (specifically warnings about poor fits in model fitting functions).
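As a rough sketch of what such an opt-in could look like, here is a toy memoising wrapper with a `replay_warnings` argument that defaults to FALSE. Everything here is hypothetical: `memoise_with_warnings()`, its argument, the naive deparse-based cache key, and the `fit()` stand-in are assumptions for illustration, not memoise's API.

```r
# Hypothetical wrapper, not memoise's API: cache results keyed on the
# arguments and, only if requested, re-issue stored warnings on cache hits.
memoise_with_warnings <- function(f, replay_warnings = FALSE) {
  cache <- new.env(parent = emptyenv())
  function(...) {
    # Naive key for illustration; a real implementation would hash the arguments.
    key <- paste(deparse(list(...)), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      warnings <- list()
      value <- withCallingHandlers(
        f(...),
        warning = function(w) {
          # Record the warning but let it propagate as usual on the first call.
          warnings[[length(warnings) + 1L]] <<- w
        }
      )
      assign(key, list(value = value, warnings = warnings), envir = cache)
    } else if (replay_warnings) {
      for (w in get(key, envir = cache)$warnings) warning(w)
    }
    get(key, envir = cache)$value
  }
}

# Toy stand-in for a model-fitting function with a "pure" warning.
fit <- function(x) {
  if (length(x) < 3) warning("too few observations for a reliable fit")
  mean(x)
}

mfit <- memoise_with_warnings(fit, replay_warnings = TRUE)
mfit(c(1, 2))  # computes the value and warns
mfit(c(1, 2))  # cache hit; warns again because replay_warnings = TRUE
```

With the default `replay_warnings = FALSE`, the second call would return the cached value silently, so the caller takes responsibility for deciding that the wrapped function's warnings depend only on its inputs.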