future
future copied to clipboard
R options: options() are not passed down to futures - should they? (document either way)
I'm adding this as a reminder to myself to document this (next release), but also to consider more automated solutions (a future release):
R's global options()
are not passed down to futures. This has to be done manually right now, e.g.
fopts <- options()[c("digits", "repos")]
f <- future({
options(fopts)
[...]
})
This works, but it would be nice to have an automated version of this as well (cf. globals and packages) to better capture the state of the calling R process. Doing this might be a bit tricky since some options are read only / shouldn't be changed and others are specific the local machine, which may not work on a remote machine. Maybe it's possible to use static-code inspection to identify what options are used (similar to how globals are identified).
Some ideas:
Ignoring any "automatic" passing down of options, this could be done by adding an argument options
analogue to globals
, e.g.
f <- future(..., options = c("digits", "repos"))
That construct provides the developer control over options. For the end user to control options, there would need to be a similar argument that can be specified when setting the plan()
, e.g.
plan(batchtools_sge, workers = 100L, options = c("digits", "repos"))
Options specified via plan()
will always be set/passed down and additional ones specified via future()
would be done in addition to what's in plan()
.
So, future.global.maxSize
is also not passed down. That leaves me in a pickle.
I send jobs to a Torque cluster from my local. The jobs call brms, which has a future = TRUE
argument, so that each of its chains is run according to plan
. But these chains are bigger than 500Mb so this fails and I cannot change it, because I cannot alter options in the futures that brms creates.
I think all future options should always be passed down.
Good point, I agree that future.globals.maxSize
should be passed down - this is an oversight. As a workaround for now, you can override the default in the ~/.Rprofile
on the machine in the chain that complains, e.g.
options(future.globals.maxSize = 1.0 * 1024^3)
Moving the discussion of future.*
options to Issue #223 because it's a much more narrow problem than considering any/all of R's options.
FYI/@rubenarslan, in the next release (develop branch right now), we're also passing down:
-
future.globals.maxSize
-
future.globals.method
-
future.globals.onMissing
-
future.globals.onReference
-
future.globals.resolve
-
future.resolve.recursive
FYI, future 1.9.0, which passes down several more future.globals.*
options, is now on CRAN.
I don't see any difference, did I do something wrong ?
sessionInfo() R version 3.5.0 (2018-04-23) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 17134)
Matrix products: default
locale: [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C LC_TIME=French_France.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] future.apply_1.0.0 future_1.9.0
loaded via a namespace (and not attached): [1] compiler_3.5.0 parallel_3.5.0 tools_3.5.0 listenv_0.7.0 codetools_0.2-15 digest_0.6.15 globals_0.12.1
options("test"=10) FUN=function(x) return(getOption("test")) xx=future.apply::future_lapply(X=list(0),FUN=FUN) xx
[[1]] NULL
Nah, please read the above comments again. future 1.9.0 only pass down some specific future.globals.*
options. The task on passing done options automagically is still an open question.
Ok, I have read the comment but I just understand it now. And I couldn't understand the manual fix: fopts <- options()[c("digits", "repos")] f <- future({ options(fopts) [...] }). I tried f <- future({ options(fopts) },FUN=FUN,X=list(0)) and several others stuff, but I can't make it.
Here's an example:
library(future)
plan(multisession) # Use background R sessions
## My personal option
options(test = 5)
## Use only 3 digits
options(digits = 3L)
## Options that I want to pass down to the futures/workers
fopts <- options()[c("digits", "test")]
str(fopts)
# List of 2
# $ digits: int 3
# $ test : num 5
# Create a future using those "frozen" 'fopts' options
> f <- future({
+ ## Set 'fopts' options (here fopts is a global variable)
+ options(fopts)
+ ## Show that those options are set
+ cat(sprintf("test = %s\n", getOption("test")))
+ print(pi)
+ 2 * getOption("test")
+ })
> v <- value(f) ## requires future 1.9.0 to see output
test = 5
[1] 3.14
> v
[1] 10
I tried f <- future({ options(fopts) },FUN=FUN,X=list(0)) and several others stuff, but I can't make it.
Are you asking how to do the same with future.apply::future_lapply()
?
Are you asking how to do the same with future.apply::future_lapply()? YES ! I understand your example but I don't see how I could transfer it to future_lapply.
Thanks
The same way, via global fopts
, e.g.
> library(future.apply)
> plan(multiprocess)
> options(test = 5)
> options(digits = 3L)
> fopts <- options()[c("digits", "test")]
> y <- future_lapply(1:3, FUN = function(x) {
+ ## Set 'fopts' options (here fopts is a global variable)
+ options(fopts)
+ ## Show that those options are set
+ cat(sprintf("test = %s\n", getOption("test")))
+ print(pi)
+ getOption("test") * x
+ })
test = 5
[1] 3.14
test = 5
[1] 3.14
test = 5
[1] 3.14
> unlist(y)
[1] 5 10 15
>
Thanks a lot. I got another one:
library("future.apply") plan(multiprocess, workers = 12) test=function(...){ UseMethod("test") } test.numeric=function(x){ return(x) } test.character=function(x){ return(x) } FUN=function(x) { test(x) } xx=future.apply::future_lapply(FUN=FUN,X=c(2,3))
I also have another one but I can't (yet) reproduce it on small code. (It can't find roll_sd from RcppRoll).
Please create a separate issue if you're changing topics
Hello! Just stumbled upon this when my code started behaving strangely and realized after a few hours of debugging that the "memory_saving_mode" option had defaulted to TRUE because future wasn't passing options on. Maybe could be suitable with a warning about this?
I'm using future version 1.11.1.1 through Drake, calling future:plan(future::multiprocess)
. What is the recommended way to pass options to workers?
Hello Adam,
I have used a workaround. I am reloading each worker independently. More CPU/bandwith, less work :)
Best
Le mer. 30 oct. 2019 à 12:23, Adam Altmejd [email protected] a écrit :
Hello! Just stumbled upon this when my code started behaving strangely and realized after a few hours of debugging that all the "memory_saving_mode" options had defaulted to TRUE because future wasn't passing options on. Maybe could be suitable with a warning about this?
I'm using future version 1.11.1.1 through Drake https://github.com/ropensci/drake, calling future:plan(future::multiprocess). What is the recommended way to pass options to workers?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/HenrikBengtsson/future/issues/134?email_source=notifications&email_token=AEGAXTQ4PX2LR6AVQR6TKATQRFVDXA5CNFSM4DD7FPPKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECTZU4Q#issuecomment-547854962, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEGAXTUMAV3NL23YANAD45TQRFVDXANCNFSM4DD7FPPA .
... the "memory_saving_mode" option had defaulted to TRUE because future wasn't passing options on. Maybe could be suitable with a warning about this?
So, I'm objecting that a warning could be useful but it's unfortunately not feasible. How would future know what options to warn about and not? Feel free to clarify how you want it to work.
I'm using future version 1.11.1.1 ...
I highly recommend that you update - you are way behind and you're missing out on bug fixes and features. Is there a reason why you're not updating?
... What is the recommended way to pass options to workers?
You can do it via a global variable, e.g.
my_opts <- options(c("memory_saving_mode", "digits"))
f <- future({
options(my_opts)
...
})
v <- value(f)
Tack :).
You are probably right. But also package options could create problems if they are changed from their defaults. Same for base R options that are changed. One possibility is to issue a warning when the package is loaded, but that might be too extreme. I guess the best would be to pass options on to works by default and then allowing the user to shut that down in the function call.
As I was saying I'm calling future via drake so am not explicitly creating any "future" objects. Only setting plan(multiprocess) and then run drake::make(parallelism = "future")
so not sure how to add global variables like you suggested - but I guess thats a question for Drake.
Maybe the problem of environment variables could be added to this thread.
Identifying automagically which options and environment variables should be passed down to workers is hard, indeed.
I would like to add a further idea as well - not a silver bullet, but sometimes useful:
-
In
.onLoad()
, try to figure out what the default options/environment variables in the host and in the worker are, and store them in.options_at_startup
. Flag those options which a) do not exist in the worker, but exist in the host, and b) those which do differ between the worker and the host. -
When a future is created, check the actual options and environment variables, compare with
.options_at_startup
, and inject the relevant options and env variables to the future.
This is of course a rude approximation, can result in both false positives and false negatives.