future
Re-enable forked processing in RStudio Jobs and RStudio Terminal?
@graemeblair wrote in https://github.com/HenrikBengtsson/future/issues/299#issuecomment-501318832:
Hi -- late to this, but wondering if supportsMulticore() should return TRUE when run within the RStudio 1.2 "jobs" feature (and still FALSE when run interactively). As I understand it, when code is run through jobs it is separated from the GUI interactions that complicate using plan(multicore) in RStudio run interactively. This would enable the same scripts that use plan(multicore) to be used in and out of RStudio with a bit less hassle.
I took the conservative approach and disabled forked processing anywhere in RStudio by testing for environment variable RSTUDIO.
It could be that it's only unsafe when running R via the RStudio Console (the "RStudio GUI"), but that it indeed works when running R via the RStudio Terminal (Tools -> Terminal -> New Terminal) or as an RStudio Job. I know one can distinguish RStudio Console from RStudio Terminal as:
is_rstudio_console <- function() {
  (Sys.getenv("RSTUDIO") == "1") && !nzchar(Sys.getenv("RSTUDIO_TERM"))
}
is_rstudio_terminal <- function() {
  (Sys.getenv("RSTUDIO") == "1") && nzchar(Sys.getenv("RSTUDIO_TERM"))
}
Source: https://github.com/HenrikBengtsson/startup/blob/0.12.0/R/is_rstudio.R
- Could someone check how to detect whether R is running as an RStudio Job or not?
- Then we also need to reach out to the RStudio GUI folks about forked processing in the above three cases of running R from RStudio.
I just stumbled across this; since the terminal feature of RStudio is fairly new, I would recommend adding a version check to RStudio as well, perhaps along the lines of bigKRLS's:
https://github.com/rdrr1990/bigKRLS/blob/master/R/utils.R
Thanks, though I'm not sure I understand. The gist is that we want to disable multicore processing when in the RStudio Console, which we could detect using
(Sys.getenv("RSTUDIO") == "1") && !nzchar(Sys.getenv("RSTUDIO_TERM"))
Shouldn't that also work with older versions of RStudio that do not set RSTUDIO_TERM?
What I don't know is if RStudio Jobs run in yet another kind of environment and if there's a way to detect this.
Thanks for opening this @HenrikBengtsson.
The following set of flags work in R 3.6.0 on OS X with RStudio 1.2.1139:
is_rstudio_console <- function() {
  (Sys.getenv("RSTUDIO") == "1") &&
    !nzchar(Sys.getenv("RSTUDIO_TERM")) &&
    !nzchar(Sys.getenv("SHLVL"))
}
is_rstudio_terminal <- function() {
  (Sys.getenv("RSTUDIO") == "1") && nzchar(Sys.getenv("RSTUDIO_TERM"))
}
is_rstudio_job <- function() {
  (Sys.getenv("RSTUDIO") == "1") &&
    !nzchar(Sys.getenv("RSTUDIO_TERM")) &&
    nzchar(Sys.getenv("SHLVL"))
}
I'm not sure whether this holds on other operating systems or will work in general. I discovered these by inspecting Sys.getenv() in each of the three environments.
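For anyone who wants to repeat that inspection on another OS or RStudio version, here is a minimal sketch of one way to do it. The helper names snapshot_env() and compare_env() and the .rds file names are hypothetical, not part of any package; the idea is only to save Sys.getenv() from each environment and list which variables differ.

```r
# Capture the current environment variables as a plain named character vector.
snapshot_env <- function() {
  env <- Sys.getenv()
  stats::setNames(as.character(env), names(env))
}

# Compare two snapshots: variables present in only one of them,
# and variables present in both but with different values.
compare_env <- function(a, b) {
  common <- intersect(names(a), names(b))
  list(
    only_in_a = setdiff(names(a), names(b)),
    only_in_b = setdiff(names(b), names(a)),
    changed   = common[a[common] != b[common]]
  )
}

# Usage (hypothetical file names), run once in each environment:
#   saveRDS(snapshot_env(), "console.rds")   # in the RStudio Console
#   saveRDS(snapshot_env(), "job.rds")       # in an RStudio Job
# and then, from any R session:
#   compare_env(readRDS("console.rds"), readRDS("job.rds"))
```

Any variable that shows up under only_in_a or only_in_b is a candidate flag, but it should be checked with RStudio launched in different ways before relying on it.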
FYI, I've posted a follow-up question to RStudio about this at https://github.com/rstudio/rstudio/issues/2597#issuecomment-502305531.
Graeme's code seems to work nicely on my system too. (Arch Linux with R 3.6.6 and RStudio 1.2.1335.)
Testing SHLVL gives an incorrect result if RStudio is started from a shell.
interactive() can distinguish whether this is an RStudio job, since RStudio says:
"Local jobs run as non-interactive child R processes of your main R process"
I've tested that running interactive() as a local job gives FALSE.
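Combining that observation with the RSTUDIO and RSTUDIO_TERM checks from earlier in the thread gives something like the sketch below. The function name classify_rstudio() is hypothetical, and the "job" label is a heuristic: it assumes that a non-interactive R process with RSTUDIO set to "1" and no RSTUDIO_TERM is a job, which may not hold for every way RStudio can launch a child R process.

```r
# Classify the current R process from two environment variables and
# interactive(). The arguments default to the live values, but can be
# overridden so the logic can be checked outside RStudio.
classify_rstudio <- function(rstudio = Sys.getenv("RSTUDIO"),
                             rstudio_term = Sys.getenv("RSTUDIO_TERM"),
                             is_interactive = interactive()) {
  if (rstudio != "1") return("not_rstudio")
  if (nzchar(rstudio_term)) return("terminal")
  # Heuristic: jobs are reported above to be non-interactive child processes.
  if (is_interactive) "console" else "job"
}

# Example: report where this process appears to be running.
message("Detected environment: ", classify_rstudio())
```

Unlike the SHLVL test, this does not depend on whether RStudio itself was started from a shell.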
Hi @HenrikBengtsson - the RStudio issue that was opened related to this seems to have been closed more than a year ago (https://github.com/rstudio/rstudio/issues/2597) as resolved in PR https://github.com/rstudio/rstudio/pull/6492. Are there broader issues about forking preventing multicore being allowed for use again with RStudio?
Hi. Are you asking specifically about running R via the RStudio Terminal or RStudio Jobs, which is what this issue refers to, or are you asking about doing so in the RStudio Console (which is what most people mean when they say "in RStudio")?
For the RStudio Console, I don't think anything has changed, cf. https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011. If something has changed, would you mind asking the RStudio folks to confirm?
If you want to be a daredevil, see ?parallelly::supportsMulticore for how to re-enable forked processing in RStudio.
Hello Henrik et al. I stumbled onto this issue myself recently. Was there ever a clear answer to this question?
Is it only when running R from the RStudio Console that the RStudio environment affects forked processing? What about the RStudio Terminal and the more recent RStudio Jobs? Are such R processes affected at all by the RStudio environment, or can they be considered plain R processes similar to those you run in the terminal?
I looked through this and a few related threads and I can't seem to find anywhere where someone gives an answer to that, either empirically or conceptually.
For more background context, there was a situation where I recommended someone run plan(multicore) but to run the script in the terminal instead of the RStudio console. The person did this, but said it was still only using one core (based on looking at htop). Then they told me that running the same code in a non-RStudio terminal ran on all the cores. Based on this thread, and the fact that this issue is still open, I think that's the expected behavior.
What I'm trying to decide is: do I need to tell them to launch a terminal outside of RStudio, or should I just tell them to add options(future.fork.enable = TRUE) to their script and run it from the terminal or jobs window in RStudio?
Thanks in advance for any help or guidance.
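To make the second option concrete, a script could set the option only when it is not running in the RStudio Console. This is a minimal sketch, not a recommendation: the helper names are hypothetical, the console check reuses the environment-variable and interactive() heuristics from earlier in this thread, and nothing here confirms that forking is actually safe in the RStudio Terminal or in Jobs.

```r
# Heuristic from this thread: the Console has RSTUDIO == "1", no
# RSTUDIO_TERM, and is interactive. Arguments can be overridden for testing.
in_rstudio_console <- function(rstudio = Sys.getenv("RSTUDIO"),
                               rstudio_term = Sys.getenv("RSTUDIO_TERM"),
                               is_interactive = interactive()) {
  rstudio == "1" && !nzchar(rstudio_term) && is_interactive
}

# Set the option mentioned above unless we appear to be in the Console.
# Returns TRUE (invisibly) if the option was set, FALSE otherwise.
enable_fork_outside_console <- function(...) {
  if (in_rstudio_console(...)) return(invisible(FALSE))
  options(future.fork.enable = TRUE)
  invisible(TRUE)
}

# At the top of a script, before calling plan(multicore):
#   enable_fork_outside_console()
#   future::plan(future::multicore)
```

Outside RStudio the option is set as well, which should be harmless there since forked processing is not being blocked in that case anyway.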
Hello, could you please reach out to the RStudio folks and ask them in which "parts" of RStudio GUI forked parallelization is not stable? They're the ones that would be best to answer this.
If not clear, this is not specific to Futureverse, but a problem with forked parallelization in general. I just added this protection to Futureverse because it was such a common problem. I'd be happy to relax it, if it can be confirmed it's safe to do so.
Ok, thank you. I've posted something here. We'll see if we learn anything. Thanks again for all of your continued work on this!
Hello again. The community.rstudio post I made has gone 18 days without any reply, and I just noticed it says "This topic will close 21 days after the last reply" at the bottom. Does anyone know anybody who is active over there that we could ping about this?
If not, oh well. But it would be great to at least get this on Posit's radar to look into.