clustermq
run asynchronously
Thanks for this package! It's very easy to use.
I'd like to ask if it's possible to run a job asynchronously, without waiting for the results.
For example, when I run:
job <- Q(fx, x=1:3, n_jobs=1)
I get:
Submitting 1 worker jobs for 3 function calls (ID: 6642) ...
|======================================================================| 100%
Running calculations (1 calls/chunk) ...
| | 0%
|======================================================================| 100%
Master: [19.0s 0.0% CPU]; Worker average: [11.7% CPU]
While this output is being printed, I'm unable to execute any other commands; I have to wait for the submitted job to complete before I can continue working.
Is it possible to have R wait for the job in the background, so that I can keep working in the meantime?
Thank you! Always good to hear that my utility code is useful :+1:
I'm currently not planning to add this to the package, because I think asynchronous computation is a separate problem that is outside the scope of its functionality.
However, you could easily do something like:
fx = function(n) {
    Sys.sleep(n)
    n * 2
}
# run Q() in a forked background process (parallel::mcparallel is Unix-only)
p = parallel::mcparallel(clustermq::Q(fx, n=1:5, n_jobs=1))
# do your other work ...
# collect the result when you need it (this blocks until Q() has finished)
parallel::mccollect(p)[[1]]
You may additionally want to suppress the progress bar.
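For example (a hedged sketch: newer clustermq versions expose a verbose argument on Q(); whether your version does is an assumption here, so check the Q() documentation first):
p = parallel::mcparallel(
    # verbose = FALSE is assumed to turn off the submission messages and progress bar
    clustermq::Q(fx, n = 1:5, n_jobs = 1, verbose = FALSE)
)
# do your other work ...
parallel::mccollect(p)[[1]]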
Despite the discussions in #86 and https://github.com/HenrikBengtsson/future/issues/204, I am still interested in an asynchronous Q(). Yes, asynchronicity is a separate problem, and I understand the need to set clear boundaries for the package's scope. But speaking generally, I think the need for asynchronicity arises frequently enough that the major alternatives to clustermq support it:
- rslurm::slurm_apply() and rslurm::slurm_call()
- asynchronous futures
- callr::r_bg() (see the sketch after this list)
- processx::process$new()
- parallel::mcparallel()
- system2(wait = FALSE) and system(wait = FALSE)
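To make the callr option above concrete, here is a minimal sketch (assuming callr is installed and a clustermq scheduler is already configured; the function passed to r_bg() runs in a fresh R session, so everything it needs is passed in via args):
fx = function(n) {
    Sys.sleep(n)
    n * 2
}
# start Q() in a separate background R process; this call returns immediately
bg = callr::r_bg(
    function(fx) clustermq::Q(fx, n = 1:5, n_jobs = 1),
    args = list(fx = fx)
)
# ... do other work in the current session ...
bg$is_alive()    # TRUE while the background Q() call is still running
bg$wait()        # block until the background session finishes
bg$get_result()  # fetch the value returned by Q()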
I am also curious about what it would take. Do we need different socket types? How much would we accomplish if we
- set dont.wait to TRUE in worker(), and
- expose a non-blocking collector in the QSys class?
And while a potential clustermq backend for future may give us asynchronicity, I really like the API you have designed natively, both for Q() and the R6 wrapper around reusable workers.
I think this is fixed on the develop branch via #86. (But maybe it needs documentation.) Example: https://github.com/ropensci/drake/blob/master/R/clustermq.R#L32-L48.
No, these are unrelated: this issue is about running Q() in the background, whereas that one is about interfacing with the workers directly.
Note: an option would be to create a promise object that waits for results only when it is explicitly accessed; this could even support result[1:5], waiting only for the first 5 results.
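A rough sketch of that idea (a hypothetical async_Q helper, not part of the clustermq API; it builds on parallel::mcparallel, so it is Unix-only, and on first access it still collects the full result rather than individual elements):
# hypothetical helper: start Q() in the background and only block when the
# result is first accessed; later accesses reuse the cached value
async_Q = function(...) {
    job = parallel::mcparallel(clustermq::Q(...))
    collected = NULL
    function() {
        if (is.null(collected))
            collected <<- parallel::mccollect(job)[[1]]
        collected
    }
}
res = async_Q(function(n) n * 2, n = 1:5, n_jobs = 1)
# ... other work ...
res()[1:5]  # blocks here if Q() is not done yet; a real promise could wait for just these 5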