immediateCondition on delayed tasks
Describe the bug
Similar to https://github.com/futureverse/future.mirai/issues/17, I'm adapting immediateMessage as mentioned in https://github.com/futureverse/progressr/issues/176 for multisession.
It seems to work for the initial tasks, but messages from later (queued) tasks are not printed until I attempt to realize the value.
Reproduce example
I'll source all of this code first. Since I'm starting two workers, the x and y expressions start immediately, and the third is intended to be queued and will process when the first two clear.
immediateMessage <- function (..., domain = NULL, appendLF = TRUE) {
# prepend the time, perhaps sloppy but good for demonstration
msg <- .makeMessage(c(format(Sys.time()), " ", ...), domain = domain, appendLF = appendLF)
call <- sys.call()
cond <- simpleMessage(msg, call)
class(cond) <- c(class(cond), "immediateCondition")
message(cond)
}
library(future)
plan(multisession, workers=2)
x %<-% { immediateMessage("in x"); Sys.sleep(2); immediateMessage(paste("x ", Sys.getpid())); Sys.sleep(2); 3.14 }
y %<-% { immediateMessage("in y"); Sys.sleep(2); immediateMessage(paste("y ", Sys.getpid())); Sys.sleep(2); 2.71 }
z %<-% { immediateMessage("in z"); Sys.sleep(2); immediateMessage(paste("z ", Sys.getpid())); Sys.sleep(2); 1.62 }
x+y
If I source that block, then I see
library(future)
plan(multisession, workers=2)
x %<-% { immediateMessage("in x"); Sys.sleep(2); immediateMessage(paste("x ", Sys.getpid())); Sys.sleep(2); 3.14 }
y %<-% { immediateMessage("in y"); Sys.sleep(2); immediateMessage(paste("y ", Sys.getpid())); Sys.sleep(2); 2.71 }
z %<-% { immediateMessage("in z"); Sys.sleep(2); immediateMessage(paste("z ", Sys.getpid())); Sys.sleep(2); 1.62 }
# 2025-01-31 11:41:22 in x
# 2025-01-31 11:41:22 in y
# 2025-01-31 11:41:24 x 9219
# 2025-01-31 11:41:24 y 9218
x+y
# [1] 5.85
As I said in the other issue, the fact that x+y shows after the messages indicates that some messages are immediate.
Once that's done, I pause for a moment and see that the z messages do not appear. If I then manually type z in the console, I immediately see:
z; Sys.time()
# 2025-01-31 11:41:26 in z
# 2025-01-31 11:41:28 z 9219
# [1] 1.62
# [1] "2025-01-31 11:42:25 EST"
It is clear to me that z was in fact scheduled as soon as one of x or y completed, but its messages did not appear until I evaluated z. (The 11:42:25 timestamp reflects the time I spent pasting the output here and typing some comments before going back to the console; the message timestamps show that the z code itself executed on time.)
Expected behavior
I expect the z messages would appear on the console immediately after one of the x or y tasks is complete.
Session information
sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.2

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] future.mirai_0.2.2 future_1.34.0 r2_0.12.0

loaded via a namespace (and not attached):
[1] mirai_2.0.1 digest_0.6.37 codetools_0.2-20 fastmap_1.2.0 xfun_0.49 nanonext_1.5.0 knitr_1.49 parallel_4.3.3 htmltools_0.5.8.1
[10] rmarkdown_2.29 cli_3.6.3 parallelly_1.40.1 compiler_4.3.3 globals_0.16.3 tools_4.3.3 listenv_0.9.1 clipr_0.8.0 evaluate_1.0.1
[19] rlang_1.1.5
I expect the z messages would appear on the console immediately after one of the x or y tasks is complete.
The technical explanation is that you'll get nothing from the z future unless it's "queried". One way to query it is by "reading" z, e.g. print(z). That will trigger value() on this future. It's a bit easier to clarify what the other options are with a regular Future object. If you had used the equivalent:
f_z <- future({ immediateMessage("in z"); Sys.sleep(2); immediateMessage(paste("z ", Sys.getpid())); Sys.sleep(2); 1.62 })
you'll get information back from this future by calling value(f_z) or resolved(f_z).
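To illustrate the resolved(f_z) route, here is a minimal sketch that polls the future until it is done, so that each poll gives the framework a chance to relay any pending immediate conditions. It assumes the same two-worker multisession plan as in the report; the shortened immediateMessage() helper and the 0.5-second poll interval are assumptions for illustration, not a recommendation.

```r
library(future)
plan(multisession, workers = 2)

# Condensed version of the helper from the report (illustrative only)
immediateMessage <- function(...) {
  cond <- simpleMessage(paste0(format(Sys.time()), " ", ..., "\n"), sys.call())
  class(cond) <- c(class(cond), "immediateCondition")
  message(cond)
}

f_z <- future({
  immediateMessage("in z")
  Sys.sleep(2)
  immediateMessage("z done")
  1.62
})

# Each call to resolved() queries the future; keep polling until it is done.
# The 0.5 s interval is an arbitrary choice for this sketch.
while (!resolved(f_z)) Sys.sleep(0.5)

# Collect the final result once the future is resolved
v <- value(f_z)

# Shut down the background workers
plan(sequential)
```

The trade-off is that the main R session is busy polling, so this only helps when you have a natural place in your code to wait.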
Now, another scenario where this future might be queried is when you create additional futures but there are no free ('multisession') workers available. Then the future framework will start checking with all known futures if they are resolved (by calling resolved(f) on each of them). If one resolved future is found, then its results are collected to free up the parallel worker. We will get immediate conditions from all futures that are queried this way, but there is no way to control which are queried. If you're unlucky, your long-running future will never be queried this way, and therefore we won't get any immediate notifications from it either.
The only way I can imagine to get what you want is to make sure resolved(f_z) is called whenever another future is queried. Doing that might be expensive, especially if there are a lot of futures. Also, if a future is ready, then resolved() will pull home the full result for that future from the worker. That will happen at some point anyway, but it might be confusing if calling value(f_x) takes a very long time just because resolved(f_z) gets called as well.
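As a user-side approximation of that idea (not something the future package does internally), one could keep the Future objects in a list and query all of them whenever any is polled. In the sketch below, relay_all() is a hypothetical helper and the task bodies are placeholders; it simply calls resolved() on every future, which carries the cost described above when there are many futures or large results.

```r
library(future)
plan(multisession, workers = 2)

# Hypothetical helper, not part of the future package:
# query every future so that none is left unpolled.
relay_all <- function(futures) {
  vapply(futures, function(f) resolved(f), logical(1))
}

# Three tasks on two workers; creating the third blocks until a worker is free
fs <- list(
  x = future({ Sys.sleep(1); 3.14 }),
  y = future({ Sys.sleep(1); 2.71 }),
  z = future({ Sys.sleep(1); 1.62 })
)

# Poll all futures together until every one is resolved
while (!all(relay_all(fs))) Sys.sleep(0.5)

# Collect the values once everything is done
total <- sum(vapply(fs, value, numeric(1)))

# Shut down the background workers
plan(sequential)
```

This keeps the polling explicit and under the caller's control, at the price of one resolved() call per future per poll.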
I agree that it would be neat to receive immediate conditions even sooner here, but I'll have to think more about this one before implementing any such changes.
Thanks, that adds a lot of context, and I understand the cost of effectively "polling" for messages.