future.batchtools

Setting .future as the default registry

Open yonicd opened this issue 7 years ago • 7 comments

Is there a wrapper in future.batchtools that calls setDefaultRegistry() on the .future subdirectory? This would open up the possibility of using getStatus().

The analogues in batchtools are:

batchtools::setDefaultRegistry(tmp)
batchtools::getStatus()

yonicd • Jun 29 '18 18:06

I got this far:

> my_sge <- future::tweak(future.batchtools::batchtools_sge, template = 'batchtools.sge-new.tmpl')
> future::plan(list(multiprocess, my_sge))
> Y1 %<-% future_lapply(rep(300, 20),
+                       FUN = function(nr){solve( matrix(rnorm(nr^2), nrow=nr, ncol=nr))},future.scheduling = 5)
> x <- list.files('.future',full.names = TRUE,recursive = TRUE,pattern = 'registry')
> class(readRDS(x[1]))
[1] "Registry"
> batchtools::getStatus(reg = readRDS(x[1]))
Error in reg$writeable && !identical(reg$mtime, file_mtime(fs::path(reg$file.dir,  : 
  invalid 'x' type in 'x && y'

yonicd • Jun 29 '18 18:06

All such .future/ batchtools folders get wiped as soon as the results from the future have been collected (unless there's an error, in which case the folder is left in place to simplify troubleshooting). There is an unofficial, undocumented option you can set to prevent this cleanup: options(future.delete = FALSE). However, treat it as a prototype feature that may go away in the future (although it's been there from the start).
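
A rough sketch of how the kept folders might then be inspected, assuming the future.delete option behaves as described above in your installed version. readRDS() returns the registry as it was saved, which appears to lack fields such as reg$writeable that batchtools fills in when it loads a registry, so the sketch uses batchtools::loadRegistry() on the registry directory instead. Whether a registry created by future.batchtools loads cleanly this way is an assumption, not something confirmed in this thread.

```r
# Sketch only: assumes 'future', 'future.batchtools' and 'batchtools' are
# installed, and that the unofficial option 'future.delete' is still honored.
library(future)
options(future.delete = FALSE)  # keep the .future/ folders after collection
plan(future.batchtools::batchtools_local)

f <- future({ Sys.sleep(1); 42 })
v <- value(f)

# Locate saved registries (batchtools stores them as 'registry.rds')
reg_files <- list.files(".future", pattern = "^registry\\.rds$",
                        recursive = TRUE, full.names = TRUE)

if (length(reg_files) > 0) {
  # loadRegistry() takes the registry directory, not the .rds file.
  # make.default = TRUE also makes it the default registry for this session.
  reg <- batchtools::loadRegistry(dirname(reg_files[1]),
                                  writeable = FALSE, make.default = TRUE)
  print(batchtools::getStatus(reg = reg))
}
```

Because make.default = TRUE sets the default registry, a later plain batchtools::getStatus() call should refer to the same registry, which is roughly what the original question asks for.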

HenrikBengtsson • Jun 29 '18 19:06

Related to your https://github.com/HenrikBengtsson/future.apply/issues/1#issuecomment-401422632 question:

If you're only interested in the batchtools output (standard output and standard error, depending on your job template settings), in the most recent version, that's actually brought into the future objects together with the value. Again, this is not official and will change, but I added it in preparation for / prototyping https://github.com/HenrikBengtsson/future/issues/232:

> library(future)
> plan(future.batchtools::batchtools_local)

> f <- future({ cat("hello world\n"); 42 })
> value(f)
[1] 42

> result(f)$stdout
 [1] "### [bt]: This is batchtools v0.9.10.9000"                                       
 [2] "### [bt]: Starting calculation of 1 jobs"                                        
 [3] "### [bt]: Setting working directory to '/home/hb/repositories/future.batchtools'"
 [4] "### [bt]: Memory measurement disabled"                                           
 [5] "### [bt]: Starting job [batchtools job.id=1]"                                    
 [6] "### [bt]: Setting seed to 1 ..."                                                 
 [7] "hello world"                                                                     
 [8] ""                                                                                
 [9] "### [bt]: Job terminated successfully [batchtools job.id=1]"                     
[10] "### [bt]: Calculation finished!"  

As you can see, there's more output than just what your code produced, so this will have to change, especially since everything should work the same regardless of which backend you use.

HenrikBengtsson • Jun 29 '18 19:06

Forgot to say: when using future_*apply() you won't have access to the Future objects, so you cannot access the captured output this way. When HenrikBengtsson/future#232 is implemented, you'll be able to treat/get standard output just as you do when you use *apply().
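
Until then, one possible workaround is to create the futures explicitly so that the Future objects stay available. This is a minimal sketch, not an official pattern: it uses plan(sequential) only to stay self-contained, and the stdout field of the result is the same prototype feature mentioned above, so its content may differ between versions and backends.

```r
# Sketch only: assumes the 'future' package is installed.
library(future)
plan(sequential)  # any backend could be used; sequential keeps this self-contained

# Create the futures one by one instead of via future_lapply(),
# so the Future objects themselves are kept.
fs <- lapply(1:3, function(i) future({ cat("hello from", i, "\n"); i^2 }))

# Collect the values as usual
vs <- lapply(fs, value)

# Captured output per future (prototype field; format is not guaranteed)
out <- lapply(fs, function(f) result(f)$stdout)
```

The trade-off is that this gives up the chunking and load balancing that future_lapply() performs (for example via future.scheduling), so it is mainly useful for troubleshooting.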

HenrikBengtsson • Jun 29 '18 19:06

Got it. I am trying to connect my package to future; it creates tidy outputs for SGE. It currently piggybacks on another scheduling package, qapply, but mostly polls the SGE XML.

yonicd • Jun 29 '18 19:06

Nice. So, are you looking into making that connection via batchtools, or via a standalone future.qibble backend?

HenrikBengtsson • Jun 30 '18 00:06

I’d rather do it on top of future, and generalize beyond SGE.

yonicd • Jun 30 '18 02:06