Worker API does not properly document requirements for `common_data`

Open wlandau opened this issue 4 years ago • 3 comments

I am trying to reconstruct an iteration of the worker API event loop outside targets to diagnose a different issue, and I am running into trouble. What am I doing wrong at "WORKER_ERROR: wrong field names for DO_SETUP"?

options(clustermq.scheduler = "multiprocess")
library(clustermq)
envir <- new.env(parent = emptyenv())
envir$global <- "value"
w <- workers(n_jobs = 1)
w$set_common_data(export = list(global = 123))
#> [1] "aiqzy"
x <- w$receive_data()
w$send_common_data()
x <- w$receive_data()
#> Error in w$receive_data(): 
#> WORKER_ERROR: wrong field names for DO_SETUP:
w$send_call(global)
x <- w$receive_data()
x$result
#> [1] "Error in eval(msg$expr, envir = msg$env) : object 'global' not found\n"
#> attr(,"class")
#> [1] "try-error"
#> attr(,"condition")
#> <simpleError in eval(msg$expr, envir = msg$env): object 'global' not found>

Created on 2021-03-24 by the reprex package (v1.0.0)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.3 (2020-10-10)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2021-03-24                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version  date       lib source        
#>  assertthat    0.2.1    2019-03-21 [1] CRAN (R 4.0.0)
#>  backports     1.2.1    2020-12-09 [1] CRAN (R 4.0.2)
#>  callr         3.5.1    2020-10-13 [1] CRAN (R 4.0.2)
#>  cli           2.3.1    2021-02-23 [1] CRAN (R 4.0.2)
#>  clustermq   * 0.8.95.1 2020-07-13 [1] CRAN (R 4.0.2)
#>  codetools     0.2-18   2020-11-04 [1] CRAN (R 4.0.2)
#>  crayon        1.4.1    2021-02-08 [1] CRAN (R 4.0.2)
#>  debugme       1.1.0    2017-10-22 [1] CRAN (R 4.0.2)
#>  digest        0.6.27   2020-10-24 [1] CRAN (R 4.0.2)
#>  ellipsis      0.3.1    2020-05-15 [1] CRAN (R 4.0.0)
#>  evaluate      0.14     2019-05-28 [1] CRAN (R 4.0.0)
#>  fansi         0.4.2    2021-01-15 [1] CRAN (R 4.0.2)
#>  fs            1.5.0    2020-07-31 [1] CRAN (R 4.0.2)
#>  glue          1.4.2    2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.8      2019-03-20 [1] CRAN (R 4.0.0)
#>  htmltools     0.5.1.1  2021-01-22 [1] CRAN (R 4.0.2)
#>  knitr         1.31     2021-01-27 [1] CRAN (R 4.0.2)
#>  lifecycle     1.0.0    2021-02-15 [1] CRAN (R 4.0.3)
#>  magrittr      2.0.1    2020-11-17 [1] CRAN (R 4.0.2)
#>  pillar        1.5.1    2021-03-05 [1] CRAN (R 4.0.2)
#>  pkgconfig     2.0.3    2019-09-22 [1] CRAN (R 4.0.0)
#>  processx      3.4.5    2020-11-30 [1] CRAN (R 4.0.2)
#>  ps            1.6.0    2021-02-28 [1] CRAN (R 4.0.3)
#>  purrr         0.3.4    2020-04-17 [1] CRAN (R 4.0.0)
#>  R6            2.5.0    2020-10-28 [1] CRAN (R 4.0.2)
#>  Rcpp          1.0.6    2021-01-15 [1] CRAN (R 4.0.2)
#>  reprex        1.0.0    2021-01-27 [1] CRAN (R 4.0.2)
#>  rlang         0.4.10   2020-12-30 [1] CRAN (R 4.0.2)
#>  rmarkdown     2.7      2021-02-19 [1] CRAN (R 4.0.3)
#>  sessioninfo   1.1.1    2018-11-05 [1] CRAN (R 4.0.0)
#>  stringi       1.5.3    2020-09-09 [1] CRAN (R 4.0.2)
#>  stringr       1.4.0    2019-02-10 [1] CRAN (R 4.0.0)
#>  styler        1.3.2    2020-02-23 [1] CRAN (R 4.0.2)
#>  tibble        3.1.0    2021-02-25 [1] CRAN (R 4.0.3)
#>  utf8          1.2.1    2021-03-12 [1] CRAN (R 4.0.2)
#>  vctrs         0.3.6    2020-12-17 [1] CRAN (R 4.0.2)
#>  withr         2.4.1    2021-01-26 [1] CRAN (R 4.0.2)
#>  xfun          0.21     2021-02-10 [1] CRAN (R 4.0.2)
#>  yaml          2.2.1    2020-02-01 [1] CRAN (R 4.0.0)
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

wlandau, Mar 24 '21 20:03

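The common data expects the following fields: `id`, `fun`, `const`, `export`, `pkgs`, `rettype`, `common_seed`, `token` (also see here)

The worker checks that these fields are present and reports `WORKER_ERROR: wrong field names for DO_SETUP` if any are missing.

`w$set_common_data()` sets some of these implicitly (the working call below passes neither `id` nor `token`), but not most.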

So you need to provide these arguments. The following will work:

options(clustermq.scheduler = "multiprocess")
library(clustermq)
envir <- new.env(parent = emptyenv())
envir$global <- "value"
w <- workers(n_jobs = 1)
w$set_common_data(fun=identity, const=list(), pkgs=c(), common_seed=123, rettype="list",
                  export = list(global = 123)) # changed
x <- w$receive_data()
w$send_common_data()
x <- w$receive_data()

However, the error message does not say which fields are missing, so this is a (minor) API (and documentation) bug.
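Until the error message is more specific, a small pre-flight check can show which arguments are missing before anything is sent to the worker. The sketch below is only an illustration: `missing_common_fields()` is a hypothetical helper, not part of clustermq, and its list of fields is an assumption taken from the working call above, which passes `fun`, `const`, `export`, `pkgs`, `rettype` and `common_seed` and leaves out `id` and `token`.

```r
# Fields passed explicitly in the working call above (assumption: id and
# token are filled in implicitly, since that call omits them).
required_fields <- c("fun", "const", "export", "pkgs", "rettype", "common_seed")

# Hypothetical helper, not part of clustermq: returns the names of the
# required fields that were not supplied as named arguments.
missing_common_fields <- function(...) {
  setdiff(required_fields, names(list(...)))
}

# The original call from the issue only supplied `export`:
missing_common_fields(export = list(global = 123))
#> [1] "fun"         "const"       "pkgs"        "rettype"     "common_seed"

# The corrected call supplies everything:
missing_common_fields(fun = identity, const = list(), pkgs = c(),
                      common_seed = 123, rettype = "list",
                      export = list(global = 123))
#> character(0)
```

The same arguments can then be passed on to `w$set_common_data()` once the helper returns an empty vector.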

mschubert, Mar 25 '21 15:03

Thanks, including those extra arguments does work.

I noticed it also worked without including `id`. Will `id` be required at some point? If so, I will update `targets`.

wlandau, Mar 25 '21 17:03

That was a mistake: `id` should not be provided. Fixed above, thanks for pointing that out.

mschubert, Mar 25 '21 17:03

This is no longer relevant after the v0.9 rewrite, because any objects can be added to the worker environment (and none are required).

mschubert, Mar 26 '23 21:03