
REPRODUCIBILITY: .Random.seed is updated when 'later' is loaded

Open HenrikBengtsson opened this issue 1 year ago • 1 comments

Issue

Loading later causes the RNG state to be updated, e.g.

$ R --quiet --vanilla
> str(globalenv()$.Random.seed)
 NULL
> loadNamespace("later")
<environment: namespace:later>
> str(globalenv()$.Random.seed)
  int [1:626] 10403 624 607059698 1535537611 ...
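
The before/after check in the transcript above can be wrapped in a small helper, which makes it easier to test other packages for the same side effect. This is only a sketch: `rng_state_changed()` is a hypothetical name and not part of later or any other package.

```r
# Hypothetical helper: reports whether evaluating `expr` changes the
# global RNG state (including going from "no seed" to "seed set").
rng_state_changed <- function(expr) {
  genv <- globalenv()
  before <- genv$.Random.seed  # NULL when no seed has been set yet
  force(expr)                  # evaluate the expression being tested
  after <- genv$.Random.seed
  !identical(before, after)
}
```

In a fresh session, `rng_state_changed(loadNamespace("later"))` would be expected to return `TRUE` given the behavior shown above, while an expression that does not touch the RNG should give `FALSE`.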

Why is this a problem? It makes it nearly impossible to get numerically reproducible results in parallel processing with persistent workers (e.g. parallel::makeCluster()), because the results depend on whether a package dependency has already been loaded. This affects all packages that import later directly or indirectly.

The only workaround is to (i) know up front which packages might be loaded, and (ii) pre-load them all on the parallel workers before performing the actual tasks. In practice, that's not feasible.

Suggestion

I don't know what the RNG is used for during .onLoad(), but could one solution be to draw the random number in stealth mode, i.e. make sure to restore .Random.seed afterward?
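
A minimal sketch of what such a "stealth mode" could look like at the R level, assuming the RNG use happens while an R expression is being evaluated. `with_preserved_seed()` is a hypothetical helper, not an existing function in later; it records `.Random.seed` in the global environment, evaluates the expression, and then restores the previous state (or removes the seed again if none existed before).

```r
# Hypothetical helper: evaluate `expr` without leaving any trace in the
# global RNG state.
with_preserved_seed <- function(expr) {
  genv <- globalenv()
  had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
  if (had_seed) {
    old_seed <- get(".Random.seed", envir = genv, inherits = FALSE)
  }
  on.exit({
    if (had_seed) {
      # Put back the RNG state that was there before
      assign(".Random.seed", old_seed, envir = genv)
    } else if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
      # There was no seed before, so remove the one that was created
      rm(list = ".Random.seed", envir = genv)
    }
  }, add = TRUE)
  force(expr)
}
```

For example, `with_preserved_seed(runif(1))` draws a random number but should leave `globalenv()$.Random.seed` exactly as it was, including `NULL` in a fresh session.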

Details

It's not one of the package dependencies that updates the RNG state:

$ R --quiet --vanilla
> str(globalenv()$.Random.seed)
 NULL
> loadNamespace("Rcpp")
> str(globalenv()$.Random.seed)
 NULL
> loadNamespace("rlang")
<environment: namespace:rlang>
> str(globalenv()$.Random.seed)
 NULL
> loadNamespace("later")
<environment: namespace:later>
> str(globalenv()$.Random.seed)
  int [1:626] 10403 624 1853159584 1558919201 ...
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /home/hb/shared/software/CBI/R-4.2.1-gcc9/lib/R/lib/libRblas.so
LAPACK: /home/hb/shared/software/CBI/R-4.2.1-gcc9/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.2.1 cli_3.4.1      later_1.3.0    Rcpp_1.0.9     rlang_1.0.6

HenrikBengtsson, Oct 25 '22 20:10

My guess is that this is caused by the use of Rcpp::RNGScope in the auto-generated file RcppExports.cpp: https://github.com/r-lib/later/blob/ba70887d77527e5647e375013e2e1ad9dc2c3646/src/RcppExports.cpp

There's one more use of RNGScope in later.cpp: https://github.com/r-lib/later/blob/ba70887d77527e5647e375013e2e1ad9dc2c3646/src/later.cpp#L194

I believe those are the only places where later interacts with R's random number generator.

Here's a simple example with just Rcpp:

str(globalenv()$.Random.seed)
#>  NULL

Rcpp::cppFunction('int go() { return 1; }')
str(globalenv()$.Random.seed)
#>  NULL

go()
#> [1] 1
str(globalenv()$.Random.seed)
#>  int [1:626] 10403 624 -1688476694 -149789597 758540872 1648411561 -1149942954 1420315103 -1604158828 -374756219 ...

It looks like running the Rcpp-wrapped C++ function is what causes the random seed to be set. The later package runs some of its C++ functions on load, and that's probably what's causing the seed to be set.

Note that if the random seed is already set (to a non-NULL value) before loading later, then it does not alter the seed:

rnorm(1)
#> [1] -0.484127
str(globalenv()$.Random.seed)
#>  int [1:626] 10403 2 -1539884961 -1138419659 1420516300 1450512162 -1641721475 806492791 -2033368162 541053528 ...

Rcpp::cppFunction('int go() { return 1; }')
str(globalenv()$.Random.seed)
#>  int [1:626] 10403 2 -1539884961 -1138419659 1420516300 1450512162 -1641721475 806492791 -2033368162 541053528 ...

go()
#> [1] 1
str(globalenv()$.Random.seed)
#>  int [1:626] 10403 2 -1539884961 -1138419659 1420516300 1450512162 -1641721475 806492791 -2033368162 541053528 ...

wch, Oct 25 '22 20:10