tmbstan

Parallel chains - worker nodes can't find tmbstan library when non-standard .libPaths are in use

Open dfifield opened this issue 9 months ago • 2 comments

Hi kaskr,

Thanks for the great tmbstan package!

I've run into an issue when running chains on multiple cores: the worker nodes can't find the tmbstan library (or any of its dependencies). This happens when a user doesn't keep their R libraries in the standard location (e.g., .../R/R-4.4.2/library), but instead uses a custom library folder set up via .libPaths().

In this case, when each worker node Rscript.exe tries to execute the library(tmbstan) command from the tmpfile that you create at line 75 in R/tmbstan.R, it will fail to find the library and the worker Rscript.exe will exit. This will cause parallel::makeCluster() to hang when it tries to create the cluster.
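One way to check whether a given setup is affected is to compare the library paths of the current session with those seen by a freshly started Rscript process. The sketch below is only a diagnostic under stated assumptions: it starts Rscript with no extra arguments, which may differ from how the workers are actually launched, and the variable names are illustrative.

```r
# Path to the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Ask a fresh R process which library paths it sees at startup.
child_paths <- system2(
  rscript,
  args = c("-e", shQuote("writeLines(.libPaths())")),
  stdout = TRUE
)

# Normalize both sides so that purely cosmetic differences do not count.
norm <- function(p) normalizePath(p, winslash = "/", mustWork = FALSE)

# Library paths known to this session but not to the fresh process.
missing_in_child <- setdiff(norm(.libPaths()), norm(child_paths))
print(missing_in_child)
```

If `missing_in_child` is non-empty and tmbstan is installed in one of those folders, a worker started the same way would presumably fail at `library(tmbstan)`.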

In order to fix this, the code written to tmpfile needs to take the current .libPaths() into account.

Suggest changing R/tmbstan.R, line 76 from:

cat("library(tmbstan)\n", file=tmpfile)

to:

cat(".libPaths(c(", toString(dQuote(.libPaths(), q = FALSE)), "))\n", file = tmpfile)
cat("library(tmbstan)\n", file = tmpfile, append = TRUE)
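The same idea can be tried out in isolation, without touching the package source. The helper below, `make_worker_startup()`, is a hypothetical name used for illustration and is not part of tmbstan; it builds the two startup lines and the snippet then checks that the generated file parses as valid R. It assumes the library paths contain no double quotes or backslashes, since `dQuote()` does not escape them.

```r
# Hypothetical helper (not part of tmbstan): build the lines a worker
# should run first so that it sees the same library paths as the parent.
make_worker_startup <- function(lib_paths = .libPaths(), pkg = "tmbstan") {
  # q = FALSE gives plain ASCII double quotes instead of "fancy" quotes.
  quoted <- toString(dQuote(lib_paths, q = FALSE))
  c(
    sprintf(".libPaths(c(%s))", quoted),
    sprintf("library(%s)", pkg)
  )
}

# Write the startup lines to a temporary script.
tmpfile <- tempfile(fileext = ".R")
writeLines(make_worker_startup(), tmpfile)

# Parsing only checks the syntax; nothing is loaded or executed here.
exprs <- parse(file = tmpfile)
print(readLines(tmpfile))
```

Parsing rather than sourcing the file keeps the check free of side effects: it confirms the quoting is well-formed without changing the session's library paths.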

With this change in place, when I run the following test:

library(TMB)
library(tmbstan)
runExample("simple")
cores <- 4
options(mc.cores = cores)
init.fn <- function()
  list(u=rnorm(114), beta=rnorm(2), logsdu=runif(1,0,10), logsd0=runif(1,0,1))
fit <- tmbstan(obj, chains=cores, open_progress=FALSE, init=init.fn)

The tmpfile will contain:

.libPaths(c( "C:/Users/bakerk/Documents/R/Rlibs", "C:/Users/bakerk/AppData/Local/Programs/R/R-4.4.2/library" ))
library(tmbstan)
dyn.load('C:/Users/bakerk/Documents/R/Rlibs/TMB/examples/x64/simple.dll')

and the worker Rscripts can now find the tmbstan library (and all its dependencies).

Thanks! Dave

dfifield avatar Feb 01 '25 22:02 dfifield

@dfifield Now changed - please confirm that it works. Thanks!

kaskr avatar Feb 03 '25 09:02 kaskr

Thanks so much!

I'm travelling this week, but will test when I get back.

Best, Dave

dfifield avatar Feb 03 '25 13:02 dfifield