Error compiling a simple model with read-only access to CmdStan

Open wlandau opened this issue 2 years ago • 12 comments

Summary:

I cannot compile Stan models if I only have read permissions to CmdStan.

Description:

I work in a large, highly regulated company with a centrally maintained computing environment. Users are discouraged from installing their own software because of validation requirements. Our devops team installed a copy of CmdStan 2.32.2, and I cannot compile models with it because I only have read permissions.

Reproducible Steps:

With an instance of CmdStan installed by a different user, a simple make ~/bernoulli fails with the output below, where bernoulli.stan comes from the example files in CmdStan.

Current Output:

The full output is too long to include in this thread, but all the relevant output is at the beginning:

hpc /CENSORED_DIRECTORY/apps/cmdstan/cmdstan-2.32.2 make /home/CENSORED_USER/bernoulli
g++ -std=c++1y -pthread -D_REENTRANT -Wno-sign-compare -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include    -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.78.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials    -DBOOST_DISABLE_ASSERTS          -c -MT stan/src/stan/model/model_header.hpp.gch -MT stan/src/stan/model/model_header.d -MM -E -MG -MP -MF stan/src/stan/model/model_header.d stan/src/stan/model/model_header.hpp
<built-in>:0:0: fatal error: opening dependency file stan/src/stan/model/model_header.d: Permission denied
compilation terminated.
g++ -std=c++1y -pthread -D_REENTRANT -Wno-sign-compare -Wno-ignored-attributes      -I stan/lib/stan_math/lib/tbb_2020.3/include    -O3 -I src -I stan/src -I stan/lib/rapidjson_1.1.0/ -I lib/CLI11-1.9.1/ -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.4.0 -I stan/lib/stan_math/lib/boost_1.78.0 -I stan/lib/stan_math/lib/sundials_6.1.1/include -I stan/lib/stan_math/lib/sundials_6.1.1/src/sundials    -DBOOST_DISABLE_ASSERTS          -c -MT src/cmdstan/main.o -MM -E -MG -MP -MF src/cmdstan/main.d src/cmdstan/main.cpp
<built-in>:0:0: fatal error: opening dependency file src/cmdstan/main.d: Permission denied
compilation terminated.
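
For anyone triaging the same failure, a minimal shell sketch like the following checks whether the current user can write to the two in-tree directories named in the errors above. The function name `check_cmdstan_writable` is hypothetical and not part of CmdStan, and the directory list is taken only from this output, so other locations may need write access as well.

```shell
# Hypothetical helper (not part of CmdStan): report whether the current user
# can write to the in-tree directories where make tried to create .d files.
check_cmdstan_writable() {
  cmdstan_dir=$1
  rc=0
  for sub in stan/src/stan/model src/cmdstan; do
    if [ -w "$cmdstan_dir/$sub" ]; then
      echo "writable: $sub"
    else
      # Also reached when the directory does not exist.
      echo "not writable: $sub"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (the path is a placeholder):
#   check_cmdstan_writable /path/to/cmdstan-2.32.2
```

A non-zero return means at least one of the directories is missing or not writable, which matches the "Permission denied" situation described here.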

Expected Output:

I expected the model to compile successfully.

Current Version:

v2.32.2

wlandau avatar Jul 21 '23 18:07 wlandau

This makes me wonder about thread safety: is it safe to compile model files concurrently using the same instance of CmdStan, or would there be a race condition trying to modify the same files?

wlandau avatar Jul 21 '23 18:07 wlandau

I believe the only time CmdStan would write files is the first time a compilation is performed (assuming none of the source files are changed).

WardBrian avatar Jul 21 '23 18:07 WardBrian

In that case, would you recommend I ask my devops colleagues to run 'make examples/bernoulli/bernoulli' as part of the installation script?

wlandau avatar Jul 21 '23 18:07 wlandau

That might help, but I'm not 100% sure. Our Makefiles definitely aren't set up with read-only installations in mind.

WardBrian avatar Jul 21 '23 18:07 WardBrian

An update: I tried the same test on a copy of CmdStan which had already compiled a model, and it failed with the same error.

wlandau avatar Aug 14 '23 17:08 wlandau

I have been working with my devops colleagues internally to test a centralized install of CmdStan via instantiate. We found that a basic test package that depends on instantiate/CmdStan recompiled the sundials libraries. That makes me wonder if it is safe to compile two models in two different processes simultaneously on the same copy of CmdStan, or if there would be a race condition in some cases if two processes are trying to write to the same files at the same time.
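
The thread does not settle whether two simultaneous builds against the same copy are safe. As a cautious stopgap, and assuming all processes can see a common writable directory, builds could be serialized with a simple lock. The sketch below uses a lock directory because `mkdir` either creates it or fails; `with_lock` and the lock path are hypothetical, and a stale lock left by a killed process would have to be removed by hand.

```shell
# Hypothetical wrapper: run a command while holding a lock directory, so that
# only one process at a time runs the wrapped command.
with_lock() {
  lock_dir=$1
  shift
  # mkdir fails if the directory already exists, so only one process proceeds.
  until mkdir "$lock_dir" 2>/dev/null; do
    sleep 1
  done
  rc=0
  "$@" || rc=$?
  # Release the lock and report the wrapped command's exit status.
  rmdir "$lock_dir"
  return "$rc"
}

# Usage (paths are placeholders):
#   with_lock /tmp/cmdstan-build.lock make -C /path/to/cmdstan "$HOME/bernoulli"
```

This only helps when every process that builds against the shared copy goes through the same wrapper and lock path.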

wlandau avatar Aug 18 '23 12:08 wlandau

> I have been working with my devops colleagues internally to test a centralized install of CmdStan via instantiate. We found that a basic test package that depends on instantiate/CmdStan recompiled the sundials libraries.

My colleague (@jekriske-lilly) pointed out these lines which modify sundials:

https://github.com/stan-dev/math/blob/develop/make/libraries#L66-L80

wlandau avatar Aug 18 '23 12:08 wlandau

Depending on what you mean by centralized, this is likely a use case that the current CmdStan build process supports poorly. Because the build is entirely in-tree, it is difficult to make it work in a shared computing environment where, e.g., different nodes have different architectures. When using CmdStan in a cluster environment, I generally keep a local checkout that I clean and rebuild on the worker node.

Something like CMake could be one solution to this (not without its own problems and headaches), but given the current way CmdStan builds I would consider the cmdstan source to be closer to "source code for the program I am building" than "a library like MKL I can use broadly".
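
One way to read the "local checkout" suggestion is to copy the shared, read-only tree into a directory the user owns and build there. The sketch below assumes that interpretation. `private_cmdstan_copy` and all paths are hypothetical, while `clean-all` and `build` are targets in CmdStan's makefile.

```shell
# Hypothetical helper: copy a shared CmdStan tree into a user-owned directory
# and make the copy writable, so the in-tree build can create its files.
private_cmdstan_copy() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # Copy the contents of src (including dotfiles) into dest.
  cp -R "$src"/. "$dest"/ || return 1
  # Files copied from a read-only install may keep read-only mode bits.
  chmod -R u+w "$dest" || return 1
  printf '%s\n' "$dest"
}

# Usage (paths are placeholders):
#   mine=$(private_cmdstan_copy /path/to/shared/cmdstan-2.32.2 "$HOME/cmdstan-2.32.2")
#   make -C "$mine" clean-all        # discard objects built for another machine
#   make -C "$mine" build            # rebuild on this node
#   make -C "$mine" "$HOME/bernoulli"
```

The copy costs disk space and a rebuild per user or per node, which is the trade-off of treating the tree as "source code for the program I am building".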

WardBrian avatar Aug 18 '23 13:08 WardBrian

> given the current way CmdStan builds I would consider the cmdstan source to be closer to "source code for the program I am building" than "a library like MKL I can use broadly".

On this idea, I tried to see if I could compile a Stan model with a temporary copy of CmdStan (see below). It looks like CmdStan has both dynamic/shared libraries and the source code for the program currently compiling. If it had only the former or only the latter, then compilation on shared systems would be much easier. In the former case, I could have instantiate install a temporary copy of CmdStan inside tempdir() which gets deleted when the current R session closes.

temp <- tempfile()
fs::dir_create(temp)
cmdstanr::install_cmdstan(dir = temp)
cmdstanr::cmdstan_path()
#> [1] "/var/folders/4v/vh7xp8553lsbl49svl48g7p00000gp/T/RtmpGaahT0/file115b469395fee/cmdstan-2.32.2"
upstream <- file.path(
  cmdstanr::cmdstan_path(),
  "examples",
  "bernoulli",
  "bernoulli.stan"
)
file.copy(upstream, "model.stan")
cmdstanr::cmdstan_model("model.stan")
unlink(cmdstanr::cmdstan_path(), TRUE)
rstudioapi::restartSession()

cmdstanr::cmdstan_path()
#> [1] "/Users/CENSORED/.cmdstan/cmdstan-2.31.0"
model <- cmdstanr::cmdstan_model(exe_file = "model", compile = FALSE)
model$sample(data = list(y = c(0, 0, 1, 0, 1)))
#> Chain 1 dyld[80398]: Library not loaded: @rpath/libtbb.dylib
#> Chain 1   Referenced from: <BB67563E-5FD5-32DA-8442-CECE9216941D> /Users/CENSORED/Desktop/model
#> Chain 1   Reason: tried: '/private/var/folders/4v/vh7xp8553lsbl49svl48g7p00000gp/T/RtmpGaahT0/file115b469395fee/cmdstan-2.32.2/stan/lib/stan_math/lib/tbb/libtbb.dylib' (no such file)
#> ...

wlandau avatar Aug 28 '23 17:08 wlandau

If your shared environment has TBB installed, you can ask CmdStan to use it instead by putting something like this in make/local (this is probably under-documented):

TBB_CXX_TYPE=gcc
TBB_INTERFACE_NEW=true # if using TBB newer than 2020
TBB_INC=$(TBBROOT)/include/
TBB_LIB=$(TBBROOT)/lib/

My work's shared computing environment uses the module system, so the above is what I use after module load gcc/11.2.0 intel-oneapi-tbb, which sets $TBBROOT.

We also do this when packaging CmdStan for Conda.
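
For an installation script, the same settings could be appended to make/local with a small shell function. This is only a sketch: `write_tbb_make_local` is a hypothetical name, it must be run by someone with write access to the CmdStan directory, and it assumes `TBBROOT` is set in the environment when `make` runs later (for example by the module system).

```shell
# Hypothetical helper: append the TBB settings quoted above to make/local.
write_tbb_make_local() {
  cmdstan_dir=$1
  mkdir -p "$cmdstan_dir/make" || return 1
  # The quoted 'EOF' keeps $(TBBROOT) literal so that make expands it later.
  cat >> "$cmdstan_dir/make/local" <<'EOF'
# Use a system-provided TBB instead of the copy bundled with CmdStan.
TBB_CXX_TYPE=gcc
# Only needed when the system TBB is newer than 2020.
TBB_INTERFACE_NEW=true
TBB_INC=$(TBBROOT)/include/
TBB_LIB=$(TBBROOT)/lib/
EOF
}

# Usage (the path is a placeholder):
#   write_tbb_make_local /path/to/cmdstan-2.32.2
```

The comments are written on their own lines rather than after the values, so no trailing whitespace ends up in the variables.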

WardBrian avatar Aug 28 '23 18:08 WardBrian

Interesting. Is TBB the only shared library I would need to handle this way?

wlandau avatar Aug 28 '23 18:08 wlandau

Yes, SUNDIALS is statically linked by default in CmdStan. You could do something similar if you wanted to also use a vendor-provided version of SUNDIALS, though (see https://github.com/stan-dev/math/pull/2861).

WardBrian avatar Aug 28 '23 18:08 WardBrian