CmdStan on IBM Power-9 systems

Open mjcarter95 opened this issue 4 years ago • 14 comments

Summary:

CmdStan fails to compile models on IBM Power-9 systems.

This issue was previously discussed in this Stan Discourse post: https://discourse.mc-stan.org/t/error-running-cmdstan-on-ibm-power9-system/22060. A step-by-step solution is given in the PDF document attached to the first post. It may be worth adding to the Stan documentation.

Description:

Trying to compile a Stan model on an IBM Power-9 system results in the following error:

make ../src/models/deaths_and_111_calls

--- Translating Stan model to C++ code ---
bin/stanc  --o=../src/models/deaths_and_111_calls.hpp ../src/models/deaths_and_111_calls.stan
/bin/bash: bin/stanc: cannot execute binary file
make: *** No rule to make target `../src/models/deaths_and_111_calls.hpp', needed by `../src/models/deaths_and_111_calls'.  Stop.

CmdStan is distributed with pre-compiled stanc binary files that were built for x86 architecture, not Power-9. Because of this, we must build and configure the Stan compiler by hand.
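A quick way to confirm the mismatch described above is to compare the host architecture with what the bundled binary was built for. The following is only a sketch: the helper name `is_power_arch` is hypothetical, and it assumes the commands are run from the CmdStan directory and that the `file` utility may or may not be installed.

```shell
# Hypothetical helper: succeeds when the given machine name is a POWER architecture.
is_power_arch() {
  case "$1" in
    ppc64le|ppc64) return 0 ;;
    *) return 1 ;;
  esac
}

# Machine hardware name as reported by the kernel (e.g. ppc64le, x86_64).
machine="$(uname -m)"
if is_power_arch "$machine"; then
  echo "POWER host ($machine): an x86 stanc binary cannot run here"
else
  echo "Host architecture: $machine"
fi

# Show what the bundled binary was built for, if both the tool and the file exist.
if command -v file >/dev/null 2>&1 && [ -e bin/stanc ]; then
  file bin/stanc
fi
```

If `file` reports an x86-64 executable on a `ppc64le` host, the "cannot execute binary file" error above is expected.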

Reproducible Steps:

Compile any Stan model on an IBM Power-9 system.

Current Output:

See description.

Expected Output:

A compiled model.

Additional Information:

NA.

Current Version:

Initially v2.23.0; persists in v2.27.0.

mjcarter95 avatar Jul 17 '21 13:07 mjcarter95

Would this help https://github.com/stan-dev/stanc3/issues/926

the same issue is true for ARM based processors

This is false. For ARM we have a separate tarball in the last 2 releases.

rok-cesnovar avatar Jul 17 '21 14:07 rok-cesnovar

Would this help stan-dev/stanc3#926

the same issue is true for ARM based processors

This is false. For ARM we have a separate tarball in the last 2 releases.

My bad, updated post.

mjcarter95 avatar Jul 17 '21 14:07 mjcarter95

Yep, as mentioned in the linked post, I'll be updating our release process to automatically build binaries for powerpc systems, so this will be handled automatically soon.

andrjohns avatar Jul 17 '21 15:07 andrjohns

@mjcarter95 powerpc support should be available now. You can either clone the develop branch of the cmdstan repo and build it, or just download the stanc binary from the nightly release in the stanc3 repo.
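For the first option, a minimal sketch of cloning and building the develop branch might look like the following. The function name `build_cmdstan_develop` and the default destination directory are hypothetical; the repository URL is the public stan-dev/cmdstan repo, and `make build` is CmdStan's usual build target.

```shell
# Sketch: fetch the develop branch of CmdStan (with submodules) and build it.
# The destination directory name is an arbitrary choice for illustration.
build_cmdstan_develop() {
  dest="${1:-cmdstan-develop}"
  git clone --recursive --branch develop https://github.com/stan-dev/cmdstan.git "$dest" &&
    make -C "$dest" build
}
```

Calling `build_cmdstan_develop` with no argument clones into `cmdstan-develop` in the current directory; nothing is downloaded until the function is called.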

andrjohns avatar Aug 21 '21 01:08 andrjohns

@andrjohns Thank you. Just got round to trying this: I pulled the develop branch of CmdStan and am now able to compile both CmdStan and Stan models. However, when running the model, "Segmentation fault" is output to the terminal with no additional information. Any ideas what the cause might be?

I am using gcc 10.2.0 and the model outlined here

./src/models/deaths_and_111_calls sample num_samples=1000 num_warmup=1000 algorithm=hmc engine=nuts max_depth=10 stepsize=0.01 adapt delta=0.8 data file=data/model_input/nhs_sheffield_ccg.data.json init=data/model_inits/nhs_sheffield_ccg/init1.json output file=output/model_fits/bede/samples1.csv
method = sample (Default)
  sample
    num_samples = 1000 (Default)
    num_warmup = 1000 (Default)
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.80000000000000004 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        metric_file =  (Default)
        stepsize = 0.01
        stepsize_jitter = 0 (Default)
    num_chains = 1 (Default)
id = 1 (Default)
data
  file = data/model_input/nhs_sheffield_ccg.data.json
init = data/model_inits/nhs_sheffield_ccg/init1.json
random
  seed = 2776634896 (Default)
output
  file = output/model_fits/bede/samples1.csv
  diagnostic_file =  (Default)
  refresh = 100 (Default)
  sig_figs = -1 (Default)
  profile_file = profile.csv (Default)
num_threads = 1 (Default)

Segmentation fault

mjcarter95 avatar Aug 31 '21 22:08 mjcarter95

Are you able to compile and run the bernoulli example model that's included with CmdStan?

andrjohns avatar Aug 31 '21 23:08 andrjohns

Yes, the bernoulli example compiles and runs.

mjcarter95 avatar Sep 01 '21 09:09 mjcarter95

That's a good sign, at least. Can you share your model code and the .hpp that is generated by Stan? That way I can check that the same C++ is being generated across systems.

I'm assuming that the model compiled and sampled under the stanc3 that you built locally? Can you try using the stanc binary that you built previously, but with the cmdstan that you just cloned? That way we can check whether the segfault is due to stanc3 or due to changes in cmdstan.
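The cross-system check on the generated C++ can be done with a plain byte comparison. This is only a sketch: the helper name `same_hpp` and the file paths in the usage note are hypothetical, not files from this thread.

```shell
# Hypothetical helper: report whether two generated model headers are byte-identical.
same_hpp() {
  if cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
  fi
}
```

For example, `same_hpp power9/model.hpp x86/model.hpp` prints `identical` when both systems produced the same file. If the files differ, `diff` on the two files shows where.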

andrjohns avatar Sep 01 '21 12:09 andrjohns

I've uploaded the model code and .hpp to Google Drive, hopefully you can access them here, please let me know if you have any trouble accessing them.

I removed the stanc3 that was built locally, so the model should have been compiled using stanc3 that is referenced in the develop branch.

mjcarter95 avatar Sep 02 '21 12:09 mjcarter95

Alright, the generated .hpp is identical to what I get locally, so this might not be a stanc3 issue. Were you able to successfully run the model when using the locally-built stanc?

Can you share some data that reproduces the issue?

andrjohns avatar Sep 02 '21 13:09 andrjohns

Yes, I was able to run the model when using the locally built stanc (note that this was coupled with the latest stable release of CmdStan and not the develop branch; I'll try the develop branch later this evening if I get a chance).

I've uploaded the json data and inits to Google Drive shared earlier. Using the following to sample:

./src/models/deaths_and_111_calls sample num_samples=1000 num_warmup=1000 algorithm=hmc engine=nuts max_depth=10 stepsize=0.01 adapt delta=0.8 data file=data/model_input/nhs_sheffield_ccg.data.json init=data/model_inits/nhs_sheffield_ccg/init1.json output file=output/model_fits/bede/samples1.csv

mjcarter95 avatar Sep 02 '21 14:09 mjcarter95

Alrighty, I also get a segfault with that data, so this is a cmdstan issue, not stanc3 (phew for the multiarch). I'll start digging into this now.

andrjohns avatar Sep 02 '21 14:09 andrjohns

It looks like the segfault is related to your initial values, because the model runs fine when they're not included. Odd.
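One way to reproduce that comparison is to run the model with and without the `init` argument and look at how each run ends. This is a sketch under assumptions: the helper names are hypothetical, and it relies on the common shell convention that a process killed by SIGSEGV (signal 11) is reported with exit status 128 + 11 = 139.

```shell
# Hypothetical helper: describe a process exit status; 128 + 11 (SIGSEGV) = 139.
describe_status() {
  case "$1" in
    0) echo "ok" ;;
    139) echo "segfault" ;;
    *) echo "failed ($1)" ;;
  esac
}

# Hypothetical helper: run a command quietly and print how it ended.
run_and_describe() {
  status=0
  "$@" >/dev/null 2>&1 || status=$?
  describe_status "$status"
}
```

Running `run_and_describe` on the sampling command from this thread once with `init=data/model_inits/nhs_sheffield_ccg/init1.json` and once without it should then print `segfault` and `ok` respectively, if the initial values are indeed the trigger.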

andrjohns avatar Sep 02 '21 14:09 andrjohns

@mjcarter95 was this issue resolved?

rok-cesnovar avatar Nov 22 '21 08:11 rok-cesnovar