I ran into an issue while trying to get started with stan/pystan.
I ran into the following errors and warnings while trying to get started with stan/pystan, using the documentation at https://pystan.readthedocs.io/en/latest/ as a guide. This is the example currently shown there:
import stan
schools_code = """
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // standard error of effect estimates
}
parameters {
real mu; // population treatment effect
real<lower=0> tau; // standard deviation in treatment effects
vector[J] eta; // unscaled deviation from mu by school
}
transformed parameters {
vector[J] theta = mu + tau * eta; // school treatment effects
}
model {
target += normal_lpdf(eta | 0, 1); // prior log-density
target += normal_lpdf(y | theta, sigma); // log-likelihood
}
"""
schools_data = {"J": 8,
"y": [28, 8, -3, 7, -1, 1, 18, 12],
"sigma": [15, 10, 16, 11, 9, 11, 10, 18]}
posterior = stan.build(schools_code, data=schools_data)
fit = posterior.sample(num_chains=4, num_samples=1000)
eta = fit["eta"] # array with shape (8, 4000)
df = fit.to_frame() # pandas `DataFrame`, requires pandas
Running the above, we get the following messages from the stan.build function:
Messages from stanc:
Warning in '/tmp/httpstan_yl4pxs0i/model_zzhabz4t.stan', line 4, column 2: Declaration
of arrays by placing brackets after a variable name is deprecated and
will be removed in Stan 2.32.0. Instead use the array keyword before the
type. This can be changed automatically using the auto-format flag to
stanc
Warning in '/tmp/httpstan_yl4pxs0i/model_zzhabz4t.stan', line 5, column 2: Declaration
of arrays by placing brackets after a variable name is deprecated and
will be removed in Stan 2.32.0. Instead use the array keyword before the
type. This can be changed automatically using the auto-format flag to
stanc
Warning: The parameter tau has no priors. This means either no prior is
provided, or the prior(s) depend on data variables. In the later case,
this may be a false positive.
Warning: The parameter mu has no priors. This means either no prior is
provided, or the prior(s) depend on data variables. In the later case,
this may be a false positive.
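As an aside on the two deprecation warnings above: they point at the `real y[J];` and `real<lower=0> sigma[J];` declarations. A minimal sketch of the same model with only those two lines rewritten to put the `array` keyword before the type, as the stanc message suggests, is shown here. Everything else is copied from the documentation example. This is assumed to silence only the deprecation warnings; it leaves the "no priors" warnings in place and is not expected to affect the sampling failure reported next.

```python
# Same model as the documentation example, with only the two array
# declarations rewritten to the `array[...]` syntax named in the warning.
schools_code = """
data {
  int<lower=0> J;                // number of schools
  array[J] real y;               // estimated treatment effects
  array[J] real<lower=0> sigma;  // standard error of effect estimates
}
parameters {
  real mu;                       // population treatment effect
  real<lower=0> tau;             // standard deviation in treatment effects
  vector[J] eta;                 // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;  // school treatment effects
}
model {
  target += normal_lpdf(eta | 0, 1);        // prior log-density
  target += normal_lpdf(y | theta, sigma);  // log-likelihood
}
"""

# Data are unchanged from the original example.
schools_data = {
    "J": 8,
    "y": [28, 8, -3, 7, -1, 1, 18, 12],
    "sigma": [15, 10, 16, 11, 9, 11, 10, 18],
}
```

The string can be passed to `stan.build(schools_code, data=schools_data)` exactly as in the original snippet.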
Then posterior.sample fails with:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In [3], line 1
----> 1 fit = posterior.sample(num_chains=4, num_samples=1000)
File ~/.local/lib/python3.10/site-packages/stan/model.py:89, in Model.sample(self, num_chains, **kwargs)
61 def sample(self, *, num_chains=4, **kwargs) -> stan.fit.Fit:
62 """Draw samples from the model.
63
64 Parameters in ``kwargs`` will be passed to the default sample function.
(...)
87
88 """
---> 89 return self.hmc_nuts_diag_e_adapt(num_chains=num_chains, **kwargs)
File ~/.local/lib/python3.10/site-packages/stan/model.py:108, in Model.hmc_nuts_diag_e_adapt(self, num_chains, **kwargs)
92 """Draw samples from the model using ``stan::services::sample::hmc_nuts_diag_e_adapt``.
93
94 Parameters in ``kwargs`` will be passed to the (Python wrapper of)
(...)
105
106 """
107 function = "stan::services::sample::hmc_nuts_diag_e_adapt"
--> 108 return self._create_fit(function=function, num_chains=num_chains, **kwargs)
File ~/.local/lib/python3.10/site-packages/stan/model.py:312, in Model._create_fit(self, function, num_chains, **kwargs)
309 return fit
311 try:
--> 312 return asyncio.run(go())
313 except KeyboardInterrupt:
314 return
File /usr/local/lib/python3.10/asyncio/runners.py:44, in run(main, debug)
42 if debug is not None:
43 loop.set_debug(debug)
---> 44 return loop.run_until_complete(main)
45 finally:
46 try:
File /usr/local/lib/python3.10/asyncio/base_events.py:646, in BaseEventLoop.run_until_complete(self, future)
643 if not future.done():
644 raise RuntimeError('Event loop stopped before Future completed.')
--> 646 return future.result()
File ~/.local/lib/python3.10/site-packages/stan/model.py:236, in Model._create_fit.<locals>.go()
234 sampling_output.write_line("<info>Sampling:</info> <error>Initialization failed.</error>")
235 raise RuntimeError("Initialization failed.")
--> 236 raise RuntimeError(message)
238 resp = await client.get(f"/{fit_name}")
239 if resp.status != 200:
RuntimeError: Exception during call to services function: `BrokenProcessPool('A child process terminated abruptly, the process pool is not usable anymore')`, traceback: `[' File "/home/knappa/.local/lib/python3.10/site-packages/httpstan/services_stub.py", line 112, in call\n future = asyncio.get_running_loop().run_in_executor(executor, lazy_function_wrapper_partial) # type: ignore\n', ' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 818, in run_in_executor\n executor.submit(func, *args), loop=self)\n', ' File "/usr/local/lib/python3.10/concurrent/futures/process.py", line 715, in submit\n raise BrokenProcessPool(self._broken)\n']`
This is on a Debian box (version=testing), running pystan 3.5.0, installed through pip.
Originally posted by @knappa in https://github.com/stan-dev/pystan/issues/354#issuecomment-1251231774
Trying this again on another box that has Python 3.9 instead of 3.10, I also see this output:
Sampling: 0%
double free or corruption (out)
Sampling: 0% (2/8000)
right before the Python traceback.
Interesting. Thanks for the report. Haven't seen this before.
I assume you have plenty of memory, right? Compiling the model can require a few GB of memory.
We do not currently test on Debian testing. There could be some incompatibility with the newest gcc or libstdc++.
Yes, plenty of memory. Compiling seems fine (I'm assuming that's what the build method does). The problem seems to be that the generated executable segfaults, since it crashes at the sampling step.
This guess is based on trying the same model with cmdstan, which outputs:
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
num_chains = 1 (Default)
id = 1 (Default)
data
file = getting-started/example.json
init = 2 (Default)
random
seed = 2272506336 (Default)
output
file = output.csv (Default)
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
profile_file = profile.csv (Default)
log_prob_output_file = log_prob_output.csv (Default)
num_threads = 1 (Default)
Gradient evaluation took 4e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)
Segmentation fault
I have been able to get some examples in cmdstan working, but not this one.
For reference, my compiler versions:
gcc (Debian 12.2.0-3) 12.2.0
Debian clang version 15.0.0-2
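For anyone wanting to confirm from Python that a compiled model really dies from a signal (which would be consistent with the `BrokenProcessPool` error above), here is a minimal sketch using only the standard library. The helper names `describe_exit` and `run_and_describe` are made up for illustration and are not part of pystan or cmdstan. It relies on the POSIX convention that `subprocess` reports a negative return code when the child was killed by a signal.

```python
import signal
import subprocess
import sys
from typing import Sequence


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code -N means "killed by signal N".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd: Sequence[str]) -> str:
    """Run a command in a child process and describe how it ended."""
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Harmless stand-in command; replace it with the command line of the
    # compiled model executable to check how that process terminates.
    print(run_and_describe([sys.executable, "-c", "pass"]))
```

A process that segfaults would be reported as `killed by SIGSEGV`, while a normal run gives `exited with status 0`.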
This might be a bug in Stan C++ itself. If you post the cmdstan reproduction of the segfault on the Stan forums, I suspect it will get plenty of attention. https://discourse.mc-stan.org/
Thanks again for the report.