stan
transition method bugged for softabs
Summary:
transition method results in NaNs for the softabs metric
Description:
I might be using the library in the wrong way, but I'll try to explain everything. First, I use Stan as a C++ library, and my models have a simple interface: a method that computes -log(prob), and a method that computes -log(prob) together with its gradient (first derivatives are hand-coded in the model hierarchy).
I got Stan fully working with my models (thanks to templates), and the diagonal metric works reasonably well (for some cases the stepsize is too high, which is why I'm trying softabs).
```cpp
this->seed(init_sample.cont_params());
this->hamiltonian_.sample_p(this->z_, this->rand_int_);
this->hamiltonian_.init(this->z_, logger);
```
In general this looks wrong, since for the softabs metric `sample_p` needs the Hamiltonian to be fully initialized. Why is `hamiltonian_.init` called after sampling `p`?
Something like this fixes it (`global_empty_logger` is just a static instance of `stan::callbacks::logger`):

```cpp
mcmc_.seed(q_init);
mcmc_.init_hamiltonian(global_empty_logger);
```

It forces complete Hamiltonian initialization: softabs has a member `Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigen_deco`, which is constructed as `eigen_deco(n)`, but before any `eigen_deco.eigenvectors()` call it needs to be initialized via `z.eigen_deco.compute(z.hessian);` (`z.hessian` is initialized properly in the default case, to the identity matrix, but the decomposition is never computed).
So there are two possible solutions:
- compute the decomposition in the `softabs_point` constructor
- swap the order of `this->hamiltonian_.sample_p(this->z_, this->rand_int_);` and `this->hamiltonian_.init(this->z_, logger);`
The second one is awkward because this code is duplicated across `base_[nuts/hmc/...]` (and may need to be refactored).
Off topic:
If you have any suggestions for epsilon hacks, feel free to share: I've run into the problem of a huge epsilon, where the sampler always oversteps a local minimum and the samples are biased. Currently the dense metric messes everything up for a fairly simple model (I can't reveal the code or the model structure). The diagonal metric with adaptive stepsize more or less works for hmc/nuts, and works noticeably better for xhmc. Nearly all parameters are left at their defaults for hmc/nuts/xhmc, except that hmc is tuned a bit: before every transition call there is an extra `mcmc_.set_nominal_stepsize_and_L(mcmc_.get_nominal_stepsize(), pars.steps)` call, because a constant travel "distance" T behaves badly for all but very simple models.
Data for this model is:

```python
den = np.sqrt(1 + a0**2 + 2*b0**2)
a = a0/den
b = b0/den
y = a*x + b*(x**2 - 1) + e/den
```

where `a0`, `b0` are floats and `x`, `e` are n samples from N(0,1) (I have C++/Python wrappers for easier testing).
Current Version:
v2.17.1