PyMC3 sampling fails when the starting point lies at the optimization boundary
PyMC3 by default remaps bounded random variables to variables defined on the whole real line, "in order to sample models more efficiently". See, e.g.,
https://docs.pymc.io/notebooks/api_quickstart.html#Automatic-transforms-of-bounded-RVs
https://docs.pymc.io/api/bounds.html
Thus, when a component of the chain's starting point lies on the boundary of the corresponding optimization interval, it is transformed to ±∞ and the log-posterior is evaluated incorrectly, leading to a SamplingError: Bad initial energy.
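To illustrate the mechanism, here is a minimal NumPy sketch. It assumes the interval transform is the log-odds map log((x - lower) / (upper - x)); the helper name `interval_forward` is made up for this example and is not part of PyMC3 or pyPESTO.

```python
import numpy as np


def interval_forward(x, lower, upper):
    """Log-odds map from (lower, upper) to the real line (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # log(0) emits a divide-by-zero warning; silence it to show the raw result.
    with np.errstate(divide="ignore"):
        return np.log(x - lower) - np.log(upper - x)


# The two boundary points are mapped to -inf and +inf; the midpoint maps to 0.
values = interval_forward([0.0, 0.5, 1.0], lower=0.0, upper=1.0)
print(values)
```

Under this assumption, a starting point with a component exactly on a bound has no finite image in the transformed space, which is consistent with the "bad initial energy" failure described above.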
Possible solutions:
- Turn off automatic transform, e.g. by using

      with pm.Model() as model:
          x = pm.Uniform('x', lower=0, upper=1, transform=None)

  Whether this has indeed an effect on the efficiency of the sampling has to be seen. While it corrects the log-posterior of the initial point, it may fail on later samples with the same "bad energy" error (the sampling step brings the parameter values outside the domain?).
- Move components lying at the boundary to points near the boundary. After fixing a bug in pymc3, just taking the previous/successive floating point number appears to be working.
- Enlarge the bound interval. Since in some situations this may lead to inadmissible values of the parameters, it cannot be done automatically in pyPESTO but should be left to the user.
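The second option (moving boundary components to the adjacent floating point number) could look roughly like the sketch below. It uses NumPy's `nextafter`; the function name `nudge_into_interior` is hypothetical and this is not necessarily how the pull request implements it.

```python
import numpy as np


def nudge_into_interior(x, lower, upper):
    """Replace entries lying exactly on a bound by the adjacent float inside it.

    Hypothetical helper for illustration; entries strictly inside the
    interval are returned unchanged.
    """
    x = np.asarray(x, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # nextafter(a, b) returns the next representable float after a towards b.
    x = np.where(x == lower, np.nextafter(lower, upper), x)
    x = np.where(x == upper, np.nextafter(upper, lower), x)
    return x


x0 = nudge_into_interior([0.0, 0.3, 1.0], lower=0.0, upper=1.0)
print(x0)
```

The nudged values are strictly inside the bounds, so a log-odds style transform yields finite (though possibly very large in magnitude) numbers for them.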
This problem is currently being worked on in pull request #360.
I think 1 and 2 both make sense. If the transformation yields an efficiency gain, it would be good to keep it by default. If parameters lie on the boundaries, a small perturbation would be ideal, I think. In addition, we could add a flag transform=True to the sampler args.
I also think that 1 & 2 are good choices. It would be good to make these options easily accessible on the pyPESTO side.