rstanarm
Use simpler regularized horseshoe parameterization
Summary:
There are different regularized horseshoe parameterizations. Two of them are presented, with Stan code, in Appendix C of https://projecteuclid.org/euclid.ejs/1513306866
The current rstanarm code uses the parameterization shown in Appendix C.2. This parameterization was chosen because it seemed to be useful for the non-regularized horseshoe.
Appendix C.1 has a simpler parameterization with D+1 fewer parameters, where D is the number of regression coefficients. This simpler parameterization is faster because the posterior dimensionality is smaller. We also now have a lot of experimental experience showing that it works well with the regularized horseshoe, so it would be good to switch to the simpler parameterization.
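To make the D+1 difference concrete, here is a minimal Python sketch (not rstanarm or Stan code). It assumes my reading of Appendix C: C.1 samples `z`, `lambda`, `tau` and `caux` directly, while C.2 splits each half-t scale into two auxiliary variables. The function names are hypothetical, and the intercept and noise scale are left out because they are common to both.

```python
import math

def n_prior_parameters(D):
    """Count prior-related parameters for D coefficients (assumed layout, see text)."""
    # C.1: z (D), lambda (D), tau (1), caux (1)
    c1 = D + D + 1 + 1
    # C.2: z (D), two local auxiliaries (2D), two global auxiliaries (2), caux (1)
    c2 = D + 2 * D + 2 + 1
    return c1, c2

def regularized_horseshoe_beta(z, lam, tau, c):
    """C.1-style transform: local scales are softly truncated at the slab scale c."""
    beta = []
    for z_j, lam_j in zip(z, lam):
        lam_tilde = math.sqrt(c**2 * lam_j**2 / (c**2 + tau**2 * lam_j**2))
        beta.append(z_j * lam_tilde * tau)
    return beta

c1, c2 = n_prior_parameters(1536)
print(c1, c2, c2 - c1)  # 3074 4611 1537
```

With D=1536, as in the ovarian data, the difference is 1537 = D+1 under these assumptions, which matches the count stated above.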
@paul-buerkner reports for brms (paul-buerkner/brms#873):
With the ovarian cancer data, inference is about 15% faster on average with the C.1 parameterization on my machine and shows similar convergence in terms of divergent transition behavior, Rhat and ESS.
The ovarian data has n=54, p=1536.
The difference is smaller than I expected, but it is likely to be bigger for even bigger datasets, as there will be more memory cache misses.
I think we need more tests though, possibly with many more chains than four to get more reliable estimates. There is significant between-chain variation in speed, so the 15% may be quite an unreliable estimate.
I've been running the ovarian and the related prostate data with 10 chains.
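Since the per-chain speed varies a lot, one way to see how uncertain a speedup estimate is would be to bootstrap over chains. The following is only a sketch in plain Python, assuming you have collected per-chain wall times for both parameterizations yourself. `speedup_interval` is a hypothetical helper, not part of rstanarm or brms.

```python
import random
import statistics

def speedup_interval(times_old, times_new, n_boot=2000, seed=1):
    """Ratio of mean per-chain run times (old / new) with a 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    point = statistics.mean(times_old) / statistics.mean(times_new)
    ratios = []
    for _ in range(n_boot):
        # Resample chains with replacement, separately for each parameterization
        old = [rng.choice(times_old) for _ in times_old]
        new = [rng.choice(times_new) for _ in times_new]
        ratios.append(statistics.mean(old) / statistics.mean(new))
    ratios.sort()
    lo = ratios[int(0.025 * n_boot)]
    hi = ratios[int(0.975 * n_boot) - 1]
    return point, lo, hi
```

Call it with two lists of per-chain times, e.g. from C.2 and C.1 runs. A wide interval with only four chains would support running more chains before trusting a figure like 15%.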