rstan
After optimization Cholesky of Hessian can fail
Summary:
After optimization, the Cholesky decomposition of the Hessian can fail. Since this is likely due to numerical inaccuracy, it would be better to add some jitter, issue a warning, and return a result instead of failing with an error. With the upcoming PSIS diagnostic and correction, it doesn't matter if the jitter causes extra variation in the draws.
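The jitter idea can be sketched in plain R as follows. This is only an illustration under stated assumptions, not the code used inside rstan or rstanarm: the helper name `chol_with_jitter`, its arguments, and the rule for escalating the jitter are all hypothetical.

```r
# Hypothetical helper (not part of rstan or rstanarm): try a plain Cholesky
# factorization first and, if it fails, retry with increasing diagonal jitter.
chol_with_jitter <- function(A, jitter = 1e-10, max_tries = 10) {
  R <- tryCatch(chol(A), error = function(e) NULL)
  tries <- 0
  while (is.null(R) && tries < max_tries) {
    tries <- tries + 1
    # Assumption: scale the jitter by the typical diagonal magnitude and
    # multiply it by 10 on every failed attempt.
    eps <- jitter * 10^(tries - 1) * max(mean(abs(diag(A))), 1)
    R <- tryCatch(chol(A + diag(eps, nrow(A))), error = function(e) NULL)
    if (!is.null(R)) {
      warning("Matrix is numerically close to singular. Added ", eps, " jitter.")
    }
  }
  if (is.null(R)) stop("Cholesky failed even after adding jitter")
  R
}

# Example with an exactly singular 2 x 2 matrix, for which chol() alone errors.
A_singular <- matrix(c(1, 1, 1, 1), nrow = 2)
R_jittered <- suppressWarnings(chol_with_jitter(A_singular))
```

In the setting of this issue the argument would be the negated Hessian, i.e. `chol_with_jitter(-H)` in place of `chol(-H)`.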
Reproducible Steps:
Data in http://www.stat.columbia.edu/~gelman/regression/
library(rstanarm)  # provides stan_glm()
earnings_all <- read.csv(file.path("Earnings/data", "earnings.csv"))
earnings_all$positive <- earnings_all$earn > 0
# only non-zero earnings
earnings <- earnings_all[earnings_all$positive, ]
M_1 <- stan_glm(earn ~ height + male, data = earnings, algorithm="optimizing")
Current Output:
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
Error in chol.default(-H) :
the leading minor of order 4 is not positive definite
Error in out$theta_tilde[, mark] %*% t(R_inv) :
requires numeric/complex matrix/vector arguments
Expected Output:
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
Warning: Hessian is close to singular (up to numerical accuracy). Added 1e-10 jitter.
RStan Version:
The version of RStan you are running (e.g., from packageVersion("rstan")):
[1] ‘2.17.3’
R Version:
The version of R you are running (e.g., from R.version.string):
[1] "R version 3.4.4 (2018-03-15)"
After putting in stuff to deal with the diagonal, we still have the problem that the optimal value of the standard deviation is numerically zero (-1 million on the log scale). Thus, we get the error: Exception: normal_rng: Scale parameter is 0, but must be > 0!
Did you end up finding a workaround? I have some datasets where this happens stochastically, i.e. I get the error in roughly 10% of cases, so I am considering restarting the optimizer multiple times, but I am wondering whether there is a better way...
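For reference, the restart idea mentioned above could look roughly like the sketch below. It is not a confirmed fix for this issue: the helper `fit_with_restarts` is hypothetical, and it assumes that the failure surfaces as an ordinary R error and that a different seed changes the outcome.

```r
# Hypothetical helper: call fit_fun(seed) with successive seeds until one
# attempt succeeds or max_tries attempts have failed.
fit_with_restarts <- function(fit_fun, max_tries = 5, seed = 1) {
  for (i in seq_len(max_tries)) {
    fit <- tryCatch(
      fit_fun(seed + i - 1),
      error = function(e) {
        message("Attempt ", i, " failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(fit)) return(fit)
  }
  stop("All ", max_tries, " attempts failed")
}

# Usage sketch (assumes rstanarm is loaded and `earnings` is prepared as in
# the reproducible steps; passing `seed` through stan_glm is an assumption):
# M_1 <- fit_with_restarts(function(s)
#   stan_glm(earn ~ height + male, data = earnings,
#            algorithm = "optimizing", seed = s))
```

Restarting only hides the failure, so the number of failed attempts is worth logging alongside the final fit.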