msm
False convergence - Hessian is not positive definite
Dear Prof. Jackson,
Thank you very much for providing such a great package on multi-state models. I have been running into the issue of false convergence on my dataset, where the error "Optimisation has probably not converged to the maximum likelihood - Hessian is not positive definite." is printed. I have read the other issues (#5 and #26) that also discuss this issue, and have tried out various methods:
- setting fnscale to the -2 log likelihood of the model that stopped with false convergence, or even as high as 5 000 000
- setting maxit to higher values (50 000, and even 100 000)
- tightening reltol (down to 1e-16)
- fixing parameters with fixedpars for whole rows or columns of the transition intensity matrix
I still encounter this issue, and am running out of ideas. Would you have anything else in mind to try out?
Any help would be appreciated!
Looking at that qmatrix and ematrix, that model will not be identifiable from the data. You are allowing misclassification between all states, and instantaneous transitions between all states. There will be no way to tell from the data which combinations of parameter values are more plausible. The likelihood will be a flat function of the parameters, so that the optimum doesn't exist.
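A toy illustration of the flat-likelihood point, in base R and without msm: the objective below depends on its two parameters only through their sum, so it is flat along a ridge and the numerical Hessian at the optimum has an eigenvalue of essentially zero. The function toy_negloglik is a hypothetical stand-in chosen for illustration; it is not the model from this thread.

```r
# Toy negative log-likelihood (hypothetical): only theta[1] + theta[2] is
# identifiable, so every point on the line theta[1] + theta[2] = 1 is optimal.
toy_negloglik <- function(theta) (theta[1] + theta[2] - 1)^2

# Raising maxit or tightening reltol does not change the shape of the surface.
fit <- optim(c(0, 0), toy_negloglik, method = "BFGS", hessian = TRUE,
             control = list(maxit = 50000, reltol = 1e-16))

# Eigenvalues of the numerical Hessian at the reported optimum.
ev <- eigen(fit$hessian, symmetric = TRUE)$values
print(ev)

# Positive definite would require every eigenvalue to be clearly above zero.
is_pd <- all(ev > 1e-6 * max(abs(ev)))
print(is_pd)
```

The optimiser does reach a point on the ridge, but no amount of extra iterations can make the Hessian positive definite, which mirrors why the control settings tried above could not help.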
I'd advise considering the structure of the process that you are modelling, and choosing the allowed transitions and misclassifications more carefully. These are continuous-time Markov models, so even if a transition between a pair of states can happen over an interval, it may not be possible as an instantaneous transition in continuous time (see the course notes). Also make sure the observation scheme is set appropriately (through the obstype or exacttimes arguments): do you know when the transitions happen in continuous time?
Don't give up with using fixed parameter values - e.g. if you must model misclassification, then this may only be possible with fixed misclassification probabilities.
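As a sketch of what this advice could look like in practice, the code below uses a restricted progressive structure with misclassification only between adjacent states, and holds the misclassification probabilities fixed. Everything specific here is an assumption for illustration: the structure and starting values are modelled on the heart transplant (cav) data shipped with msm, not on the poster's data, and the value 0.1 for each misclassification probability is simply assumed known. The fit is only attempted if msm is installed.

```r
# Progressive 4-state structure with an absorbing state 4 (illustrative values).
qmatrix <- rbind(
  c(0, 0.15, 0,   0.02),
  c(0, 0,    0.2, 0.08),
  c(0, 0,    0,   0.13),
  c(0, 0,    0,   0)
)

# Misclassification only between adjacent non-absorbing states (assumed values).
ematrix <- rbind(
  c(0,   0.1, 0,   0),
  c(0.1, 0,   0.1, 0),
  c(0,   0.1, 0,   0),
  c(0,   0,   0,   0)
)

# Intensities are labelled first, then misclassification probabilities.
n_q <- sum(qmatrix > 0)          # 5 transition intensities
n_e <- sum(ematrix > 0)          # 4 misclassification probabilities
fixed_e <- n_q + seq_len(n_e)    # labels 6, 7, 8, 9

if (requireNamespace("msm", quietly = TRUE)) {
  # Estimate the intensities only; misclassification stays at the values above.
  fit <- msm::msm(state ~ years, subject = PTNUM, data = msm::cav,
                  qmatrix = qmatrix, ematrix = ematrix,
                  deathexact = 4, fixedpars = fixed_e)
  print(fit)
}
```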
Dear Prof. Jackson,
Thanks so much for your suggestions, it really seems like I might have allowed too many possibilities for the model.
Could you explain and give me an example of how to specify fixedpars for the misclassification probabilities? In the documentation it says:
" These are given in the order: transition intensities (reading across rows of the transition matrix), covariates on intensities (ordered by intensities within covariates), hidden Markov model parameters, including misclassification probabilities or parameters of HMM outcome distributions (ordered by parameters within states), hidden Markov model covariate parameters (ordered by covariates within parameters within states), initial state occupancy probabilities (excluding the first probability, which is fixed at one minus the sum of the others)."
If I have 4 possible states, how should fixedpars be specified so that the misclassification probabilities are fixed rather than the transition intensities? And do the indices change if I use covariates?
Thanks a lot in advance!
Can I explain in the simplest case with a 2-state model? If you have, say qmatrix = rbind(c(0, 0.01), c(1, 0.02)), and ematrix = rbind(c(0, 0.2), c(0.1, 0)), there are 4 parameters in the model. These parameters are given labels of 1, 2, 3, 4 to refer to them in fixedpars.
So to fix both misclassification probabilities, but not the intensities, specify fixedpars = c(3, 4). Or to fix the first misclassification probability at 0.2, but estimate the second one (starting from an initial value of 0.1), it will be just fixedpars=3.
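The labelling in this 2-state example can be reproduced with a few lines of base R. This is only a sketch of the ordering described in the help text (intensities first, reading across rows, then misclassification probabilities); label_params is a hypothetical helper written for this post, not a function from msm.

```r
# Hypothetical helper: label the non-zero off-diagonal entries of a matrix,
# reading across rows, starting the labels at offset + 1.
label_params <- function(m, offset = 0L) {
  diag(m) <- 0                             # diagonal entries are not parameters
  idx <- which(t(m) > 0, arr.ind = TRUE)   # transpose so the order runs across rows
  data.frame(label = offset + seq_len(nrow(idx)),
             row   = idx[, "col"],         # row of the original matrix
             col   = idx[, "row"])         # column of the original matrix
}

qmatrix <- rbind(c(0, 0.01), c(1, 0.02))
ematrix <- rbind(c(0, 0.2), c(0.1, 0))

q_labels <- label_params(qmatrix)
e_labels <- label_params(ematrix, offset = nrow(q_labels))
print(q_labels)   # intensities: labels 1 and 2
print(e_labels)   # misclassification probabilities: labels 3 and 4

fixedpars_both  <- e_labels$label      # fix both misclassification probabilities
fixedpars_first <- e_labels$label[1]   # fix only the first one
```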
Does the general idea behind the text in the help file make more sense now? This text explains the order that the parameters are in when referring to them as 1,2,3,4,...
Thank you for your explanation, that helped a lot. After a bit of trial and error, I was able to set the parameters accordingly (fixing two of the misclassification parameters) so that the models converged. I also restricted the allowed transitions in my initial qmatrix a bit more, as well as the misclassification matrix. Altogether this helped, so thanks a lot!
For anyone else reading this post: I want to give you an example for the fixedpars argument to showcase how it would work with covariates - as far as I understood it. Please correct me if I am wrong!
qmatrix <- rbind(
c(0, 0.1, 0, 0),
c(0.1, 0, 0.1, 0),
c(0.1, 0.1, 0, 0.1),
c(0.1, 0.1, 0, 0)
)
- qmatrix (4x4) with 8 parameters --> indices 1:8 for fixedpars would fix these qmatrix entries
covariates <- ~ age + sex + bmi  # a formula, not a character string
- 3 covariates, on 8 parameters from the qmatrix (3*8) --> indices 9:32 for fixedpars would fix the covariate effects on the intensities
ematrix <- rbind(
c(0, 0.1, 0, 0),
c(0.1, 0, 0.1, 0.1),
c(0, 0.1, 0, 0.1),
c(0, 0, 0, 0)
)
- misclassification matrix (4x4) with 6 parameters --> indices 33:38 for fixedpars would fix the misclassification probabilities
Cheers!
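The index ranges in the post above can be checked with base R. This is a sketch under two assumptions: each of the three covariates contributes exactly one coefficient per intensity (a factor with more than two levels would contribute more and shift the later indices), and there are no covariates on the misclassification probabilities. The names count_offdiag, idx_q, idx_cov and idx_e are introduced here for illustration only.

```r
qmatrix <- rbind(
  c(0,   0.1, 0,   0),
  c(0.1, 0,   0.1, 0),
  c(0.1, 0.1, 0,   0.1),
  c(0.1, 0.1, 0,   0)
)
ematrix <- rbind(
  c(0,   0.1, 0,   0),
  c(0.1, 0,   0.1, 0.1),
  c(0,   0.1, 0,   0.1),
  c(0,   0,   0,   0)
)
n_cov <- 3L   # age, sex, bmi: assumed one coefficient each

# Count the non-zero off-diagonal entries, i.e. the free parameters.
count_offdiag <- function(m) {
  diag(m) <- 0
  sum(m > 0)
}
n_q <- count_offdiag(qmatrix)   # 8 transition intensities
n_e <- count_offdiag(ematrix)   # 6 misclassification probabilities

idx_q   <- seq_len(n_q)                        # 1:8
idx_cov <- n_q + seq_len(n_q * n_cov)          # 9:32
idx_e   <- n_q + n_q * n_cov + seq_len(n_e)    # 33:38

print(range(idx_q))
print(range(idx_cov))
print(range(idx_e))
```

Passing fixedpars = idx_e would then hold all six misclassification probabilities at their initial values while the intensities and covariate effects are estimated.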