False convergence - Hessian is not positive definite

Open sophiakrix opened this issue 10 months ago • 4 comments

Dear Prof. Jackson,

Thank you very much for providing such a great package on multi-state models. I have been running into the issue of false convergence on my dataset, where the error "Optimisation has probably not converged to the maximum likelihood - Hessian is not positive definite." is printed. I have read the other issues (#5 and #26) that also discuss this issue, and have tried out various methods:

  • setting fnscale to the -2 log likelihood of the model that stopped with false convergence, or even as high as 5 000 000
  • setting maxit to higher values (50 000, and even 100 000)
  • tightening reltol (down to 1e-16)
  • using fixedpars to fix all parameters in whole rows or columns of the transition intensity matrix
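For readers who have not used these options: they are passed to msm() through its control argument, which is forwarded to optim(). Below is a minimal sketch, assuming the msm package is installed and using its bundled cav example data rather than the attached files. The initial intensities in q_init and the control values are illustrative assumptions, not the settings from this issue.

```r
# Optimiser settings of the kind listed above, collected in one list.
# msm() forwards `control` to stats::optim().
ctrl <- list(
  fnscale = 4000,   # roughly the size of -2 * log-likelihood (illustrative)
  maxit   = 10000,  # upper limit on optimiser iterations
  reltol  = 1e-10   # relative convergence tolerance
)

# Only run the fit if msm is available; `cav` is example data shipped with msm.
if (requireNamespace("msm", quietly = TRUE)) {
  library(msm)

  # Illustrative initial values: zeros mark transitions that are not allowed.
  q_init <- rbind(
    c(0,     0.25, 0,     0.25),
    c(0.166, 0,    0.166, 0.166),
    c(0,     0.25, 0,     0.25),
    c(0,     0,    0,     0)
  )

  fit <- msm(state ~ years, subject = PTNUM, data = cav,
             qmatrix = q_init, deathexact = 4, control = ctrl)
  print(fit)
}
```

Tuning these values only helps when a well-defined maximum exists; it cannot rescue a model whose parameters are not identifiable.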

I still encounter this issue, and am running out of ideas. Would you have anything else in mind to try out?

Any help would be appreciated!

sophiakrix avatar Apr 23 '24 15:04 sophiakrix

Here is the code with a sample file that I used:

github_issue_file.txt sample_df.csv

sophiakrix avatar Apr 23 '24 15:04 sophiakrix

Looking at that qmatrix and ematrix, that model will not be identifiable from the data. You are allowing misclassification between all states, and instantaneous transitions between all states. There will be no way to tell from the data which combinations of parameter values are more plausible. The likelihood will be a flat function of the parameters, so that the optimum doesn't exist.
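To illustrate what a flat likelihood does to the Hessian, here is a small base-R sketch that does not involve msm at all. It is a made-up toy problem: the data in x are invented, and the rate of an exponential distribution is written as exp(a) + exp(b), so only the sum is identifiable.

```r
# Toy negative log-likelihood: exponential data with rate exp(a) + exp(b).
# Only the sum of the two terms is identified by the data.
x <- c(0.5, 1.2, 0.3, 2.0, 0.8)  # invented data, for illustration only

negloglik <- function(par) {
  rate <- exp(par[1]) + exp(par[2])
  -sum(dexp(x, rate = rate, log = TRUE))
}

fit <- optim(c(0, 0), negloglik, method = "BFGS", hessian = TRUE)

# Eigenvalues of the Hessian at the reported optimum: one is clearly positive,
# the other is close to zero, so the matrix is not safely positive definite.
ev <- eigen(fit$hessian, symmetric = TRUE, only.values = TRUE)$values
print(ev)

# A different (a, b) pair with the same exp(a) + exp(b) fits equally well.
total <- sum(exp(fit$par))
alt <- c(log(0.9 * total), log(0.1 * total))
print(c(negloglik(fit$par), negloglik(alt)))
```

The near-zero eigenvalue corresponds to the direction in which the parameters can move without changing the likelihood. A model that allows every transition and every misclassification can have many such directions.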

I'd advise you to consider the structure of the process that you are modelling, and to choose the allowed transitions and misclassifications more carefully. These are continuous-time Markov models - so even if a transition between a pair of states can happen over an interval, it may not be possible as an instantaneous transition in continuous time - see the course notes. Also make sure the observation scheme is set appropriately (through the obstype or exacttimes arguments) - do you know when the transitions happen in continuous time?

Don't give up on using fixed parameter values - e.g. if you must model misclassification, then this may only be possible with fixed misclassification probabilities.

chjackson avatar Apr 23 '24 15:04 chjackson

Dear Prof. Jackson,

Thanks so much for your suggestions. It really seems like I might have allowed too many possibilities for the model.

Could you explain and give me an example of how to specify fixedpars for the misclassification probabilities? In the documentation it says:

" These are given in the order: transition intensities (reading across rows of the transition matrix), covariates on intensities (ordered by intensities within covariates), hidden Markov model parameters, including misclassification probabilities or parameters of HMM outcome distributions (ordered by parameters within states), hidden Markov model covariate parameters (ordered by covariates within parameters within states), initial state occupancy probabilities (excluding the first probability, which is fixed at one minus the sum of the others)."

If I have 4 possible states, how should fixedpars be specified so that the misclassification probabilities are fixed rather than the transition intensities? And do the indices change if I use covariates?

Thanks a lot in advance!

sophiakrix avatar Apr 24 '24 10:04 sophiakrix

Can I explain in the simplest case with a 2-state model? If you have, say qmatrix = rbind(c(0, 0.01), c(1, 0.02)), and ematrix = rbind(c(0, 0.2), c(0.1, 0)), there are 4 parameters in the model. These parameters are given labels of 1, 2, 3, 4 to refer to them in fixedpars.

So to fix both misclassification probabilities, but not the intensities, specify fixedpars = c(3, 4). Or to fix the first misclassification probability at 0.2, but estimate the second one (starting from an initial value of 0.1), it will be just fixedpars=3.
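The labelling for this 2-state example can be reproduced with a short base-R sketch. The helper offdiag_entries() is a hypothetical function written for this illustration, not part of msm. It assumes that parameters are the nonzero off-diagonal entries read across rows, with intensities numbered before misclassification probabilities, as described in the help text quoted earlier in the thread.

```r
qmatrix <- rbind(c(0, 0.01), c(1, 0.02))
ematrix <- rbind(c(0, 0.2), c(0.1, 0))

# Hypothetical helper: list the nonzero off-diagonal entries of a square
# matrix, reading across rows (row 1 left to right, then row 2, ...).
offdiag_entries <- function(m, type) {
  rows <- list()
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      if (i != j && m[i, j] != 0) {
        rows[[length(rows) + 1]] <- data.frame(
          type = type, from = i, to = j, initial = m[i, j]
        )
      }
    }
  }
  do.call(rbind, rows)
}

# Intensities come first, then misclassification probabilities.
params <- rbind(
  offdiag_entries(qmatrix, "intensity"),
  offdiag_entries(ematrix, "misclassification")
)
params$label <- seq_len(nrow(params))
print(params)

# Labels to pass as fixedpars when fixing both misclassification probabilities.
fixedpars <- params$label[params$type == "misclassification"]
print(fixedpars)
```

Diagonal entries are skipped here, so the 0.02 on the diagonal of qmatrix does not count as a parameter.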

Does the general idea behind the text in the help file make more sense now? This text explains the order that the parameters are in when referring to them as 1,2,3,4,...

chjackson avatar Apr 24 '24 11:04 chjackson

Thank you for your explanation, that helped a lot. I was able - after a bit of trial and error - to set the parameters accordingly (fixing two of the misclassification parameters) such that the models converged. I also restricted the allowed transitions in my initial qmatrix a bit more, as well as the misclassification matrix. Together, these changes helped, so thanks a lot!

For anyone else reading this post: I want to give you an example for the fixedpars arguments to showcase how it would work with covariates - as far as I understood it. Please correct me if I am wrong!

qmatrix <- rbind(
  c(0, 0.1, 0, 0),
  c(0.1, 0, 0.1, 0),
  c(0.1, 0.1, 0, 0.1),
  c(0.1, 0.1, 0, 0)
)
  • qmatrix (4x4) with 8 nonzero entries, i.e. 8 parameters --> indices 1:8 in fixedpars would fix these transition intensities

covariates <- ~ age + sex + bmi
  • 3 covariates, each acting on the 8 intensities from the qmatrix (3*8 = 24 parameters) --> indices 9:32 in fixedpars would fix the covariate effects on the intensities

ematrix <- rbind(
  c(0, 0.1, 0, 0),
  c(0.1, 0, 0.1, 0.1),
  c(0, 0.1, 0, 0.1),
  c(0, 0, 0, 0)
)
  • misclassification matrix (4x4) with 6 nonzero entries, i.e. 6 parameters --> indices 33:38 in fixedpars would fix the misclassification probabilities
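The index arithmetic above can be checked with a few lines of base R. This is a sketch of my reading of the quoted help text, not something confirmed by the package author. It assumes each covariate contributes exactly one coefficient per intensity (a factor with more than two levels would presumably add more), and that there are no covariates on the misclassification probabilities. The helper count_offdiag() is hypothetical.

```r
qmatrix <- rbind(
  c(0,   0.1, 0,   0),
  c(0.1, 0,   0.1, 0),
  c(0.1, 0.1, 0,   0.1),
  c(0.1, 0.1, 0,   0)
)
ematrix <- rbind(
  c(0,   0.1, 0,   0),
  c(0.1, 0,   0.1, 0.1),
  c(0,   0.1, 0,   0.1),
  c(0,   0,   0,   0)
)

# Hypothetical helper: count the nonzero off-diagonal entries.
count_offdiag <- function(m) sum(m != 0 & row(m) != col(m))

n_q   <- count_offdiag(qmatrix)  # 8 transition intensities
n_e   <- count_offdiag(ematrix)  # 6 misclassification probabilities
n_cov <- 3L                      # age, sex, bmi (one coefficient each, assumed)

q_idx   <- seq_len(n_q)                        # 1:8
cov_idx <- n_q + seq_len(n_cov * n_q)          # 9:32
e_idx   <- n_q + n_cov * n_q + seq_len(n_e)    # 33:38

# "Ordered by intensities within covariates": one column per covariate.
cov_by_covariate <- matrix(cov_idx, nrow = n_q,
                           dimnames = list(NULL, c("age", "sex", "bmi")))

print(range(q_idx))
print(range(cov_idx))
print(range(e_idx))
print(cov_by_covariate)
```

Under these assumptions, fixedpars = e_idx would fix all six misclassification probabilities, and a subset such as e_idx[1:2] would fix only the first two.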

Cheers!

sophiakrix avatar May 06 '24 09:05 sophiakrix