msm icon indicating copy to clipboard operation
msm copied to clipboard

numerical overflow in calculating likelihood

Open ghost opened this issue 8 years ago • 25 comments

sppb.msm <-msm(state_1 ~ days, subject = symbol, data = data,

  •            qmatrix = qmat0, death = 3, method = "BFGS", 
    
  •            control = list(fnscale = 4000, maxit = 10000),
    
  •            covariates = ~ PB)
    

Error in Ccall.msm(params, do.what = "lik", ...) : numerical overflow in calculating likelihood

ghost avatar Feb 17 '17 06:02 ghost

Again, a reproducible example might help (PB isn't in the data you sent me). Have you tried all the tricks in the "convergence failure" section of the PDF manual? Perhaps try rescaling the covariates to have SD around 1, or rescaling the time variable in months instead of years?

The error message is not helpful so I'll leave the issue open, but this kind of error message is common and really difficult to prevent in a general case.

chjackson avatar Feb 18 '17 22:02 chjackson

Have the same problem with different data sets. You can get a reproducible example even with provided cav data set:

> cavsex.msm <- msm(state ~ years, subject=PTNUM, data=cav, qmatrix=Q, deathexact=4, covariates = ~age)
Error in Ccall.msm(params, do.what = "lik", ...) : 
  numerical overflow in calculating likelihood

AlexKram avatar Jun 09 '18 08:06 AlexKram

Try setting control=list(fnscale=4000) in the call to msm. Since the -2 log likelihood is around 4000 in this example, this normalises the -2 log-likelihood during optimisation to a scale of around 1, so that overflow is less likely to occur.

chjackson avatar Jun 10 '18 20:06 chjackson

I have experimented with doing this rescaling automatically for all model fits, but it seemed to cause worse convergence for as many examples as it improved.

chjackson avatar Jun 10 '18 20:06 chjackson

I'm facing the same problem when I use continuos covariates. The fnscale option doesn't seem to work with the cav dataset or with others datasets.

hbg26 avatar Nov 02 '18 13:11 hbg26

Could you give a reproducible example please @hbg26

chjackson avatar Nov 02 '18 13:11 chjackson

Using the cav dataset:

cav.age.msm <- msm( state ~ years, subject=PTNUM, data = cav, qmatrix = Q, deathexact = 4, covariates = ~age, obstype = 1, control=list(fnscale=4000))

I tried using different values for fnscale and maxit but didn't work.

hbg26 avatar Nov 02 '18 13:11 hbg26

That works for me, using

Q<-rbind(c(0,0.25,0,0.25), c(0.166,0,0.166,0.166),c(0,0.25,0.25,0), c(0,0,0,0))

and the latest CRAN msm, R 3.4.4 on Linux. If it hadn't worked I would have tried rescaling the covariate:

age10 <- (cav$age - mean(cav$age))/10

chjackson avatar Nov 02 '18 14:11 chjackson

The code to set the allowable transitions has proven essential for me. Try playing with those numbers, e.g. change the values within Q<-rbind

28sophiesworld avatar Jan 19 '19 18:01 28sophiesworld

Great. The code of control=list(fnscale=4000) solved my problem of numerical overflow and control =list(reltol = 1e-16) solved false convergence.

seairis avatar Mar 01 '19 18:03 seairis

numerical overflow in calculating likelihood,,, How can i correct it?

Heart-wang avatar Dec 02 '19 08:12 Heart-wang

wsex.msm<-msm(state~years,subject=id,data=w,qmatrix=Q,covariates=~sex,method="BFGS",control=list(fnscale=4000,maxit = 10000)) Warning message: In msm(state ~ years, subject = id, data = w, qmatrix = Q, covariates = ~sex, : Optimisation has probably not converged to the maximum likelihood - Hessian is not positive definite.

Heart-wang avatar Dec 04 '19 08:12 Heart-wang

@Heart-wang A reproducible example would help others to give advice. Though please first try some of the tricks above in the thread, and in the "Convergence failure" section of the PDF manual

chjackson avatar Dec 04 '19 09:12 chjackson

When there are many states of the disease we are studying, such as the progress of the metabolic syndrome , eight states are set according to its diagnostic criteria. How should we set Q properly?

Heart-wang avatar Jan 07 '20 05:01 Heart-wang

The text file contains some code that generates synthetic data to reproduce the msm error Error in Ccall.msm(params, do.what = "lik", ...) : numerical overflow in calculating likelihood repex_msm_0verflow.txt

joseph-rickert avatar Jun 02 '20 18:06 joseph-rickert

Thanks for the reproducible example. The problem there is that the model is not identifiable. The consequence is that the optimiser is exploring a log-likelihood surface that is a flat function of the parameters. Therefore massive differences in the parameters will produce tiny differences in the likelihood. To try to find the maximum likelihood, it makes massive jumps in the parameter space. At the second iteration it visits log intensity values of 4000 - which results in the intensity overflowing, and the error message.

The example looks like it's trying to fit a discrete-time Markov model to discrete-time data. I'm not sure msm is generally appropriate for this because it fits continuous-time models, but if you must, then it's usually more appropriate to use exacttimes=TRUE, which indicates that all transitions occur at the observation times. If this is omitted, then it assumes the data are snapshots of a continuous-time process - that doesn't work here because there are a practically infinitely many continuous-time models that are equally compatible with the data. Particularly if all transitions are permitted in real time, then people can move back and forth around the states any number of times between the discrete-time observations.

I could add something to the error message, like "Perhaps the model is not identifiable - check, for example, that the transition structure and observation type are appropriate for the data". But I'm not sure whether this is the only (or the dominant) reason that this message would be triggered - if there are other reasons, then the message may be confusing. A wider range of examples would help to clarify this.

chjackson avatar Jun 03 '20 09:06 chjackson

thanks

Heart-wang avatar Jun 03 '20 10:06 Heart-wang

Hello Professor Jackson, Thank you for the quick reply. Yes, I can see that it is an identifiability problem. In fact, I have observed that constraining the transition matrix is helpful. It is not too hard to find a small, random, 3X3 birth-death process, transition matrix that msm() will fit for small synthetic data sets. For example, here is a TM and Q matrix pair that for 200 rows and 5 columns, msm() may produce results, or terminate with a warning that the algorithm may not have converged, but for which I have so far not observed the overflow error. Setting r to 2000 and c to 10 will generate the overflow error.

This behavior is curious. I did try to look through the msm code to see if there was a place to determine where things start to go South, but this is beyond me.

TM_1 <- matrix(c(0.5367896, 0.4632104, 0.0000000, 0.3617935, 0.2929464, 0.3452601, 0.0000000, 0.5953339, 0.4046661), nrow = 3, byrow = TRUE)

Q <- matrix(c(0,1,0,1,0,1,0,1,0),nrow=3,byrow=TRUE)

Also, it seems to be even easier to stumble onto small, random transition matrices with absorbing states that msm() can handle.

I did not set exacttimes=TRUE because I convinced myself that I was on to a simple way of simulating the Jump Chain of a continuous time process. I got this notion from a naive leap of faith after noticing that the methods used in the ctmcd package only need the raw transition counts or relative counts to drive them.

My current hypothesis as to why my random data sets are so different from the several real data sets you provide in the package, is that so far I have been working with overly simplistic transition times. However, I can find nothing in the theory of continuous chains to back this up.

Anyway, thank you for your work. msm is an amazing package and I think a real R treasure.

Best regards, Joe Rickert

Joseph B. Rickert RStudio R Community Ambassador R Consortium Director Cell: 408.489.0566

On Wed, Jun 3, 2020 at 3:17 AM Heart-wang [email protected] wrote:

thanks

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/chjackson/msm/issues/5#issuecomment-638102119, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAN22VWQOB73UVSMIXP3F3TRUYPMZANCNFSM4DAPX77A .

joseph-rickert avatar Jun 03 '20 22:06 joseph-rickert

I'm also running into the same issue. It is definitely an identifiability issue as Prof Jackson eloquently described above. I believe I can handle this in an ad hoc manner by collapsing states and reducing covariates. Also, the control=list(fnscale=#) has been very helpful. For my problem I used # = 200,000 base on my -2 * loglik scale.

I'm wondering 2 things:

  1. Could the identifiability issue be solved via a Bayesian approach? My hunch is that sparsity is less of an issue since in that case we're simply nudging priors if the transitions are rare, rather than trying to find a true maximum likelihood in a sparse parameter space. If so, are there packages available in R for Bayesian Markov modeling (Would prefer not to go to WINBUGs)

  2. Is this, in part, the motivation for the "misclassification" approach for hidden Markov models? By assuming misclassification, we can reduce the number of parameters, and therefore it is more likely this reduced set of parameters is identifiable. That is, the parameter space is shrunk and the maximum of the likelihood can be found.

alexocampo907 avatar Oct 06 '20 09:10 alexocampo907

I expect Bayesian approaches would help if the data are weak, but only if you have substantive information to go into the prior. I'm not aware of any package that does continuous-time Markov models. Though Stan is better than WinBUGS for implementing general models.

edited April 2024: there is now the msmbayes package which implements some Bayesian continuous-time multistate models, but is not as general as msm.

Misclassification models have more parameters than non-hidden Markov models..

chjackson avatar Oct 06 '20 09:10 chjackson

That makes sense. Also, (facepalm)... of course they have more parameters. Thanks!

alexocampo907 avatar Oct 06 '20 09:10 alexocampo907

Hi professor Jackson, I try to use msm package for purchasing data that can find behavior and my models is fitting for 7 states and I want to expand on 20 states, I stock in nstates=8 and I got a lot of error even when I increased number of individuals and number of iterations of person who purchasing. and get this error again and again. Also, Can you recommand me best way to find the initial values, I'm still struggling with that to find the best for each states.

nstates <- 8

Q8 <- matrix(c(rep(1/nstates, nstates^2)), byrow=TRUE, nrow=nstates)

hmodel8 <- list(hmmMV(hmmPois(0.04),hmmPois(0.02)), hmmMV(hmmPois(0.09),hmmPois(0.02)), hmmMV(hmmPois(0.01), hmmPois( 0.03)), hmmMV(hmmPois(0.05), hmmPois(0.02)), hmmMV(hmmPois(0.08), hmmPois(0.05)), hmmMV(hmmPois(0.04),hmmPois(0.01)), hmmMV(hmmPois(0.07),hmmPois(0.01)), hmmMV(hmmPois(0.59), hmmPois(0.04)))

model_poisson8 <- msm(cbind(floor(10Vegetables),floor(10Snacks)) ~ Duration,

  •                   subject = id, data = purchases.subset,
    
  •                   hmodel = hmodel8, qmatrix = Q8, control=list(fnscale=1000,maxit=10000))
    

Error in Ccall.msm(params, do.what = "lik", ...) : numerical overflow in calculating likelihood

HaniyehDanesh avatar Nov 15 '23 09:11 HaniyehDanesh

@HaniyehDanesh This model is almost certainly not identifiable from the data. You are allowing transitions between every state, assuming the state is intermittently observed, and assuming the states are misclassified. I guess you are confusing discrete and continuous time - see the course notes for an explanation of this issue.

Better to start with small, simple models - models that are small enough for you to understand how the data informs them, check the results make sense. Then build them up in stages. When you say your "model is fitting" for 7 states, have you checked that the parameter estimates are realistic? Often the optimiser will claim to converge, but the results are extremely large confidence intervals, hence the estimates are unreliable and the model is not useful.

chjackson avatar Nov 15 '23 09:11 chjackson

Thank you for answering professor.

I still misunderstanding about "I guess you are confusing discrete and continuous time", we have purchasing time based on month for example 1,...,24 (this is our time). You mean I should use stateble.msm() or changing my transition matrix and initial values?

HaniyehDanesh avatar Nov 15 '23 11:11 HaniyehDanesh

@HaniyehDanesh As I said, please work through the course notes to understand all this better. Consider what transitions in continuous time can happen, and consider what your observations tell you about the state of the process at each point in continuous time.

chjackson avatar Nov 15 '23 12:11 chjackson