
Different results from R (flexsurvreg package) and Python (lifelines) estimation

Open · aflatouniana opened this issue on Apr 13 '21 · 1 comment

Hi,

I am trying to replicate the estimation results of a log-logistic AFT fit from R (flexsurvreg package) with lifelines' LogLogisticAFTFitter(). Here are the calls:

R:

loglogistic <- flexsurvreg(Surv(spell, event) ~ x, dist="llogis")

Python:

LogLogisticAFTFitter._scipy_fit_method = 'BFGS'
model_BFGS = LogLogisticAFTFitter()

To make the comparison reproducible, I've used a standard dataset that I can share: https://docs.google.com/file/d/0BwogTI8d6EEiM2stZzdFNXdxM00

The model parameters are: event_col = 'event', duration_col = 'spell', and variables = ['logwage', 'ui', 'age'] (these correspond to x in the R formula above).

So the thing I've tried to change so far is the fit method, since I realized the default in R is 'BFGS'. It also seems that the initial points are different; however, I couldn't work out how each package defines them. The results are only slightly different, but for my purposes the decimals matter. Do you have a clear idea of which default options could cause these differences?
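For reference, here is roughly the full Python call I'm making (a sketch; the commented initial_point line only illustrates where a custom starting value would go, it is not what flexsurvreg uses):

from lifelines import LogLogisticAFTFitter

model_BFGS = LogLogisticAFTFitter()
model_BFGS._scipy_fit_method = 'BFGS'    # match R's default optimizer
model_BFGS.fit(
    df,                                   # the shared dataset linked above
    duration_col='spell',
    event_col='event',
    formula='logwage + ui + age',
    # initial_point=...,                  # placeholder: array of starting values for all 5 parameters
)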

Thanks :)

aflatouniana · Apr 13 '21 11:04

Hi @aflatouniana, I was able to replicate the results locally:

Python:

from lifelines import LogLogisticAFTFitter

# df is the shared dataset from the link above
ll = LogLogisticAFTFitter()
ll.fit(df, "spell", "event", formula="logwage + ui + age")
ll.print_summary(decimals=4, columns=["coef", "exp(coef)"])

"""
<lifelines.LogLogisticAFTFitter: fitted with 3343 total observations, 2270 right-censored observations>
             duration col = 'spell'
                event col = 'event'
   number of observations = 3343
number of events observed = 1073
           log-likelihood = -4014.1116
         time fit was run = 2021-04-13 14:57:13 UTC

---
                    coef  exp(coef)
param  covariate
alpha_ Intercept  4.0336    56.4631
       age        0.0107     1.0107
       logwage   -0.4613     0.6305
       ui         1.2083     3.3477
beta_  Intercept  0.3071     1.3595
---
Concordance = 0.6933
AIC = 8038.2232
log-likelihood ratio test = 435.7020 on 3 df
-log2(p) of ll-ratio test = 310.2314
"""

R:

library(flexsurv)

# df is the same shared dataset
fit <- flexsurvreg(Surv(spell, event) ~ logwage + ui + age, dist="llogis", data=df)

flexsurvreg(formula = Surv(spell, event) ~ logwage + ui + age, 
    data = df, dist = "llogis")

Estimates: 
         data mean  est        L95%       U95%       se         exp(est)   L95%       U95%     
shape           NA    1.35845    1.29508    1.42492    0.03311         NA         NA         NA
scale           NA   56.88501   30.82689  104.97020   17.78109         NA         NA         NA
logwage    5.69299   -0.46205   -0.57284   -0.35125    0.05653    0.62999    0.56392    0.70380
ui         0.55280    1.21015    1.09373    1.32656    0.05940    3.35397    2.98540    3.76805
age       35.44331    0.01057    0.00486    0.01628    0.00291    1.01063    1.00487    1.01642

N = 3343,  Events: 1073,  Censored: 2270
Total time at risk: 20887
Log-likelihood = -4014.11, df = 5
AIC = 8038.221
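Side note on reading the two outputs side by side: lifelines reports the intercepts on the log scale, so exponentiating them recovers flexsurv's shape and scale estimates. A quick arithmetic check using the numbers printed above:

import math

# lifelines' alpha_ intercept (log scale) vs. flexsurv's `scale`
print(math.exp(4.0336))  # ~56.46, flexsurv reports 56.885
# lifelines' beta_ intercept (log shape) vs. flexsurv's `shape`
print(math.exp(0.3071))  # ~1.3595, flexsurv reports 1.3585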

To confirm: Does this look like what you get?

So I would first argue that these results are similar enough to be fully useful: the differences you see between the coefficients are orders of magnitude smaller than the standard errors.

However, you mention that the decimals matter. May I ask why? If I had to give a reason why these differences exist, it's likely the stopping conditions in the fitting algorithms, for example, the thresholds at which each optimizer decides it is no longer making meaningful descent.
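If agreement to more decimal places really is needed, one thing to try, as a sketch, is tightening the stopping tolerance lifelines hands to scipy.optimize.minimize. The _scipy_fit_options attribute below mirrors the _scipy_fit_method attribute you already set, but treat the exact attribute name and option keys as assumptions to verify against your lifelines version:

from lifelines import LogLogisticAFTFitter

ll = LogLogisticAFTFitter()
ll._scipy_fit_method = "BFGS"
# forwarded to scipy.optimize.minimize(..., options=...); a stricter gradient
# tolerance makes BFGS iterate longer before declaring convergence
ll._scipy_fit_options = {"gtol": 1e-10, "maxiter": 500}
ll.fit(df, "spell", "event", formula="logwage + ui + age")

On the R side, I believe flexsurvreg forwards optimizer controls to optim, so something like control=list(reltol=1e-12) would be the analogous knob; tightening both sides should shrink the remaining difference in the decimals.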

CamDavidsonPilon · Apr 13 '21 15:04