Handling of observations with birth_time==death_time

Open • user799595 opened this issue 1 year ago • 1 comment

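import lifelines
# Subject 1 enters at t=1 and dies at t=1; subject 2 enters at t=0 and is censored at t=2.
kmf = lifelines.KaplanMeierFitter()
kmf.fit([1, 2], event_observed=[1, 0], entry=[1, 0])
print(kmf.survival_function_)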

Expected:

          KM_estimate
timeline             
0.0               1.0
1.0               0.5
2.0               0.5

Actual:

          KM_estimate
timeline             
0.0               1.0
1.0               0.0
2.0               0.0

I've read https://github.com/CamDavidsonPilon/lifelines/issues/497 and the corresponding comments in the source:

        # Why subtract entrants like this? see https://github.com/CamDavidsonPilon/lifelines/issues/497
        # specifically, we kill people, compute the ratio, and then "add" the entrants.
        # This can cause a problem if there are late entrants that enter but population=0, as
        # then we have log(0 - 0). We later ffill to fix this.
        # The only exception to this rule is the first period, where entrants happen _prior_ to deaths.

But I can't wrap my head around what this is saying. How could entrants not happen prior to deaths? If I have an observation with birth_time==death_time does that mean that it died before it was born?

I thought that the likelihood contribution of an observation with birth time $b$ and death time $d$ is

  • $P(T = d | T \ge b)$ for observed events
  • $P(T > d | T \ge b)$ for unobserved events
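To make the two orderings concrete, here is a minimal hand-rolled sketch of a product-limit estimate with late entry. It is not the lifelines implementation; `km_estimate` and its `entrants_first` flag are hypothetical names made up for illustration. With `entrants_first=True` an entrant at time t joins the risk set before deaths at t are counted, which matches the likelihood above. With `entrants_first=False` deaths are processed first and entrants are added afterwards, which is my reading of the quoted source comment.

```python
def km_estimate(durations, events, entries, entrants_first):
    """Toy product-limit estimate with left truncation (illustrative only)."""
    times = sorted(set(durations) | set(entries))
    at_risk = 0
    survival = 1.0
    curve = {}
    for t in times:
        entering = sum(1 for e in entries if e == t)
        deaths = sum(1 for d, ev in zip(durations, events) if d == t and ev)
        leaving = sum(1 for d in durations if d == t)  # deaths and censorings
        if entrants_first:
            at_risk += entering
        if deaths and at_risk > 0:
            survival *= 1.0 - deaths / at_risk
        if not entrants_first:
            at_risk += entering
        at_risk -= leaving
        curve[t] = survival
    return curve


durations, events, entries = [1, 2], [1, 0], [1, 0]

# Entrants join the risk set before deaths at the same time: 1 death out of 2 at risk.
entrants_first_curve = km_estimate(durations, events, entries, entrants_first=True)
print(entrants_first_curve)  # {0: 1.0, 1: 0.5, 2: 0.5}

# Deaths are counted before entrants: 1 death out of 1 at risk.
deaths_first_curve = km_estimate(durations, events, entries, entrants_first=False)
print(deaths_first_curve)  # {0: 1.0, 1: 0.0, 2: 0.0}
```

On this two-subject example the first ordering reproduces my expected table and the second reproduces the actual output, so the difference comes down to whether the subject entering at t=1 is in the denominator when its own death at t=1 is counted.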

user799595, Apr 11 '24 21:04

This is an interesting issue, and I want to agree with your expected case. However, I'm also inclined to reject the case birth_time==death_time as pathological for lifelines. Based on the comment you highlighted, it sounds like an entry at birth_time is effectively treated as entering at birth_time + \epsilon. So if you want a true birth_time==death_time, you would add an epsilon to the death time:

import lifelines
kmf = lifelines.KaplanMeierFitter()
# Shift the first death time just past its entry time.
kmf.fit([1+1e-10, 2], event_observed=[1, 0], entry=[1, 0])
print(kmf.survival_function_)

Output:

          KM_estimate
timeline
0.0               1.0
1.0               1.0
1.0               0.5
2.0               0.5

This is terrible and not at all how I expect users to fix this. I'll have to think more about this.

CamDavidsonPilon, Jun 26 '24 02:06