
kmf.fit -> is_left_censoring?

Open • GrowthJeff opened this issue 5 years ago • 1 comment

Hey there!

In #1181 I corrected the "code" used in the left censoring example to ensure that a CDF was plotted. In this example the reader is told, "Instead of producing a survival function, left-censored data analysis is more interested in the cumulative density function."

Since survival analysis seems like magic to me, with its ability to see beyond the veil (left or right censored data), I opened up kaplan_meier_fitter.py to see what's going on. I see this line of code, which suggests to me that the original intention of kmf.plot() on left censored data was to plot a cumulative density function.

        # if the user is interested in left-censorship, we return the cumulative_density_, no survival_function_,
        is_left_censoring = CensoringType.is_left_censoring(self)
        primary_estimate_name = "survival_function_" if not is_left_censoring else "cumulative_density_"
        secondary_estimate_name = "cumulative_density_" if not is_left_censoring else "survival_function_"

I'm struggling to make sense of the syntax here, but the general idea seems to be:

  • If left censored -> go with cumulative_density
  • If not left censored -> go with survival function
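For anyone else tripped up by the syntax: `A if condition else B` is Python's conditional expression, which evaluates to `A` when the condition is true and to `B` otherwise. Below is a minimal standalone sketch of just that selection logic. The function name `estimate_names` is hypothetical (it is not part of lifelines), and a plain boolean stands in for the `CensoringType.is_left_censoring(self)` call.

```python
def estimate_names(is_left_censoring: bool) -> tuple:
    """Return (primary, secondary) estimate names for a given censoring flag.

    Hypothetical helper for illustration only; it mirrors the two
    conditional expressions quoted from kaplan_meier_fitter.py.
    """
    # `A if cond else B` yields A when cond is true, otherwise B.
    primary = "survival_function_" if not is_left_censoring else "cumulative_density_"
    secondary = "cumulative_density_" if not is_left_censoring else "survival_function_"
    return primary, secondary


# Not left censored: the survival function is the primary estimate.
print(estimate_names(False))  # ('survival_function_', 'cumulative_density_')

# Left censored: the cumulative density is the primary estimate.
print(estimate_names(True))   # ('cumulative_density_', 'survival_function_')
```

This only shows which name gets picked as "primary"; it says nothing about what `plot()` ends up drawing, which is the open question here.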

But that doesn't seem to be how the code operates today based on #1181.

  1. Do we want kmf.plot() to automatically do a CDF for left censored fits? I don't think so based on discussion in #1180.
  2. Is the above code doing what it's supposed to? 🤔

GrowthJeff, Dec 04 '20 16:12

I actually apologize - previous Cameron way over-engineered this class and made us all confused today 🙃

Anyways, I think the correct behaviour is as follows:

  • plot should probably always return the SF. This aligns with the other estimators being consistent between censoring types.
  • plot is perhaps a bad method: the other methods (plot_survival_function, plot_cumulative_hazard,...) are more descriptive and have 0 ambiguity. plot does different things for different models! Yuck!

I may, in the future, drop plot altogether so no future user will be confused by issues like this.

CamDavidsonPilon, Dec 04 '20 18:12