censored icon indicating copy to clipboard operation
censored copied to clipboard

Refactor functions to calculate survival probability

Open hfrick opened this issue 2 years ago • 5 comments

The predict() function has an argument time which captures the times at which to predict the survival probability. The functions to do the calculations also have arguments or objects time and times which sometimes also refer to the event times in the data. They'd be easier to read if the destinction between those different times would be clearer, e.g., via their name.

hfrick avatar Feb 01 '22 14:02 hfrick

check if floor_surv_mboost() could use some of the functionality for cutting the KM estimates a la survival_prob_cph() so that we use the same logic everywhere

hfrick avatar Feb 07 '22 16:02 hfrick

and survival probabilities for bag_tree(engine = "rpart")

hfrick avatar Feb 10 '22 18:02 hfrick

The predict() function has an argument time which captures the times at which to predict the survival probability. The functions to do the calculations also have arguments or objects time and times which sometimes also refer to the event times in the data. They'd be easier to read if the destinction between those different times would be clearer, e.g., via their name.

I agree with you and also think the term times for survival prediction also makes it easy to misinterpret what is being predicted. For example, it would be reasonable to think we are predicting survival probability at 10 years if times = 10 years (i.e., probability of surviving year 10 given you survived up to year 10), but the thing being predicted is the probability of surviving up to and including this time.

Do you think using a term like time_horizon or pred_horizon in predict() would help? It may clarify that this time input indicates the end of a period where survival probability is estimated and could also help differentiate this input from the time input used for event times.

bcjaeger avatar Aug 28 '22 18:08 bcjaeger

I think time as an argument name for predict() works.

There is no other argument to predict() that would refer to the event times so there isn't an opportunity for confusion there. (I was referring to objects inside of internal helper functions when I opened this issue.)

Regarding the topic of "at time point t" vs "up and including time point t": I think it's okay to lean on the standard definition of survival probability S(t) and call the argument for t time.

hfrick avatar Sep 02 '22 13:09 hfrick

I was referring to objects inside of internal helper functions when I opened this issue.

That makes sense! Sorry, I misinterpreted your thought above - I think I understand what you mean now.

bcjaeger avatar Sep 02 '22 23:09 bcjaeger

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

github-actions[bot] avatar Oct 28 '22 01:10 github-actions[bot]