censored
censored copied to clipboard
Refactor functions to calculate survival probability
The predict()
function has an argument time
which captures the times at which to predict the survival probability. The functions to do the calculations also have arguments or objects time
and times
which sometimes also refer to the event times in the data. They'd be easier to read if the destinction between those different times would be clearer, e.g., via their name.
check if floor_surv_mboost()
could use some of the functionality for cutting the KM estimates a la survival_prob_cph()
so that we use the same logic everywhere
and survival probabilities for bag_tree(engine = "rpart")
The predict() function has an argument time which captures the times at which to predict the survival probability. The functions to do the calculations also have arguments or objects time and times which sometimes also refer to the event times in the data. They'd be easier to read if the destinction between those different times would be clearer, e.g., via their name.
I agree with you and also think the term times
for survival prediction also makes it easy to misinterpret what is being predicted. For example, it would be reasonable to think we are predicting survival probability at 10 years if times = 10 years
(i.e., probability of surviving year 10 given you survived up to year 10), but the thing being predicted is the probability of surviving up to and including this time.
Do you think using a term like time_horizon
or pred_horizon
in predict()
would help? It may clarify that this time input indicates the end of a period where survival probability is estimated and could also help differentiate this input from the time
input used for event times.
I think time
as an argument name for predict()
works.
There is no other argument to predict()
that would refer to the event times so there isn't an opportunity for confusion there. (I was referring to objects inside of internal helper functions when I opened this issue.)
Regarding the topic of "at time point t" vs "up and including time point t": I think it's okay to lean on the standard definition of survival probability S(t) and call the argument for t time
.
I was referring to objects inside of internal helper functions when I opened this issue.
That makes sense! Sorry, I misinterpreted your thought above - I think I understand what you mean now.
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.