
Why is `batch` not appropriate in survival analysis?

Open bnuzyc91 opened this issue 4 years ago • 2 comments

@liupei101 in dsl.py you added this comment: "Since style of batch is not appropriate in survival analysis."

Could you please explain why batch (mini-batch) training is not appropriate in survival analysis? I notice that DeepSurv (https://github.com/jaredleekatzman/DeepSurv ) also does not support mini-batch training.

Based on my understanding, mini-batch training should be suitable for any ML problem as long as we train for enough epochs. Feel free to correct me if I have misunderstood this.

Thank you so much!

bnuzyc91 avatar Feb 07 '21 10:02 bnuzyc91

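To put it straightforwardly, if we dive into the loss function (the partial likelihood), survival analysis is a variant of a ranking task: each observed event is compared against every sample still at risk at that time, so the loss does not split into independent per-sample terms.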
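To make the ranking structure concrete, here is a minimal sketch of the negative Cox partial log-likelihood in plain Python. It is not the TFDeepSurv implementation; the function name `neg_log_partial_likelihood` and the toy data are hypothetical, and ties are not handled. It shows that the loss on the full data is not the sum of the losses on two mini-batches, because each risk set shrinks when the data is split.

```python
import math

def neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood (no tie correction).

    risk  : predicted log-risk scores, one per sample
    time  : observed follow-up times
    event : 1 if the event was observed, 0 if censored
    """
    total = 0.0
    for i in range(len(risk)):
        if event[i] != 1:
            continue  # censored samples only appear inside risk sets
        # Risk set: every sample still under observation at time[i].
        log_denominator = math.log(
            sum(math.exp(risk[j]) for j in range(len(risk)) if time[j] >= time[i])
        )
        total -= risk[i] - log_denominator
    return total

# Hypothetical toy data, for illustration only.
risk = [0.5, -0.2, 1.0, 0.1]
time = [2.0, 5.0, 3.0, 8.0]
event = [1, 0, 1, 1]

full = neg_log_partial_likelihood(risk, time, event)
# Split the same data into two "mini-batches" and sum their losses.
batch_a = neg_log_partial_likelihood(risk[:2], time[:2], event[:2])
batch_b = neg_log_partial_likelihood(risk[2:], time[2:], event[2:])
print(full, batch_a + batch_b)  # the two values differ
```

With a per-sample loss such as mean squared error, the two printed values would be equal. Here they are not, so a mini-batch gradient is computed against truncated risk sets rather than the ones the full-data loss uses.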

Since we did not find any reference paper that supports batch training in survival analysis, the code has remained as you see it.

Thanks for your feedback! TFDeepSurv implements the survival analysis pipeline, but its shortcomings are easy to find. We need to add functionality that supports:

  • GPU training
  • mini-batch training (experimentally) to speed up the training phase!

These plans will not be implemented in the near future. Sorry about that!

PRs are welcome!

liupei101 avatar Mar 05 '21 02:03 liupei101

Thank you so much for the explanation. Now I agree that mini-batch training may not be a good idea for survival analysis, especially when the event is rare (the chance of developing the outcome is low).
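A rough way to see the rare-event problem, assuming samples are drawn independently into each batch: a batch with no observed events contributes no terms to the partial likelihood, so it yields no gradient at all. The helper name and the numbers below are hypothetical and only illustrate the arithmetic.

```python
def prob_batch_without_events(event_rate, batch_size):
    """Probability that a randomly drawn batch contains no observed event.

    Assumes each sample is an event independently with probability event_rate.
    """
    return (1.0 - event_rate) ** batch_size

# Hypothetical setting: 1% of samples experience the event, batches of 32.
p_empty = prob_batch_without_events(event_rate=0.01, batch_size=32)
print(f"{p_empty:.2f}")  # fraction of batches expected to carry no event
```

Under these assumptions, roughly seven batches in ten would contain only censored samples, which is one reason larger batches or full-batch training are often preferred when the outcome is rare.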

bnuzyc91 avatar Mar 08 '21 03:03 bnuzyc91