No loss when training with a small batch size

Open yikuanli opened this issue 2 years ago • 5 comments

When I train the DeepSurv model with a small batch size (64), no loss is reported and training always stops early.

[screenshot: WX20220127-103418]

However, if I train with a larger batch size (1024), it almost always trains properly. [screenshot: WX20220127-103508]

May I ask what the possible reason might be? I am not familiar with the progress bar.

yikuanli, Jan 27 '22 10:01

That is very strange. I would expect a loss to always be produced. Can you post a full example that reproduces the issue?

havakv, Jan 29 '22 12:01

Thanks for the quick response. I have identified the issue: when the batch size is small, a batch can contain zero positive events. Since the loss divides by the number of positive events, this can make the loss explode. I have added a small number to the denominator in my local branch to avoid this.

yikuanli, Jan 29 '22 13:01
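
The patch itself is not posted in this thread. As an illustration of the idea described above, here is a minimal plain-Python sketch of a Cox partial-likelihood loss with a small constant added to the denominator. It is not pycox's code and not the poster's actual fix: the function name `cox_ph_loss_sorted_sketch`, the `eps` default, and the assumption that the batch is already sorted by descending duration are all choices made for this example.

```python
import math


def cox_ph_loss_sorted_sketch(log_h, events, eps=1e-7):
    """Negative Cox partial log-likelihood for one batch (illustrative only).

    Assumes the batch is sorted by descending duration, so the risk set
    of sample i is samples 0..i. `log_h` holds the predicted log-hazards,
    `events` holds 1 for an observed event and 0 for a censored sample.
    """
    gamma = max(log_h)   # subtract the max before exp() for numerical stability
    risk_sum = 0.0       # running sum of exp(log_h - gamma) over the risk set
    total = 0.0          # sum of per-event log-likelihood terms
    for h, e in zip(log_h, events):
        risk_sum += math.exp(h - gamma)
        log_risk = math.log(risk_sum) + gamma
        total += e * (h - log_risk)
    # eps keeps the denominator non-zero when the batch contains no events.
    return -total / (sum(events) + eps)


# A batch with no positive events: the numerator is zero, so the loss is zero
# instead of an undefined 0/0.
no_event_loss = cox_ph_loss_sorted_sketch([0.2, -0.1, 0.5], [0, 0, 0])

# A batch with events: eps is negligible next to the event count.
normal_loss = cox_ph_loss_sorted_sketch([0.0, 0.0], [1, 1])
```

With `eps=0.0` and a batch without events, the last line of the function computes 0/0. Plain Python raises `ZeroDivisionError` there, while floating-point tensor division in PyTorch returns NaN, which would be consistent with the missing loss and early stopping reported above. Note that with this kind of fix a batch without events contributes a loss of zero and therefore no gradient.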

@yikuanli can you share your solution here?

juancq, Jun 06 '22 03:06

Hi @yikuanli, I would also like to know.

Minxiangliu, Jul 15 '22 04:07