s2e-coref

Vanishing gradients?

Open · Twim17 opened this issue 1 year ago • 1 comment

Hi, I was testing this model, and during training I noticed that the training loss goes to zero very quickly and then becomes unstable (staying at zero most of the time and occasionally jumping to higher values). I investigated a little with wandb to look at the gradients, and it seems to me that the gradients could be vanishing. So my question is: did you actually see vanishing gradients? Did you also have such an unstable loss (at least in the first epochs)?

Twim17, Sep 14 '23 14:09

Anybody?

Twim17, Sep 19 '23 13:09