
Gradients are clipped before the unscaling

marcovisentin opened this issue 1 year ago · 1 comment

At lines 114-115 in train.py, the gradients are clipped before they are unscaled. I believe `scaler.unscale_(optimizer)` should be added before the gradient clipping.
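
Something like this, a rough sketch reusing the `grad_scaler`, `model`, and `gradient_clipping` names as I read them in train.py (this also matches the pattern recommended in the PyTorch AMP docs for combining `GradScaler` with gradient clipping):

```python
optimizer.zero_grad(set_to_none=True)
grad_scaler.scale(loss).backward()
grad_scaler.unscale_(optimizer)  # added: unscale so clipping sees the true gradient norms
torch.nn.utils.clip_grad_norm_(model.parameters(), gradient_clipping)
grad_scaler.step(optimizer)  # skips its internal unscaling, since it was already done manually
grad_scaler.update()
```

Without the `unscale_` call, `clip_grad_norm_` operates on the scaled gradients, so the clipping threshold effectively changes whenever the scale factor changes.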

marcovisentin · Nov 04 '23

In my opinion, scaler.step(optimizer) already includes the unscaling, and it does two things. First, it unscales the gradients if you didn't unscale them manually beforehand. Second, it checks for overflows: if there are no NaN/Inf values, it executes the optimizer's step; if there are, it skips this iteration's parameter update. So I think clipping the gradients here makes no sense: gradient clipping only aims to avoid gradient explosion, but if a gradient explosion does occur, scaler.step will skip this iteration's parameter update anyway, so there is absolutely no need for clipping.
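
Here is a toy snippet (hypothetical model, CUDA assumed, just for illustration) showing the skipping behaviour I mean:

```python
import torch

# Minimal setup: a throwaway linear model and a GradScaler.
model = torch.nn.Linear(4, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(8, 4, device="cuda")
loss = model(x).sum()
scaler.scale(loss).backward()

# Simulate a gradient explosion by forcing an inf into a gradient.
model.weight.grad.fill_(float("inf"))

before = model.weight.detach().clone()
scaler.step(optimizer)  # internal unscale_ + inf/NaN check -> the step is skipped
scaler.update()         # the scale factor is reduced for the next iteration

print(torch.equal(before, model.weight.detach()))  # True: weights unchanged
```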

tensorctn · Jan 12 '24