
Confused by the code that uses gradient clipping and accumulation simultaneously

Open • 0HaNC opened this issue 4 years ago • 1 comment

Take args.iter_size == 2 as an example. I think the gradients your code ends up with after clipping and accumulation are clip(clip(grads1) + grads2), not clip(grads1 + grads2), and the latter makes more sense to me.

I haven't run the code yet; I just wonder whether this is a problem.
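
To make the difference concrete, here is a toy sketch in plain Python. It is not the repository's code: it assumes norm-based clipping (in the spirit of PyTorch's torch.nn.utils.clip_grad_norm_), and the helper names, the threshold, and the gradient values are all made up for illustration.

```python
import math

def clip_by_norm(grad, max_norm):
    """Return a copy of grad rescaled so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

def add(a, b):
    """Element-wise sum of two gradient vectors."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical values: two micro-batch gradients for iter_size == 2.
max_norm = 1.0
grads1 = [3.0, 4.0]
grads2 = [0.0, 0.5]

# Clipping after every backward pass: clip(clip(grads1) + grads2)
clip_every_step = clip_by_norm(add(clip_by_norm(grads1, max_norm), grads2), max_norm)

# Clipping once, after accumulation is finished: clip(grads1 + grads2)
clip_once = clip_by_norm(add(grads1, grads2), max_norm)

print("clip(clip(grads1) + grads2):", clip_every_step)
print("clip(grads1 + grads2):      ", clip_once)
```

Both results have the same norm, but they point in different directions: in the first ordering grads1 is shrunk before grads2 is added, so grads2 gets more relative weight than it does in the plain sum.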

0HaNC • Sep 17 '20 08:09

The second case indeed makes more sense. However, I am not sure if it would make a significant impact on the final performance of the model. I will update the code with the second setting and have a run later.

swathikirans • Sep 21 '20 06:09