gradnorm-pytorch How to properly set grad_norm

How to properly set grad_norm_parameters?

Open ekurtulus opened this issue 1 year ago • 0 comments

Let's assume that I have a single image feature extractor on top of which there are two linear classification heads. What should I set grad_norm_parameters to in this case? Is it the entire network?
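
For concreteness, the setup described here could be sketched as follows. This is a minimal PyTorch sketch under stated assumptions: the class name `TwoHeadClassifier`, the toy backbone, and all layer sizes are hypothetical, and the code does not call gradnorm-pytorch itself. It picks the weight of the last shared layer as one candidate, which is what the GradNorm paper generally suggests to save compute. Whether that is the right value for `grad_norm_parameters` in this library is the open question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadClassifier(nn.Module):
    """Hypothetical model: one shared feature extractor, two linear heads."""

    def __init__(self, feature_dim=32, num_classes_a=5, num_classes_b=3):
        super().__init__()
        # Stand-in for the image feature extractor shared by both tasks.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, feature_dim),
        )
        # Task-specific classification heads.
        self.head_a = nn.Linear(feature_dim, num_classes_a)
        self.head_b = nn.Linear(feature_dim, num_classes_b)

    def forward(self, images):
        features = self.backbone(images)
        return self.head_a(features), self.head_b(features)


torch.manual_seed(0)
model = TwoHeadClassifier()

# Assumption: the candidate is the weight of the last layer that both
# heads share, not the entire network and not the head weights.
shared_weight = model.backbone[-1].weight

# Random stand-in data for a batch of 4 images and two label sets.
images = torch.randn(4, 3, 16, 16)
labels_a = torch.randint(0, 5, (4,))
labels_b = torch.randint(0, 3, (4,))

logits_a, logits_b = model(images)
losses = [
    F.cross_entropy(logits_a, labels_a),
    F.cross_entropy(logits_b, labels_b),
]

# Per-task gradient norm with respect to the shared weight only.
grad_norms = [
    torch.autograd.grad(loss, shared_weight, retain_graph=True)[0].norm()
    for loss in losses
]
```

Both task losses depend on `shared_weight`, so each one yields a gradient norm there. A head's weight only receives gradient from its own loss, so it could not be used to compare the two tasks.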

Dec 25 '23 20:12 ekurtulus
