GradNorm

This is my demo of Chen et al., "GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks", ICML 2018.
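For readers who just want the gist of the method, below is a minimal sketch of GradNorm's task-weight update written against the paper rather than the code in this repo; the model, losses, and hyperparameters (`shared`, `heads`, `alpha`, the learning rate) are placeholders.

```python
# Minimal GradNorm sketch (not the exact code in this repo): compute per-task
# gradient norms on the last shared layer and update the task weights w_i.
import torch
import torch.nn.functional as F

alpha = 1.5                                    # asymmetry hyperparameter from the paper
shared = torch.nn.Linear(16, 16)               # last shared layer W
heads = torch.nn.ModuleList([torch.nn.Linear(16, 1) for _ in range(2)])
w = torch.nn.Parameter(torch.ones(2))          # task weights, initialised to 1
opt_w = torch.optim.Adam([w], lr=1e-2)

x, targets = torch.randn(8, 16), [torch.randn(8, 1), torch.randn(8, 1)]
feat = shared(x)
losses = torch.stack([F.mse_loss(h(feat), t) for h, t in zip(heads, targets)])
initial_losses = losses.detach()               # L_i(0): record at the first iteration

# G_i = || grad_W (w_i * L_i) || on the shared weights; create_graph=True lets
# the GradNorm loss below backpropagate into w
G = torch.stack([
    torch.autograd.grad(w[i] * losses[i], shared.weight,
                        retain_graph=True, create_graph=True)[0].norm()
    for i in range(len(losses))
])
G_avg = G.mean().detach()

# relative inverse training rates r_i and the GradNorm loss (target is a constant)
ratio = losses.detach() / initial_losses
r = ratio / ratio.mean()
gradnorm_loss = (G - G_avg * r.pow(alpha)).abs().sum()

opt_w.zero_grad()
gradnorm_loss.backward()
opt_w.step()
```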

GradNorm issues (5)

Hi, I use GradNorm in my segmentation and classification task. I want to use DistributedDataParallel to train it, but I get the error: "RuntimeError: derivative for batch_norm_backward_elemt is...

Hi author, I referred to your code, which includes the GradNorm part, and rewrote it for my own transformer-based model training. Everything is good, but as the iteration count grows, the...

![image](https://user-images.githubusercontent.com/24848148/163394875-8f541169-8510-4755-9b14-15d0c03bae5b.png) l1 and l2 are not weighted by w
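For context, in the paper the network itself is trained on the weighted sum of the task losses, so `l1` and `l2` would each be scaled by the corresponding entry of `w` before being added. A tiny sketch with placeholder tensors (the names `l1`, `l2`, `w` mirror the issue, not this repo's code):

```python
import torch

# placeholders standing in for the per-task losses and task weights in the issue
w = torch.nn.Parameter(torch.ones(2))
l1 = torch.randn((), requires_grad=True)
l2 = torch.randn((), requires_grad=True)

# total loss used to train the shared network: sum_i w_i * L_i
total_loss = w[0] * l1 + w[1] * l2
total_loss.backward()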

Hi, I am a bit confused about the update process of `w`. In the paper, only the sum of `w` is constrained to equal `task_num`, but nothing prevents...
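As far as I understand the paper, the only explicit constraint is a renormalisation applied after each update so that the weights sum to `task_num`; it does not by itself stop an individual weight from shrinking toward zero, which seems to be the concern here. A minimal sketch of that step (the values below are made up):

```python
import torch

task_num = 2
w = torch.nn.Parameter(torch.tensor([1.9, 0.6]))  # weights after a gradient step

# renormalisation from the paper: rescale so that sum_i w_i == task_num;
# only the sum is constrained, so one weight can still drift toward zero
with torch.no_grad():
    w.data = task_num * w.data / w.data.sum()
print(w)  # tensor([1.5200, 0.4800], requires_grad=True), sums to 2
```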

File "main_simmim_pt.py", line 302, in train_one_epoch G1R = torch.autograd.grad(L1, param[0].clone(), retain_graph=True, create_graph=True) File "D:\txj\envs\swin2\lib\site-packages\torch\autograd\__init__.py", line 236, in grad inputs, allow_unused, accumulate_grad=False) RuntimeError: One of the differentiated Tensors appears to not...