Sorry to bother you. I ran into a bug while running "heads_pruning.sh", and the error is:
12:21:27-INFO: ***** Running evaluation *****
12:21:27-INFO: Num examples = 9815
12:21:27-INFO: Batch size = 32
Evaluating: 0% 0/307 [00:00<?, ?it/s]Traceback (most recent call last):
  File "pytorch-pretrained-BERT/examples/run_classifier.py", line 585, in <module>
    main()
  File "pytorch-pretrained-BERT/examples/run_classifier.py", line 521, in main
    scorer=processor.scorer,
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/examples/classifier_eval.py", line 78, in evaluate
    input_ids, segment_ids, input_mask, label_ids)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 1072, in forward
    output_all_encoded_layers=False, return_att=return_att)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 769, in forward
    output_all_encoded_layers=output_all_encoded_layers)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 458, in forward
    hidden_states, attn = layer_module(hidden_states, attention_mask)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 441, in forward
    attention_output, attn = self.attention(hidden_states, attention_mask)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 335, in forward
    self_output, attn = self.self(input_tensor, attention_mask)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/XAI in NLP/pytorch-pretrained-BERT/pytorch_pretrained_bert/modeling.py", line 307, in forward
    self.context_layer_val.retain_grad()
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 326, in retain_grad
    raise RuntimeError("can't retain_grad on Tensor that has requires_grad=False")
RuntimeError: can't retain_grad on Tensor that has requires_grad=False
Evaluating: 0% 0/307 [00:00<?, ?it/s]
I don't know how to fix it. I hope you can help me!
Hi. This error occurs because retain_grad() is called on a tensor created inside a with torch.no_grad() block: during evaluation, gradient tracking is disabled, so the tensor has requires_grad=False and retain_grad() raises the RuntimeError you see.

The gradient value is only needed in the calculate_head_importance function (grad_ctx = ctx.grad), so everywhere except calculate_head_importance you can fix it by skipping self.context_layer_val.retain_grad(). Adding if context_layer.requires_grad: at line 309 of modeling.py works for me. Like:

if context_layer.requires_grad:
    self.context_layer_val.retain_grad()

Hope this solves your problem.
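For reference, the behavior behind this guard can be reproduced in isolation. The sketch below (a standalone example, not code from the repo; safe_retain_grad is a hypothetical helper name) shows that retain_grad() works on a non-leaf tensor during a normal forward pass but raises under torch.no_grad(), and that checking requires_grad first makes the same code path safe in both modes:

```python
import torch

def safe_retain_grad(tensor):
    # retain_grad() raises on tensors with requires_grad=False,
    # which is the case for tensors produced under torch.no_grad()
    # (i.e. during evaluation). Guarding on requires_grad lets the
    # same forward code run in both training and evaluation.
    if tensor.requires_grad:
        tensor.retain_grad()
    return tensor

x = torch.ones(2, 2, requires_grad=True)

# Training-style forward: y is a non-leaf tensor, so its gradient
# would normally be discarded; retain_grad() keeps it around.
y = safe_retain_grad(x * 2)
y.sum().backward()
print(y.grad)            # gradient on the non-leaf tensor is available

# Evaluation-style forward: under no_grad the guard simply skips
# retain_grad(), so no RuntimeError is raised.
with torch.no_grad():
    z = safe_retain_grad(x * 2)
print(z.requires_grad)   # False
```

Without the guard, the second call would fail with exactly the "can't retain_grad on Tensor that has requires_grad=False" error from the traceback above.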