
Set layers that are frozen to eval mode in BERT during training

Open · successar opened this issue 5 years ago · 2 comments

Hi

I was wondering if there is a way to switch the dropout and layer-norm layers in BERT to eval mode during training when we set the requires_grad parameter to False for PretrainedBert here: https://github.com/allenai/allennlp/blob/99125490e1e82e95c99792b0873309f93f706ec0/allennlp/modules/token_embedders/bert_token_embedder.py#L294 ?

The problem I see is that the allennlp trainer loop calls model.train() on the whole model, which switches BERT back to train mode even if I modify the initialization code above to set it to eval mode.

My current workaround is to set these layers to eval mode during the model's forward call. Is there a better way?
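
For illustration, a minimal sketch of that workaround in plain PyTorch (the module names and shapes here are made up, not AllenNLP's API):

```python
import torch
from torch import nn


class ClassifierWithFrozenEncoder(nn.Module):
    """Toy model with a frozen BERT-like encoder; names are hypothetical."""

    def __init__(self, encoder: nn.Module, hidden_dim: int, num_labels: int):
        super().__init__()
        self.encoder = encoder
        for param in self.encoder.parameters():
            param.requires_grad = False  # freeze the encoder, as with requires_grad=False above
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # Workaround: the trainer's model.train() has flipped the encoder back to
        # train mode, so force it to eval on every forward pass to keep its
        # dropout layers disabled.
        if not any(p.requires_grad for p in self.encoder.parameters()):
            self.encoder.eval()
        encoded = self.encoder(inputs)  # assumed to return shape (batch, hidden_dim)
        return self.classifier(encoded)
```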

https://discourse.allennlp.org/t/bert-set-to-eval-mode-when-requires-grad-false/103/2?u=sarthak_jain

successar avatar Nov 02 '19 17:11 successar

Based on the discourse thread, I'll set this to Contributions Welcome.

kernelmachine avatar Nov 05 '19 21:11 kernelmachine

Putting in my comment from the discourse thread:

Hmm, sounds like we’d want to override model.train() to handle this properly. It also sounds a bit messy to get correct for everything, but if you can think of a clean solution, I think this is definitely a problem that we’d want to fix in the library. Feel free to open an issue about this in the repo, and I’ll mark it as “contributions welcome”.

You should be able to override train() on your own model class, also. That would be a good way to test this to see if it’s possible to do it in a clean way that will generalize to other models. If you can, then a PR to add it to the base model class would be lovely.
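
A minimal sketch of such a train() override, in plain PyTorch rather than AllenNLP's Model class (class and attribute names are hypothetical):

```python
import torch
from torch import nn


class ModelWithFrozenEncoder(nn.Module):
    """Keeps fully-frozen submodules in eval mode even after model.train() is called."""

    def __init__(self, encoder: nn.Module, head: nn.Module):
        super().__init__()
        self.encoder = encoder
        for param in self.encoder.parameters():
            param.requires_grad = False  # frozen, as in the issue
        self.head = head

    def train(self, mode: bool = True) -> "ModelWithFrozenEncoder":
        # First let nn.Module.train() set the requested mode on the whole model,
        # exactly as the trainer loop expects ...
        super().train(mode)
        if mode:
            # ... then push every submodule whose parameters are all frozen back
            # to eval, so dropout inside the frozen encoder stays disabled.
            for module in self.modules():
                params = list(module.parameters())
                if params and not any(p.requires_grad for p in params):
                    module.eval()
        return self

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(inputs))
```

With this, a model.train() call from the trainer leaves self.encoder (and its dropout layers) in eval mode, while the trainable head still runs in train mode.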

matt-gardner avatar Nov 08 '19 03:11 matt-gardner