MLM in AllenNLP?

Open ethch18 opened this issue 3 years ago • 4 comments

Hi! I saw that there's currently a masked language modeling Model over in the allennlp-models repo, but it looks like most of the components are marked as demo-only. Is full functionality for this on the roadmap? If not, is there an estimate of what it would take to get it working?

ethch18 avatar Apr 02 '21 04:04 ethch18

Hey @ethch18, this model was developed without the intent of making it efficient for training. Improving this class is not on our roadmap, but we would certainly appreciate contributions.

I'm not sure how much work this would take, as this implementation is pretty old at this point and I'm not very familiar with it. But I'm happy to try to answer specific questions that arise.

epwalsh avatar Apr 09 '21 19:04 epwalsh

Thanks @epwalsh, I'll think about this more. I won't be able to work on it immediately, but I can leave the issue open in case others want to take it on.

ethch18 avatar Apr 12 '21 16:04 ethch18

@ethch18 I implemented MLM in AllenNLP for my own project. Unfortunately it's highly coupled to some other code, but these might be useful:

  • Helper functions (mostly based on HF transformers) for BERT-like masking: https://github.com/JohnGiorgi/DeCLUTR/blob/master/declutr/common/masked_lm_utils.py
  • MLM-enabled text field embedder: https://github.com/JohnGiorgi/DeCLUTR/blob/master/declutr/modules/text_field_embedders/mlm_text_field_embedder.py
  • MLM-enabled pretrained transformer embedder: https://github.com/JohnGiorgi/DeCLUTR/blob/master/declutr/modules/token_embedders/pretrained_transformer_embedder_mlm.py
  • The model. Again, this is highly coupled to other stuff, but if you keyword-search "mask", "masked" and "masked_lm_loss" you will see the MLM-relevant code: https://github.com/JohnGiorgi/DeCLUTR/blob/master/declutr/model.py
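
For anyone skimming who hasn't seen BERT-like masking before, here is a rough, framework-free sketch of the usual recipe. It is not the code from the links above. It assumes the common defaults (select about 15% of non-special tokens; of those, replace 80% with the mask token, 10% with a random token, and leave 10% unchanged) and uses -100 as the "ignore" label, which matches the default `ignore_index` of PyTorch's cross-entropy loss. The function name `mask_tokens`, its parameters, and the token IDs in the usage lines are all made up for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple

IGNORE_INDEX = -100  # assumed "skip this position" label for the loss


def mask_tokens(
    token_ids: Sequence[int],
    mask_id: int,
    vocab_size: int,
    special_ids: Sequence[int] = (),
    mlm_probability: float = 0.15,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Return (masked_inputs, labels) for one sequence (hypothetical helper)."""
    rng = rng or random.Random()
    special = set(special_ids)
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(inputs)

    for i, tok in enumerate(token_ids):
        # Never select special tokens; select the rest with mlm_probability.
        if tok in special or rng.random() >= mlm_probability:
            continue
        labels[i] = tok  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id  # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: keep the original token in the input

    return inputs, labels


# Illustrative IDs only; real values depend on the tokenizer in use.
ids = [101, 7592, 2088, 2003, 2307, 102]
masked, labels = mask_tokens(
    ids, mask_id=103, vocab_size=30522, special_ids=(101, 102),
    rng=random.Random(0),
)
```

The masked inputs go through the embedder and a vocabulary-sized projection head, and the loss is computed only at positions whose label is not -100.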

Hope that helps!

JohnGiorgi avatar May 01 '21 15:05 JohnGiorgi

@JohnGiorgi thank you!

ethch18 avatar May 01 '21 19:05 ethch18