pytorch_pretrained_BERT
Fine-tuned BERT LM
Hi,
I use the run_lm_finetuning.py script from pytorch_pretrained_BERT/examples to fine-tune the model on a monolingual set of sentences. I use the BERT multilingual cased model.
Once the model is fine-tuned, I compute a score for a given sentence from its loss with the following code:
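import math
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForMaskedLM

def get_score(sentence, model):
    # Tokenize and map the tokens to vocabulary ids (a batch of one sentence)
    tokenize_input = tokenizer.tokenize(sentence)
    tensor_input = torch.tensor([tokenizer.convert_tokens_to_ids(tokenize_input)])
    model.eval()
    # Without masked_lm_labels the model returns the prediction scores
    predictions = model(tensor_input)
    loss_fct = torch.nn.CrossEntropyLoss()
    # Cross-entropy of the predictions against the input ids themselves
    loss = loss_fct(predictions.squeeze(), tensor_input.squeeze()).data
    return math.exp(loss)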
sentence = "ﺶﻋﺮﺴﺗﺎﻧ؛ ﺩ پښﺕﻭ ﺶﻋﺭپﻮﻬﻧې ﻥﻭی پړﺍﻭ - ﺕﺎﻧﺩ"
tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')

# Fine-tuned model: load the saved weights into the multilingual architecture
stats = torch.load('pytorch_model.bin')
bertMaskedLM = BertForMaskedLM.from_pretrained('bert-base-multilingual-cased', state_dict=stats)
print(get_score(sentence, bertMaskedLM))
# Output: 78637.05198167797

# Original pre-trained model, for comparison
bertMaskedLM_orig = BertForMaskedLM.from_pretrained('bert-base-multilingual-cased')
print(get_score(sentence, bertMaskedLM_orig))
# Output: 7.919475431571431
The strange thing is that the fine-tuned model returns a much higher score (78637.05 versus 7.92 for the original model), even when the evaluated sentence appeared in the monolingual training data.
Am I doing something wrong? I just want to check how well a given sentence fits the LM.
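For reference, the number returned by get_score is the exponential of the mean cross-entropy between the per-position prediction scores and the input token ids. The sketch below reproduces only that arithmetic in plain Python on made-up logits for a toy 3-token vocabulary. No BERT model is involved, and the helper names (log_softmax, exp_mean_cross_entropy) are hypothetical, chosen for illustration.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one position's scores
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def exp_mean_cross_entropy(logits_per_position, target_ids):
    # exp(mean over positions of -log p(target id)), the same quantity as
    # math.exp(CrossEntropyLoss()(predictions, targets)) with default settings
    if len(logits_per_position) != len(target_ids):
        raise ValueError("one row of logits is needed per target id")
    total = 0.0
    for logits, target in zip(logits_per_position, target_ids):
        total -= log_softmax(logits)[target]
    return math.exp(total / len(target_ids))

# Made-up scores: 2 positions, vocabulary of 3 tokens
targets = [0, 1]
uniform = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]       # no preference at all
confident = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]   # strongly prefers the targets

score_uniform = exp_mean_cross_entropy(uniform, targets)
score_confident = exp_mean_cross_entropy(confident, targets)
print(score_uniform, score_confident)
```

A predictor with no preference scores exactly the vocabulary size (3 here), and a predictor that favours the target ids approaches 1, so lower values mean a better fit. Note that get_score feeds the sentence in unmasked and compares the predictions with those same ids, so the value likely reflects how well the model reproduces tokens it can already see rather than a conventional language-model perplexity.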
Regards and thanks in advance
I suggest using https://github.com/huggingface/transformers instead.
This repo is a copy of the Hugging Face project, and pytorch_pretrained_BERT has since been renamed to transformers there.
See the original code at https://github.com/huggingface/transformers/tree/0.5.0
Best regards