spaCy

Language Model Score (Perplexity score) of a sentence

Open mdasadul opened this issue 4 years ago • 3 comments

Hi, thanks for this nice work. @honnibal, I am interested in using one of the pre-trained language models to calculate the perplexity score of a sentence. Is there any way to achieve that using this repo?

Thanks

mdasadul avatar Sep 09 '19 18:09 mdasadul

We don't have that functionality yet unfortunately. I hope we can provide it in a future release.

honnibal avatar Sep 26 '19 12:09 honnibal

Any updates on this?

zephyrzilla avatar Feb 23 '20 18:02 zephyrzilla

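It's possible to compute perplexity with GPT-2, because it is an autoregressive model that predicts each token from the tokens before it. Masked language models (MLMs) such as BERT cannot compute it directly, since they predict a masked token from context on both sides and so do not define a left-to-right probability for the whole sentence. For MLMs, we should consider other metrics. I think pseudo-perplexity is one solution: mask each token in turn, score it given the rest of the sentence, and average the log-probabilities. https://arxiv.org/abs/1910.14659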

tagucci avatar Feb 02 '21 13:02 tagucci
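
To make the distinction in the comment above concrete, here is a minimal sketch of the two quantities in plain Python. It is not part of this repo. The function names (`perplexity`, `causal_log_probs`, `masked_log_probs`) and the `MASK` symbol are hypothetical, and the two `uniform_*` scorers are toy stand-ins for a real model: in practice the causal scorer would wrap something like GPT-2 and the masked scorer something like BERT.

```python
import math
from typing import Callable, List, Sequence

MASK = "[MASK]"  # hypothetical mask symbol, for illustration only


def perplexity(log_probs: Sequence[float]) -> float:
    """exp of the negative mean of per-token natural-log probabilities."""
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))


def causal_log_probs(
    tokens: Sequence[str],
    score: Callable[[Sequence[str], str], float],
) -> List[float]:
    """log P(w_t | w_<t) for every position: the GPT-2 style factorization."""
    return [score(tokens[:t], tokens[t]) for t in range(len(tokens))]


def masked_log_probs(
    tokens: Sequence[str],
    score: Callable[[Sequence[str], int, str], float],
) -> List[float]:
    """log P(w_t | all other tokens): mask one position at a time (BERT style)."""
    results = []
    for t, target in enumerate(tokens):
        masked = list(tokens[:t]) + [MASK] + list(tokens[t + 1:])
        results.append(score(masked, t, target))
    return results


# Toy scorers (assumption): a uniform distribution over a 4-word vocabulary.
VOCAB_SIZE = 4


def uniform_causal(prefix: Sequence[str], target: str) -> float:
    return -math.log(VOCAB_SIZE)


def uniform_masked(masked: Sequence[str], position: int, target: str) -> float:
    return -math.log(VOCAB_SIZE)


sentence = ["the", "cat", "sat", "down"]
ppl = perplexity(causal_log_probs(sentence, uniform_causal))
pppl = perplexity(masked_log_probs(sentence, uniform_masked))
```

With the uniform toy scorers both values come out equal to the vocabulary size, which is a handy sanity check. With real models the two numbers are not comparable to each other, because pseudo-perplexity conditions each token on context from both sides.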