bert-as-language-model
Probability of the last word is always too small
Hi, after evaluating the predictions on several Chinese phrases, I found that the predicted probability of the last word is always much smaller than that of the other words in the same phrase. This also happens in all the examples shown in your README.md. As a result, the perplexity of these phrases also becomes very high.
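To illustrate why one abnormally small last-token probability inflates the whole score, here is a minimal sketch of the pseudo-perplexity computation used when scoring a sentence with a masked LM (mask each token in turn, collect p(w_i | context), then exponentiate the average negative log-likelihood). The probability values below are hypothetical, chosen only to show the effect:

```python
import math

def pseudo_perplexity(token_probs):
    # Pseudo-perplexity from per-token probabilities p(w_i | context),
    # as obtained by masking each token in turn with a masked LM.
    n = len(token_probs)
    avg_nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_nll)

# Hypothetical per-token probabilities for a 5-token phrase:
# the last token gets a tiny probability, as described in the issue.
probs = [0.4, 0.5, 0.3, 0.45, 0.001]

print(pseudo_perplexity(probs[:-1]))  # first four tokens only
print(pseudo_perplexity(probs))       # tiny last-token probability dominates
```

Because the geometric mean is taken over all tokens, the single near-zero probability at the end dominates the sum of log-probabilities and drives the perplexity up sharply.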
What do you think about this phenomenon? Thanks for your attention.
Try adding a period at the end of the sentence...
This problem does indeed exist.