
[Question] One Sentence Perplexity Computation Recommendation/Approach

Open edgaruribe369 opened this issue 6 years ago • 7 comments

This might already be answered, and it may be a novice question.

I simply want to compute the perplexity of a single sentence. What parameters would I change to compute the score?

My current approach: I still use bin/run_test.py, but I redirect --test_prefix to point to a new folder containing only one file, with one sentence in it. I also set the batch size to 1 (i.e. --batch_size 1), which computes batch_perplexities and their avg_perplexity.
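For reference, the invocation would look something like this. The paths are placeholders, and the --vocab_file / --save_dir flags are my assumption of the usual run_test.py arguments; check bin/run_test.py --help for the exact names in your version:

```shell
# Hypothetical paths; --test_prefix points at a directory
# containing a single file with one sentence in it.
python bin/run_test.py \
    --test_prefix='/path/to/single_sentence_dir/*' \
    --vocab_file /path/to/vocab.txt \
    --save_dir /path/to/checkpoint \
    --batch_size 1
```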

I can't help but scratch an itch and wonder about the unroll_steps, and whether I'm truly computing the perplexity of the sentence in the file.

Let me know if what I've done is accurate, or if I need to change something to properly compute one-sentence perplexity.

I've attached a run with my current model and the output of my current option configuration :)

[screenshot: run_test.py output with the configuration above, Dec 03 2018]

edgaruribe369 avatar Dec 03 '18 17:12 edgaruribe369

This would work except for the pesky issue of the statefulness of the LSTM states. The perplexities for the first batch or two are artificially high until the model has processed a few sentences (see https://github.com/allenai/bilm-tf#why-do-i-get-slightly-different-embeddings-if-i-run-the-same-text-through-the-pre-trained-model-twice). The code in run_test makes an implicit assumption that the size of the test set is very large so that the first batch perplexity doesn't have much impact on the overall average.

In the single sentence setting, or when you really want an accurate perplexity for the first batch, it's necessary to run inference with the first sentence 1-2 times first, then once more to calculate the loss. Subsequent batches can just be processed once as usual.
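The warm-up procedure can be sketched generically. ToyStatefulLM below is a made-up stand-in for the stateful model (in bilm-tf the losses would come from a TensorFlow session); it only illustrates the control flow of warming up the LSTM state before taking the loss:

```python
import math

class ToyStatefulLM:
    """Stand-in for a stateful LSTM LM: its loss on a sentence
    depends on hidden state carried over from previous calls."""
    def __init__(self):
        self.state = 0.0  # zero initial state, as in bilm-tf

    def loss(self, sentence_tokens):
        # Artificially inflated loss while the state is "cold";
        # the inflation shrinks as the state warms up.
        penalty = 1.0 / (1.0 + self.state)
        self.state += 1.0
        return 2.0 + penalty  # mean per-token cross-entropy (nats)

model = ToyStatefulLM()
sentence = ["the", "cat", "sat"]

# Warm up: run the sentence through twice, discarding the loss.
for _ in range(2):
    model.loss(sentence)

# Then compute the loss once more and convert to perplexity.
final_loss = model.loss(sentence)
perplexity = math.exp(final_loss)
```

Without the warm-up loop, the first loss here would be 3.0 instead of about 2.33, mirroring the artificially high first-batch perplexities described above.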

matt-peters avatar Dec 03 '18 19:12 matt-peters

Thanks for the solution. Would you implement a simple API for calculating one-sentence ppl?

Andy-jqa avatar Jan 08 '19 18:01 Andy-jqa

I don't have any plans to implement it, PRs welcome.

matt-peters avatar Jan 08 '19 22:01 matt-peters

I have the same question here. I just separated the dataset file into many single-sentence files.

Shuailong avatar Feb 01 '19 07:02 Shuailong

Is there an easy way for the library to compute perplexity directly from an ELMo hdf5 weight file?

I lost my checkpoint files and only have the fine-tuned hdf5 weights I dumped earlier. Thanks.

jerrygaoLondon avatar Mar 14 '19 13:03 jerrygaoLondon

How can I get perplexity for many sentences? Splitting the sentences into files each containing one would be impractical when you have millions of sentences. Should I train another language model on top of the ELMo embeddings?
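Rather than one file per sentence, one option is to iterate over the sentences in batches and keep a loss per sentence. A minimal sketch; sentence_losses is a hypothetical placeholder for whatever per-sentence cross-entropy your model returns for a batch:

```python
import math

def sentence_losses(batch):
    # Placeholder: in practice these per-sentence mean
    # cross-entropies (in nats) would come from the model.
    return [2.0 + 0.1 * len(s.split()) for s in batch]

def perplexities(sentences, batch_size=128):
    """Per-sentence perplexity, computed batch by batch so it
    scales to millions of sentences without per-sentence files."""
    out = []
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i:i + batch_size]
        out.extend(math.exp(loss) for loss in sentence_losses(batch))
    return out

ppls = perplexities(["the cat sat", "a dog"], batch_size=2)
```

The key point is only that each sentence's loss is exponentiated individually instead of being averaged into a single corpus-level perplexity, as run_test.py does.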

BigBorg avatar Jul 17 '19 08:07 BigBorg

[screenshot of modified code] Adding a batch_losses list and appending each batch's losses to it lets you get the ppl of one batch of sentences.

demeiyan avatar Sep 10 '20 01:09 demeiyan