bilm-tf
[Question] One Sentence Perplexity Computation Recommendation/Approach
This might already be answered, and it may be a novice question, but I simply want to compute the perplexity of a single sentence. What parameters would I change to compute the score?

My current approach: I still use bin/run_test.py, but I redirect --test_prefix to point to a new folder containing only one file, with one sentence in it. I also set --batch_size 1, which computes batch_perplexities and their avg_perplexity.

I can't help but scratch an itch and wonder about unroll_steps, and whether I'm truly computing the perplexity of the sentence in the file.
Let me know if what I've done is accurate, or if I need to change something to properly reflect single-sentence perplexity computation.
I've attached a run with my current model and the output of my current options configuration :)
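For concreteness, the invocation described above would look roughly like this (paths are placeholders; the flags match the evaluation example in the README):

```
python bin/run_test.py \
    --test_prefix='/path/to/single_sentence_dir/*' \
    --vocab_file /path/to/vocab.txt \
    --save_dir /path/to/checkpoint \
    --batch_size 1
```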

This would work except for the pesky issue of the statefulness of the LSTM states. The perplexities for the first batch or two are artificially high until the model has processed a few sentences (see https://github.com/allenai/bilm-tf#why-do-i-get-slightly-different-embeddings-if-i-run-the-same-text-through-the-pre-trained-model-twice). The code in run_test makes an implicit assumption that the size of the test set is very large so that the first batch perplexity doesn't have much impact on the overall average.
In the single sentence setting, or when you really want an accurate perplexity for the first batch, it's necessary to run inference with the first sentence 1-2 times first, then once more to calculate the loss. Subsequent batches can just be processed once as usual.
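A minimal sketch of that warm-up procedure, assuming a hypothetical `run_batch` callable that feeds one batch through the stateful model and returns its average per-token loss (this is not bilm-tf's actual API):

```python
import numpy as np

def single_sentence_perplexity(run_batch, batch, n_warmup=2):
    # Warm up the persistent LSTM states by running the same sentence
    # through the model a couple of times, discarding the losses.
    for _ in range(n_warmup):
        run_batch(batch)
    # One more pass, this time keeping the average per-token loss.
    loss = run_batch(batch)
    # Perplexity is exp of the average negative log-likelihood.
    return np.exp(loss)
```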
Thanks for the solution. Would you implement a simple API for calculating single-sentence perplexity?
I don't have any plans to implement it, PRs welcome.
I have the same question here. I just split the dataset file into many single-sentence files.
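For anyone taking that route, a hypothetical helper along these lines would do the splitting (names are illustrative, not part of the library):

```python
import os

def split_into_sentence_files(src_path, out_dir):
    # Write each line of src_path to its own file, so that every
    # run_test.py invocation scores exactly one sentence.
    os.makedirs(out_dir, exist_ok=True)
    with open(src_path) as src:
        for i, sentence in enumerate(src):
            with open(os.path.join(out_dir, 'sent_%06d.txt' % i), 'w') as out:
                out.write(sentence)
```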
Is there any easy way for the library to compute perplexity directly from an ELMo hdf5 weight file?
I lost my checkpoint files and only have the fine-tuned hdf5 weights dumped earlier. Thanks
How can I get perplexity for many sentences? Splitting the sentences into files, each containing one, would be impossible when you have millions of sentences. Should I train another language model on top of the ELMo embeddings?
Appending each batch's loss to a batch_losses list gives you per-batch (and, with --batch_size 1, per-sentence) perplexities.
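A sketch of that idea, again with hypothetical `run_batch` and `batches` stand-ins for the step function and the data iterator inside the test loop:

```python
import numpy as np

def per_batch_perplexities(run_batch, batches):
    # Keep each batch's average per-token loss instead of only the
    # running mean that run_test.py reports.
    batch_losses = []
    for batch in batches:
        batch_losses.append(run_batch(batch))
    # With --batch_size 1, each entry corresponds to a single sentence.
    return [np.exp(loss) for loss in batch_losses]
```

Note that the warm-up caveat above still applies: the first batch's loss will be inflated until the LSTM states have seen a few sentences.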