
How to calculate the "token_perplexity"

Open · nongfang55 opened this issue on Aug 02, 2024 · 0 comments

The harness currently has metrics called "byte_perplexity" and "word_perplexity", which normalize the perplexity by the number of bytes and words in the target, respectively. If we want to normalize the perplexity by tokens instead (i.e., the number of tokens the model's tokenizer splits the target into), how can we calculate it effectively?
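For a single example the quantity itself is straightforward: token perplexity is `exp(-loglikelihood / num_tokens)`, so all that's needed beyond the log-likelihood the harness already computes is a token count. A minimal sketch of that computation outside the harness, assuming a Hugging Face tokenizer (the model name `"gpt2"` is just a placeholder):

```python
import math

from transformers import AutoTokenizer

# Placeholder: use the same tokenizer as the model being evaluated.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

def token_perplexity(loglikelihood: float, target: str) -> float:
    """exp(-log p(target) / num_tokens) for a single target string."""
    num_tokens = len(tokenizer.encode(target, add_special_tokens=False))
    return math.exp(-loglikelihood / num_tokens)
```

The open question is how to get this token count inside the harness's own metric pipeline, which leads to the difficulty below.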

I tried modifying the `process_results` method of the `ConfigurableTask` class, but I found it difficult to obtain the token length of the target there. Does anyone have a good idea of how to calculate it? A rough sketch of what I mean is below.
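One possible direction, sketched under the assumption that `process_results` for perplexity-style tasks returns `(loglikelihood, weight)` pairs that the harness aggregates corpus-wide as `exp(-sum(loglikelihoods) / sum(weights))`: add a third pair weighted by a token count. The catch is that `self.tokenizer` is not a real attribute of `ConfigurableTask`; the model's tokenizer would have to be loaded separately or plumbed in, which is exactly the difficulty described above.

```python
# Hypothetical sketch of a modified ConfigurableTask.process_results for a
# perplexity task. NOTE: `self.tokenizer` does not exist on the task class;
# obtaining the model's tokenizer here is the unsolved part.
def process_results(self, doc, results):
    (loglikelihood,) = results
    target = self.doc_to_target(doc)
    return {
        # existing metrics: normalize by word and byte counts
        "word_perplexity": (loglikelihood, self.count_words(target)),
        "byte_perplexity": (loglikelihood, self.count_bytes(target)),
        # new metric: normalize by token count instead
        "token_perplexity": (
            loglikelihood,
            len(self.tokenizer.encode(target, add_special_tokens=False)),
        ),
    }
```

Each tuple would then be aggregated the same way "byte_perplexity" is, just with token counts in place of byte counts.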

nongfang55 · Aug 02 '24 09:08