
PET results differ from those reported in the Hugging Face blog "How many data points is a prompt worth?" study

Open · luffycodes opened this issue 2 years ago · 1 comment

For MNLI, the blog (https://huggingface.co/blog/how_many_data_points/) reports an accuracy of 0.83 for 1,000 data samples.

In the PET paper (https://arxiv.org/pdf/2001.07676.pdf, Table 1), the reported accuracy for MNLI with 1,000 data samples is 0.85.

I was wondering how the accuracy reported in the PET paper was obtained, and why the two numbers differ.

luffycodes · Nov 22 '21 20:11

Hi @luffycodes, the accuracy reported in the PET paper is exactly what you obtain using this library. You can find details about the "How many data points is a prompt worth?" study in their paper - one important difference from our experiments is that they

[...] run every experiment 4 times in order to reduce variance,

Also, I would assume that they have used a different random selection of 1,000 training examples (but to verify this, you should reach out to the authors directly).
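For anyone comparing their own numbers against either result, here is a minimal sketch (not part of this repository; `train_and_evaluate` is a hypothetical stand-in for whatever training/evaluation routine you actually use) of how the two differences mentioned above affect the reported figure: each run draws its own random subset of 1,000 training examples, and the final accuracy is the mean over four such runs.

```python
import random
import statistics
from typing import Callable, List, Sequence


def subsample(train_set: Sequence, n_examples: int, seed: int) -> List:
    """Draw a reproducible random subset of `n_examples` training examples."""
    rng = random.Random(seed)
    return rng.sample(list(train_set), n_examples)


def averaged_accuracy(
    train_set: Sequence,
    train_and_evaluate: Callable[[List, int], float],  # hypothetical: trains on the subset, returns test accuracy
    n_examples: int = 1000,
    seeds: Sequence[int] = (0, 1, 2, 3),  # the blog runs every experiment 4 times
) -> float:
    """Average test accuracy over several seeds, each with its own 1,000-example subset."""
    accuracies = [
        train_and_evaluate(subsample(train_set, n_examples, seed), seed)
        for seed in seeds
    ]
    return statistics.mean(accuracies)
```

Averaging over seeds mostly removes run-to-run noise, but it does not remove the effect of *which* 1,000 examples were drawn, so a different random selection of the training subset can still shift the number; reporting the standard deviation across seeds alongside the mean makes such comparisons easier.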

timoschick · Dec 07 '21 17:12