Added original template prompt to LAMA-TREx

Open JanKalo opened this issue 3 years ago • 8 comments

Added a prompt to LAMA-TREx which is closer to the original LAMA template prompts.

JanKalo avatar Apr 26 '22 13:04 JanKalo

See Jess's comments in #737 for more suggestions on prompts that sound more natural to an English speaker with no experience in NLP (i.e., write prompts as if you are talking to a college student who knows nothing about computer science, and avoid the technical jargon you would use with other NLP researchers). For example, you wrote:

What is the missing word to fill the [MASK]?

How about "What could be a missing word at [MASK]?"

Also, you only included two original task prompts. We want at least 5 original task prompts. Thanks!

Sorry, I was not aware of the minimum of 5 prompts. I added 6 prompts and also adapted the other one according to your recommendation.

JanKalo avatar Apr 28 '22 11:04 JanKalo

Marked the task as original task and added 5 prompts per task. However, I am having trouble fixing the failures in the automatic tests.

JanKalo avatar May 09 '22 20:05 JanKalo

Thanks! I rebased your branch to the latest eval-hackathon branch. The prompts themselves look good. Just some housekeeping questions left:

  1. Why are your “question prompts” not original task? Don't they also test for the same knowledge as the fill-in-the-MASK format?
  2. Since you're now just using janck/bigscience-lama as opposed to lama/trex, could you remove the prompts in the latter? That might fix the automatic tests.
  3. Be aware that the question prompts do yield some ungrammatical sentences like the ones in the screenshots below. They're probably okay, since the questions are part of the dataset and this question format is still more natural-sounding than the fill-in-the-MASK format.
[Screenshots: 2022-05-12 14:46:52 and 2022-05-12 14:47:51]

awebson avatar May 12 '22 18:05 awebson

@awebson Thanks for rebasing.

  1. I can mark them as original task. The questions are not part of the original LAMA benchmark, though. I created them by translating the original LAMA prompt templates into questions.
  2. I am not 100% sure what exactly you are proposing. Do you think we should delete the entire LAMA/TREx task? Or should I just delete the prompts from this PR?
  3. Since I created bigscience-lama, I will correct these question templates in the dataset, so that they are grammatically correct.
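
As a rough illustration of the first point, a cloze-style template can be paired with a hand-written question form of the same relation. This is only a sketch of the general idea: the template strings, the example subject, and the helper names `fill_cloze` and `fill_question` are hypothetical and are not taken from bigscience-lama.

```python
# Hypothetical example data; not copied from the bigscience-lama dataset.
CLOZE_TEMPLATE = "[X] was born in [Y]."
QUESTION_TEMPLATE = "Where was [X] born?"


def fill_cloze(template: str, subject: str, mask_token: str = "[MASK]") -> str:
    """Insert the subject and replace the object slot with the mask token."""
    return template.replace("[X]", subject).replace("[Y]", mask_token)


def fill_question(template: str, subject: str) -> str:
    """Insert the subject into a hand-written question form of the template."""
    return template.replace("[X]", subject)


cloze_prompt = fill_cloze(CLOZE_TEMPLATE, "Dante")
question_prompt = fill_question(QUESTION_TEMPLATE, "Dante")
print(cloze_prompt)
print(question_prompt)
```

Because the subject is substituted verbatim, some subject and question combinations can come out ungrammatical, which is why the question templates need to be checked per relation.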

JanKalo avatar May 13 '22 09:05 JanKalo

  1. Marking them as original task would be great. Thanks! Few prompts are part of their original datasets, as most datasets predate prompt-based models. The original task flag indicates whether a prompt reflects what the dataset intends to measure, not the exact phrasing of the instructions used in the creation of the dataset.
  2. We should remove the LAMA/TREx prompts from this PR, since I assume you will report the results solely from your bigscience-lama? (We cannot remove the LAMA/TREx dataset, which is maintained by the HF datasets team.)
  3. That'd be ideal. Thanks so much!
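
To make the first point concrete, the original task flag can be thought of as a per-prompt boolean that is independent of the prompt's wording. The sketch below uses a hypothetical `PromptInfo` record and made-up prompt texts; it is not promptsource's actual template class.

```python
from dataclasses import dataclass


@dataclass
class PromptInfo:
    """Hypothetical stand-in for a prompt template and its metadata."""
    name: str
    text: str
    # True if the prompt tests what the dataset intends to measure,
    # regardless of how the prompt is phrased.
    original_task: bool


prompts = [
    PromptInfo("cloze", "[X] was born in [MASK].", original_task=True),
    PromptInfo("question", "Where was [X] born?", original_task=True),
    PromptInfo("write_a_sentence", "Write a sentence about [X].", original_task=False),
]

# Both the cloze and the question prompt probe the same fact, so both are flagged.
original_task_names = [p.name for p in prompts if p.original_task]
print(original_task_names)
```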

awebson avatar May 14 '22 02:05 awebson

@JanKalo this has merge conflicts and build errors. Can you merge in changes from main and see if that fixes the build errors?

jzf2101 avatar May 26 '22 18:05 jzf2101

@jzf2101 Thanks. Yes. I will take care of this today. I was a bit busy the previous days.

JanKalo avatar May 27 '22 08:05 JanKalo

@jzf2101 Thanks. Yes. I will take care of this today. I was a bit busy the previous days.

Bump! Thanks!

jzf2101 avatar Jun 27 '22 04:06 jzf2101