
Add HANS dataset

Open aakanksha19 opened this issue 3 years ago • 0 comments

  1. Evaluated on GPT2
  2. Time taken: 3:40:59 on GTX 1080 Ti

Other comments:

  1. The prompt template is the same as the one used for XQUAD/PIAF, with the minor addition of the question "is this true or false?" (to indicate entailment/non-entailment).
  2. In addition to accuracy, the other fine-grained evaluation metrics from the HANS evaluation script (https://github.com/tommccoy1/hans/blob/master/evaluate_heur_output.py) are also included; they can be removed if deemed unnecessary.
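
For reference, here is a rough sketch of what a fine-grained breakdown of this kind can look like: accuracy split by heuristic and gold label rather than one overall number. This is not the code from this contribution or from the HANS script; the function name `fine_grained_accuracy`, the dict keys, and the toy examples are assumptions made purely for illustration.

```python
from collections import defaultdict


def fine_grained_accuracy(examples):
    """Return accuracy per (heuristic, gold label) pair.

    `examples` is an iterable of dicts with the hypothetical keys
    'heuristic', 'gold' and 'pred' (names assumed for this sketch).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        key = (ex["heuristic"], ex["gold"])
        total[key] += 1
        correct[key] += int(ex["pred"] == ex["gold"])
    return {key: correct[key] / total[key] for key in total}


# Toy predictions, made up for illustration only (not real results).
examples = [
    {"heuristic": "lexical_overlap", "gold": "entailment", "pred": "entailment"},
    {"heuristic": "lexical_overlap", "gold": "non-entailment", "pred": "entailment"},
    {"heuristic": "subsequence", "gold": "non-entailment", "pred": "non-entailment"},
    {"heuristic": "subsequence", "gold": "non-entailment", "pred": "entailment"},
]

results = fine_grained_accuracy(examples)
for (heuristic, gold), acc in sorted(results.items()):
    print(f"{heuristic:16s} {gold:15s} {acc:.2f}")
```

Splitting by gold label matters for HANS because a model that always predicts entailment can look fine on the entailed half while failing the non-entailed half entirely.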

aakanksha19, Oct 02 '21 14:10