evaluation

Code and Data for Evaluation WG

Results 50 evaluation issues

My attempt to add the ANLI dataset (issue #32), including:
- Load ANLI and reformat each of the three validation splits (R1, R2, R3) into the prompt provided by the...
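A rough sketch of what the reformatting step could look like, not the code from the PR. The prompt the issue refers to is cut off above, so `TEMPLATE` here is a made-up placeholder. The field names (`premise`, `hypothesis`, `label`), the label order, and the split names `dev_r1`/`dev_r2`/`dev_r3` are assumed to match the Hugging Face `anli` dataset. Toy rows stand in for the real data so the snippet runs offline.

```python
from typing import Dict, List

# Assumed label order of the Hugging Face `anli` dataset.
LABEL_NAMES = ["entailment", "neutral", "contradiction"]

# Hypothetical template: the actual prompt from the issue is not shown.
TEMPLATE = (
    "Premise: {premise}\n"
    "Hypothesis: {hypothesis}\n"
    "Answer with entailment, neutral, or contradiction."
)


def format_example(example: Dict) -> Dict[str, str]:
    """Turn one ANLI-style row into a prompt/target pair."""
    prompt = TEMPLATE.format(
        premise=example["premise"].strip(),
        hypothesis=example["hypothesis"].strip(),
    )
    return {"prompt": prompt, "target": LABEL_NAMES[example["label"]]}


def format_splits(splits: Dict[str, List[Dict]]) -> Dict[str, List[Dict[str, str]]]:
    """Apply the same formatting to every validation split."""
    return {name: [format_example(row) for row in rows] for name, rows in splits.items()}


# Toy rows standing in for the real R1, R2, R3 validation splits.
toy_splits = {
    "dev_r1": [{"premise": "A dog runs. ", "hypothesis": "An animal moves.", "label": 0}],
    "dev_r2": [{"premise": "A dog runs.", "hypothesis": "The dog is brown.", "label": 1}],
    "dev_r3": [{"premise": "A dog runs.", "hypothesis": "The dog sleeps.", "label": 2}],
}

formatted = format_splits(toy_splits)
print(formatted["dev_r1"][0]["prompt"])
```

With the real data, the toy dictionary would be replaced by the three validation splits loaded from the dataset; everything else stays the same.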

1. Evaluated on GPT2
2. Time taken: 3:40:59 on GTX 1080 Ti

Other comments:
1. Prompt template used is the same as XQUAD/PIAF, with minor addition of the question "is...

(per question raised about [slide 6](https://docs.google.com/presentation/d/1LLWFR5AElafxDK4zu4pFdw8-Rz-UGvemG6xcu2uICjE/edit?usp=sharing) at the evaluation meeting on 9/1).

A simple proposal to use promptsource directly so that we don't have to implement it from scratch.

This might sound like a bit of restructuring, but for the sake of future compatibility, I propose the following:
1. Move to `huggingface` trainer: This will help the repo to...