Large-scale-Cloze-Test-Dataset-Created-by-Teachers

Is the performance in the paper based on all the data or the 3000 sampled questions?

Open · Deep1994 opened this issue 7 years ago • 1 comment

Is the performance in the paper based on all the data or the 3000 sampled questions? If it is the latter, how can I get the same 3000 sampled questions that you used, for a fair comparison? Thank you!

Deep1994, Dec 04 '18 05:12

Hi, only the human performance is based on the 3000 sampled questions. All the models' performance is measured on the whole test set.

michaelpulsewidth, Dec 04 '18 21:12