Large-scale-Cloze-Test-Dataset-Created-by-Teachers Is the performance in the paper based on all the data or the 3000 sampled questions?

Open Deep1994 opened this issue 7 years ago • 1 comment

Is the performance reported in the paper based on all the data or on the 3000 sampled questions? If it is the latter, how can I get the same 3000 sampled questions that you used, so that I can make a fair comparison? Thank you!

Dec 04 '18 05:12 Deep1994

Hi, only the human performance is based on the 3000 sampled questions. The performance of all the models is measured on the whole test set.

Dec 04 '18 21:12 michaelpulsewidth
