lm-evaluation-harness RACE dataset?

Open Davido111200 opened this issue 10 months ago • 0 comments

Hi,

I have a quick question about the RACE dataset. The code appears to evaluate on the EleutherAI/RACE dataset, which contains approximately 1000 examples. However, the original RACE dataset consists of two subsets, "high" and "middle", each with about 4k examples. I noticed that the dataset card describes the EleutherAI version as the test set of the "high" subset. Can you explain why there is this mismatch in size? Thanks!

Apr 15 '24 12:04 Davido111200
