MPQG
Statistics of released dataset splits
Hey @freesunshine0316,
Would you please confirm the train/dev/test stats from your experiment? With the released data, I am getting ~75500/17934/11805. Is this correct?
Thanks
I released the data I used. Sorry, I'm busy with something else. I assume your numbers are correct.
I have the same question. The train/dev/test split in your GitHub repository is 71500/16758/11805, which differs from the ones in your paper (split1: 70,484/10,570/11,877; split2: 86,635/8,965/8,964). If convenient, could you please share the train/dev/test split code or data? @deepaknlp @freesunshine0316
@freesunshine0316 Split-2 link is here: https://res.qyzhou.me/redistribute.zip
Hi @Fengfeng1024
"Split-2" (released by Zhou et al.) exactly matches the statistics, and @deepaknlp just shared the link. "Split-1" was originally released by Du et al., but we can't use it directly because it has no information on the answer positions. As a result, we used their provided doclist-xxx.txt to generate our own data (provided along with this repository). However, we mistakenly reported their train/dev/test split in our paper.
Thank you so much for your help.