MQuAKE
Small mistake in the sample dataset record?
Hi there! I noticed that the sample data record in the README might contain a mistake -- just wanted to ask for clarification in case I'm misunderstanding something.
The questions for Case 2500 ask for the capital of the country in which the CEO of Triple H holds citizenship. As far as I know, Triple H is a person, not an organization, so perhaps it should read "the CEO of the employer of Triple H" instead?
Seems like this might be an issue with GPT-3.5, which generated the multi-hop questions. Is that correct?
Also, for case_id == 2, I see this:
[
'Which writer\'s country of citizenship is the same as the author of "Misery"?',
'What country does the author of "Misery" and another writer share their citizenship?',
'What is the nationality of the author of "Misery"?'
]
But it seems like the question should really just be about the nationality of the author of "Misery".
Let me know if there's anything I'm missing!
Hi @kmeng01! Thanks for pointing this out! Yes, GPT-3.5 does not seem to work well on case 2500 -- I plan to replace it with another example in the README.
We also found that some generated multi-hop questions can be hard to interpret (e.g., the first two questions in case 2), so we generate three questions per case and check the correctness of predictions in an "or" manner: a case counts as correct if the model can understand and answer any one of these questions.
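For illustration, the "or" check described above could be sketched roughly as follows. This is a minimal sketch, not the repository's actual evaluation code: the names `normalize`, `any_question_correct`, and `stub_model`, the exact-match comparison after normalization, and the gold answer string are all assumptions made for the example.

```python
from typing import Callable, Iterable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not matter."""
    return " ".join(text.lower().split())


def any_question_correct(
    questions: Iterable[str],
    answer_fn: Callable[[str], str],   # hypothetical: maps a question to the model's answer
    gold_answers: Iterable[str],       # hypothetical: accepted answers (e.g. answer plus aliases)
) -> bool:
    """Return True if the model answers at least one phrasing of the question correctly."""
    gold = {normalize(a) for a in gold_answers}
    return any(normalize(answer_fn(q)) in gold for q in questions)


# The three phrasings quoted above for case_id == 2.
questions = [
    'Which writer\'s country of citizenship is the same as the author of "Misery"?',
    'What country does the author of "Misery" and another writer share their citizenship?',
    'What is the nationality of the author of "Misery"?',
]


def stub_model(question: str) -> str:
    """Stand-in for a real model: it only handles the clearly worded third question."""
    if question.startswith("What is the nationality"):
        return "United States of America"
    return "unknown"


# Illustrative gold answer, not copied from the dataset file.
case_is_correct = any_question_correct(questions, stub_model, ["United States of America"])
```

With this scheme, a hard-to-interpret phrasing does not penalize the model as long as it answers one of the other phrasings correctly.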
Hope this helps!
Got it, thanks!