MQuAKE

Small mistake in the sample dataset record?

Open kmeng01 opened this issue 1 year ago • 3 comments

Hi there! I noticed that the sample data record in the README might contain a mistake -- just wanted to ask for clarification in case I'm misunderstanding something.

The questions for Case 2500 ask for the capital of the country in which the CEO of Triple H holds citizenship. Afaik, Triple H is a person, not an organization? So perhaps it should read "the CEO of the employer of Triple H" instead?

Seems like this might be an issue with GPT-3.5, which generated the multi-hop questions. Is that correct?

kmeng01, May 03 '24 17:05

Also, for case_id == 2, I see this:

[
  'Which writer\'s country of citizenship is the same as the author of "Misery"?',
  'What country does the author of "Misery" and another writer share their citizenship?',
  'What is the nationality of the author of "Misery"?'
]

But it seems like the question should really just be about the nationality of the author of "Misery".

Let me know if there's anything I'm missing!

kmeng01, May 04 '24 08:05

Hi @kmeng01! Thanks for pointing this out! Yes, GPT-3.5 does not seem to work well on case 2500 -- I plan to replace it with another example in the README.

We also found that some generated multi-hop questions can be hard to interpret (e.g., the first two questions in case 2), so we generate three questions and check the correctness of predictions in an "or" manner (i.e., a case counts as correct if the model can understand and answer any one of these questions).
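For readers who want to see what this "or" check could look like, here is a minimal sketch. It is not the repository's evaluation code: `is_case_correct`, `normalize`, and `toy_model` are hypothetical names, the exact-match comparison is an assumption, and the accepted answer is only an illustrative value rather than a record taken from the dataset.

```python
from typing import Callable, Iterable


def normalize(text: str) -> str:
    """Lowercase and trim so trivially different strings still match."""
    return text.strip().lower()


def is_case_correct(
    questions: Iterable[str],
    predict: Callable[[str], str],
    accepted_answers: Iterable[str],
) -> bool:
    """Return True if the model answers at least one phrasing correctly.

    `predict` is a hypothetical stand-in for whatever function queries the
    model; matching is a simple normalized exact match (an assumption).
    """
    accepted = {normalize(a) for a in accepted_answers}
    return any(normalize(predict(q)) in accepted for q in questions)


def toy_model(question: str) -> str:
    """Toy model that only understands the clearest phrasing."""
    if question.startswith("What is the nationality"):
        return "United States of America"
    return "unknown"


# The three phrasings quoted above for case_id == 2.
questions = [
    'Which writer\'s country of citizenship is the same as the author of "Misery"?',
    'What country does the author of "Misery" and another writer share their citizenship?',
    'What is the nationality of the author of "Misery"?',
]

# Illustrative accepted answer, not copied from the dataset.
accepted_answers = ["United States of America"]

case_ok = is_case_correct(questions, toy_model, accepted_answers)
```

With this scheme, the two hard-to-interpret phrasings do not penalize the model as long as it answers the third one correctly.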

Hope this helps!

a3616001, May 05 '24 03:05

Got it, thanks!

kmeng01, May 05 '24 09:05