unitxt
fixed mmmu by cooking options from answer, when options is not given in the instance
There are 30 cards in the group `cards.mmmu.*`, 16 of which (more than half) are erroneous: they do not pass `unitxt.api.load_dataset`:
mmmu_main.pdf
Exploring the original HF datasets, the following came up: mmmu_observations.pdf
- `answer == "?"` if and only if the instance is in split `test`.
- The `mmmu` card effectively discards split `test`.
- There are 10500 instances in split `test`, 900 in `validation` (which the card takes to be its test split), and 150 in `dev` (which the card takes to be its train split).
- Field `options` (to become the choices) is empty in 53 `validation` instances, 9 `dev` instances, and 627 `test` instances.
- If not empty, the `options` field has length > 1, reaching up to 9, and `answer` indexes into it in the form of A, B, C, ...
- Only when `options` is empty does field `answer` have an irregular value, one that looks like the correct answer to `question`: the correct answer itself, and not the A, B, C that refers to it.
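To make the regular case concrete, here is a minimal sketch of how a letter in `answer` selects an entry of `options`. It assumes `options` has already been parsed into a Python list of strings; `resolve_answer` is a hypothetical helper written for illustration and is not part of unitxt or of the card.

```python
def resolve_answer(options: list[str], answer: str) -> str:
    """Return the option that a letter answer (A, B, C, ...) points to."""
    # Only a single uppercase letter can index into the options.
    if len(answer) != 1 or not "A" <= answer <= "Z":
        raise ValueError(f"answer {answer!r} is not a single letter A..Z")
    index = ord(answer) - ord("A")  # A -> 0, B -> 1, C -> 2, ...
    if index >= len(options):
        raise ValueError(f"answer {answer!r} is out of range for {len(options)} options")
    return options[index]


print(resolve_answer(["red", "green", "blue"], "B"))
```

With an empty `options` list, every letter is out of range, which is consistent with the irregular instances described above.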
Therefore, the fix is as follows: for an instance with an empty options field, the card cooks an options field in the form of [answer], and then changes answer to read A.
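The transformation can be sketched in plain Python as follows. This is an illustration only, assuming each instance is a dict whose `options` is a list and whose `answer` is a string; `cook_options` is a hypothetical name, and the card itself may express the same logic through unitxt preprocessing steps rather than a function like this.

```python
def cook_options(instance: dict) -> dict:
    """Return a copy of the instance, cooking options from answer when options is empty."""
    fixed = dict(instance)  # shallow copy, so the caller's instance is left untouched
    if not fixed.get("options"):
        # The raw answer is the correct answer itself, so it becomes the only choice.
        fixed["options"] = [fixed["answer"]]
        # "A" now refers to that single cooked option.
        fixed["answer"] = "A"
    return fixed


open_ended = {"question": "What is 2 + 3?", "options": [], "answer": "5"}
multiple_choice = {"question": "Pick one.", "options": ["x", "y"], "answer": "B"}

print(cook_options(open_ended))
print(cook_options(multiple_choice))
```

Instances that already carry options pass through unchanged, so only the irregular instances are affected.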
This fixed all the errors: