lm-evaluation-harness
Added TurkishMMLU to LM Evaluation Harness
In this pull request, I would like to add our work, TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish, to LM Evaluation Harness. You can find the details of our work in our repository: https://github.com/ArdaYueksel/TurkishMMLU. Our dataset is also available on Hugging Face: https://huggingface.co/datasets/AYueksel/TurkishMMLU
Key Features:
- An MMLU variant in the Turkish language.
- Includes a separate development set.
- A chain-of-thought configuration is available.
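
For reviewers who want to try the task, here is a minimal sketch of how an evaluation command could be assembled. It only builds the `lm_eval` argument list and does not run anything. The task name `turkishmmlu`, the model name `gpt2`, and the helper `build_eval_command` are illustrative assumptions; please check the task YAML files in this PR for the actual task names, including the chain-of-thought variant.

```python
import shlex

# Assumed task name for illustration only; the real name is defined in the
# task YAML files added by this PR.
TASK_NAME = "turkishmmlu"


def build_eval_command(model_name, task=TASK_NAME, num_fewshot=5):
    """Build an lm_eval CLI invocation as an argument list (does not run it)."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_name}",
        "--tasks", task,
        "--num_fewshot", str(num_fewshot),
    ]


if __name__ == "__main__":
    # "gpt2" is a placeholder model; substitute any Hugging Face model ID.
    print(shlex.join(build_eval_command("gpt2")))
```

The resulting list can be passed to `subprocess.run` once `lm_eval` is installed, or the printed string can be pasted into a shell.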
The test failures are unrelated to this change.