lm-evaluation-harness Add new benchmark: Basque bench

BasqueBench is a benchmark of Basque-language tasks covering several evaluation areas. The datasets consist of professional translations of relevant English datasets and newly created datasets in Basque. The README.md contains detailed information on all the tasks included in the benchmark.

Jul 30 '24 09:07 zxcvuser

All committers have signed the CLA.

Jul 30 '24 09:07 CLAassistant

Thanks very much for this PR. I identified just a few small issues. Could you also run

pre-commit run --all-files

to fix the linting issues?

Sep 18 '24 16:09 baberabb

These are the changes made:

Added the benchmark info in lm_eval/tasks/README.md
Replaced "-" with "_" in the create_files script in flores_eu and added weight_by_size: false
Ran the linters
Removed the grouping in the mgsm and copa tasks (they were pointing to pre-existing benchmarks)

With these changes, it should all be fine now. Thank you!

Sep 27 '24 15:09 zxcvuser