
Add new benchmark: Basque bench

Open zxcvuser opened this issue 1 year ago • 2 comments

BasqueBench is a benchmark for tasks in Basque that cover several evaluation areas. The datasets consist of professional translations of relevant English datasets and newly created datasets in Basque. The README.md contains detailed information on all the tasks included in the benchmark.

zxcvuser, Jul 30 '24 09:07

CLA assistant check
All committers have signed the CLA.

CLAassistant, Jul 30 '24 09:07

Thanks very much for this PR. I identified just a few small issues. Could you also run

pre-commit run --all-files

to fix the linting issues?

baberabb, Sep 18 '24 16:09

These are the changes I made:

  • Added the benchmark info to lm_eval/tasks/README.md
  • Replaced "-" with "_" in the create_files script in flores_eu and added weight_by_size: false
  • Ran the linters
  • Removed the grouping in the mgsm and copa tasks (they were pointing to pre-existing benchmarks)

With these changes, it should all be fine now. Thank you!

zxcvuser, Sep 27 '24 15:09
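
For context on the weight_by_size: false setting mentioned in the list of changes: as I understand it, this option controls whether a group score is a size-weighted mean of its subtask scores or a plain unweighted mean. The snippet below is only a minimal sketch of that difference, not the harness's actual implementation; the aggregate function, the scores, and the document counts are all hypothetical.

```python
def aggregate(scores, sizes, weight_by_size):
    """Combine per-subtask scores into one group score (illustrative only)."""
    if weight_by_size:
        # Size-weighted mean: larger subtasks contribute more.
        total = sum(sizes)
        return sum(s * n for s, n in zip(scores, sizes)) / total
    # Unweighted mean: every subtask counts equally.
    return sum(scores) / len(scores)


# Hypothetical per-subtask scores and document counts.
scores = [0.50, 0.90]
sizes = [100, 300]

macro = aggregate(scores, sizes, weight_by_size=False)
micro = aggregate(scores, sizes, weight_by_size=True)

print(f"unweighted mean: {macro:.2f}")
print(f"size-weighted mean: {micro:.2f}")
```

With weight_by_size set to false, each subtask counts equally toward the group score regardless of how many documents it has.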