lm-evaluation topic
lm-evaluation repositories
latxa
Stars: 31, Forks: 0, Watchers: 31
Latxa: An Open Language Model and Evaluation Suite for Basque
xFinder
Stars: 181, Forks: 7, Watchers: 181
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
RAG-evaluation-harnesses
Stars: 23, Forks: 2, Watchers: 23
An evaluation suite for Retrieval-Augmented Generation (RAG).
CiteME
Stars: 48, Forks: 5, Watchers: 48
CiteME is a benchmark that tests how well language models can find the papers cited in scientific texts.