language-model-evaluation topic
language-model-evaluation repositories
LMChallenge
61 Stars, 18 Forks
A library & tools to evaluate predictive language models.
PrivacyLens
39 Stars, 8 Forks, 39 Watchers
A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM agents. (NeurIPS 2024 D&B)