language-model-evaluation topic

LMChallenge

61 stars, 18 forks

A library & tools to evaluate predictive language models.

PrivacyLens

39 stars, 8 forks, 39 watchers

A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM agents. (NeurIPS 2024 D&B)