ai-evaluation topic


vivaria

53 stars, 15 forks

Vivaria is METR's tool for running evaluations and conducting agent elicitation research.