Truthfulness evaluation

Open saattrupdan opened this issue 1 year ago • 1 comment

This will add an evaluation of decoder language models, orthogonal to the existing ones, that measures how often they hallucinate.

Jan 20 '25 17:01 saattrupdan

This dataset might be relevant: https://openai.com/index/introducing-simpleqa/
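
A rough sketch of what the scoring could look like, assuming each model answer has already been graded as correct, incorrect, or not attempted (the three-way grading that SimpleQA describes). The label strings, the function name `truthfulness_scores`, and the metric names are hypothetical, and how the grades are produced is left out here.

```python
from collections import Counter

# Hypothetical grade labels, one per graded model answer.
CORRECT = "correct"
INCORRECT = "incorrect"
NOT_ATTEMPTED = "not_attempted"


def truthfulness_scores(grades: list[str]) -> dict[str, float]:
    """Aggregate per-answer grades into simple truthfulness metrics."""
    counts = Counter(grades)
    total = len(grades)
    # An answer counts as attempted if the model committed to a claim.
    attempted = counts[CORRECT] + counts[INCORRECT]
    return {
        # Share of all questions answered correctly.
        "correct": counts[CORRECT] / total if total else 0.0,
        # Share of all questions answered with a wrong claim.
        "hallucination_rate": counts[INCORRECT] / total if total else 0.0,
        # Accuracy restricted to the questions the model attempted.
        "correct_given_attempted": (
            counts[CORRECT] / attempted if attempted else 0.0
        ),
    }


if __name__ == "__main__":
    example = [CORRECT, CORRECT, INCORRECT, NOT_ATTEMPTED]
    print(truthfulness_scores(example))
```

Keeping "not attempted" separate from "incorrect" means a model that declines to answer is not counted as hallucinating, which seems to be the distinction this evaluation is after.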

Jan 20 '25 17:01 saattrupdan