tofu

Which dataset should we use for evaluation?

Open Yuda-Jin opened this issue 5 months ago • 1 comment

Which dataset config was used for the leaderboard? Should I use forget10_perturbed, plain forget10, or retain90? If I use the forget10 dataset, how should I set perturbed_answer_key and eval_task?

Yuda-Jin, Sep 19 '24 04:09