tofu
Eval generates answers identical to the dataset
I finetuned Llama 2 on the full dataset, ran gradient ascent on forget05, and then evaluated the unlearned model on forget05. Surprisingly, when I looked at eval_log_forget.json, the generated responses were identical to the answers in the dataset. For example:

Question: What is the full name of the geology author born in Karachi, Pakistan on 06/30/1975?
Answer: The author's name is Hina Ameen.
Generated Answer: The author's name is Hina Ameen.

Also, the p-value is very low (7.82e-19). Am I interpreting the evaluation results correctly?
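
To check whether this happens for one example or for the whole forget set, a rough sketch like the one below could be used. Note that `exact_match_rate` is a hypothetical helper, not part of the TOFU codebase, and it assumes nothing about the layout of eval_log_forget.json: the (ground truth, generated) pairs would have to be pulled out of the log by hand or with your own loader.

```python
def exact_match_rate(pairs):
    """Return the fraction of (ground_truth, generated) pairs that are identical.

    Hypothetical helper for illustration only. Strings are compared after
    collapsing whitespace and lowercasing, so trivial formatting differences
    do not hide a memorized answer.
    """
    def normalize(text):
        return " ".join(text.split()).lower()

    if not pairs:
        return 0.0
    hits = sum(normalize(truth) == normalize(generated) for truth, generated in pairs)
    return hits / len(pairs)


# The single example quoted above; more pairs from the log would go here.
pairs = [
    ("The author's name is Hina Ameen.", "The author's name is Hina Ameen."),
]
rate = exact_match_rate(pairs)
print(f"exact-match rate on forget set: {rate:.2%}")
```

A rate close to 100% on forget05 would suggest the model still reproduces the forget-set answers, which would be consistent with a very low p-value.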