Should attack success rate be computed only on true positives?

Open longthangvu opened this issue 8 months ago • 0 comments

Hi, thanks for sharing this great repo and all the attack implementations!

I had a question about evaluation: if the goal is to compare attack effectiveness across different victim models, shouldn't we compute the success rate only over true positives (i.e., benign samples the model originally classified correctly)? The number of such samples differs between models, so a success rate computed over all samples can be misleading: samples the model already misclassifies before the attack enter the rate even though there is no correct decision for the attack to flip. In my opinion, the success percentage is more faithful when measured over the originally correct predictions, since each counted adversarial sample then actually flips the decision.
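To make the difference concrete, here is a minimal sketch of the two ways of computing the rate. It assumes an untargeted attack and plain label arrays; the function name `attack_success_rates` and the toy labels are hypothetical and are not taken from this repo, so I am not claiming this is how the current evaluation works.

```python
import numpy as np


def attack_success_rates(y_true, y_clean_pred, y_adv_pred):
    """Return (rate over all samples, rate over originally correct samples).

    Hypothetical helper for illustration; assumes an untargeted attack,
    where "success" means the adversarial prediction is wrong.
    """
    y_true = np.asarray(y_true)
    y_clean_pred = np.asarray(y_clean_pred)
    y_adv_pred = np.asarray(y_adv_pred)

    clean_correct = y_clean_pred == y_true  # correct before the attack
    adv_wrong = y_adv_pred != y_true        # wrong after the attack

    # Rate over all samples: also counts samples that were already
    # misclassified before the attack.
    rate_all = float(adv_wrong.mean())

    # Rate over originally correct samples: counts only real flips.
    n_correct = int(clean_correct.sum())
    flipped = int((clean_correct & adv_wrong).sum())
    rate_correct = flipped / n_correct if n_correct else float("nan")

    return rate_all, rate_correct


# Toy data: sample 2 is misclassified before the attack;
# the attack flips samples 0 and 4.
y_true = [0, 1, 1, 0, 1]
y_clean_pred = [0, 1, 0, 0, 1]
y_adv_pred = [1, 1, 0, 0, 0]

rate_all, rate_correct = attack_success_rates(y_true, y_clean_pred, y_adv_pred)
print(f"over all samples:         {rate_all:.2f}")      # 0.60
print(f"over originally correct:  {rate_correct:.2f}")  # 0.50
```

In this toy case the rate over all samples (3/5) is higher than the rate over originally correct samples (2/4) only because one sample was already wrong before the attack, which is the effect I am worried about when clean accuracy differs between victim models.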

Would love to hear your thoughts!

May 01 '25 04:05 longthangvu