stAdv
About a problem with the demo
I replaced the model in your demo with model A from the paper's experiments and tested it on a single image, but the produced adversarial example looks very different from the original image. The tau is 0.05. What should I do? Thank you so much!
I don't think that it looks that bad. Have a look at Fig. 2 in arXiv:1801.02612 (the original paper), the digits they have generated do not look very different from this one. (Also it depends somewhat on luck: for some digits the perturbation is more noticeable than for others.) But in this case you can try increasing tau (and stop when it no longer predicts a 3) to improve spatial smoothness.
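A minimal sketch of such a tau sweep (the `run_stadv_attack` helper and the `image` variable are hypothetical shorthand for the attack as set up in the demo notebook):

```python
# Hypothetical helper standing in for the demo notebook's attack:
# it should return the adversarial image and the label predicted for it.
def run_stadv_attack(image, target_label, tau):
    raise NotImplementedError("wrap the demo's attack here")

target = 3                      # target class from the example above
best_adv, best_tau = None, None

for tau in [0.05, 0.1, 0.2, 0.5, 1.0]:
    adv_image, pred = run_stadv_attack(image, target, tau)
    if pred != target:
        break                   # attack no longer succeeds: stop increasing tau
    best_adv, best_tau = adv_image, tau

print("largest tau that still yields a successful attack:", best_tau)
```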
The experiment is not over yet, but the trend is already clear: out of 779 images, only 272 adversarial examples attack successfully.
OK, thank you for your answer. I have a new problem: I calculated the attack success rate of adversarial examples generated against model A, and the result is bad. The original paper reports an attack success rate of 99.95% (Table 1), which confused me for quite a while.
What do you mean by "bad"? Do you find a very different attack success rate? Because the example you show above would count as a successful attack (it gets classified as the target, 3).
If it's about the ASR: do you really have the exact same model as model A? Same architecture (including the dropout that is commented out in your code snippet)? Same training procedure? And the same procedure for calculating the ASR?
Yes, I have the same model as model A. The dropout layers were used during training; when calculating the ASR I comment them out. Although the experiment is still running, the result is bad: only 272 images were attacked successfully. The current ASR is 272/779, i.e. roughly one third of the images are attacked successfully.
Oh, I almost forgot: when I added softmax as an activation function in the FC layers (as in Table 5 of the original paper), I could not attack any image successfully at all. So I dropped the softmax, and that works.
There are a number of checks you can make:
- check that you reproduce the accuracy stated in arXiv:1801.02612 on clean images. If not, you do not have the same model as model A.
- make sure you have version 0.2 of the library. Version 0.1 was computing something different.
- investigate whether the failure cases come from non-convergence or from convergence that still predicts the initial label (1, in the example above). You can check the info dictionary returned by `stadv.optimization.lbfgs` (see the sketch below). If you have NaN gradients, try increasing the `epsilon` parameter of `stadv.losses.flow_loss` to e.g. 1e-4.
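As a rough illustration, here is how one might triage a failure case from that info dictionary (a sketch only: I assume the dictionary follows `scipy.optimize.fmin_l_bfgs_b` conventions, with keys like `warnflag` and `grad`):

```python
import numpy as np

def triage_failure(info, pred_label, initial_label):
    """Classify one failed attack from the L-BFGS info dictionary.

    `info` is assumed to follow scipy.optimize.fmin_l_bfgs_b conventions.
    """
    if np.any(np.isnan(info.get('grad', np.zeros(1)))):
        # NaN gradients: the suggested fix is on the loss side, e.g.
        # stadv.losses.flow_loss(flows, epsilon=1e-4)
        # (the `epsilon` keyword is mentioned above; the `flows` argument name is my assumption).
        return 'NaN gradients'
    if info.get('warnflag', 0) != 0:
        return 'non-convergence'
    if pred_label == initial_label:
        return 'converged but still predicts the initial label'
    return 'converged and attack succeeded'
```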
Yes, as done in https://github.com/rakutentech/stAdv/blob/master/demo/simple_mnist.ipynb you should feed the logits to the stAdv-related losses, and not the output of the softmax. But did that solve your problem or is it something you had already done?
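For concreteness, a minimal sketch of the relevant part of the classifier head (TF1-style, as in the demo; the point is only that the stAdv losses receive raw logits, never softmax outputs):

```python
import tensorflow as tf

def classifier_head(features, num_classes=10):
    # Final FC layer WITHOUT a softmax activation: these are the raw logits.
    logits = tf.layers.dense(features, num_classes, activation=None)
    # Softmax only for reading off probabilities / accuracy, never for the attack.
    probs = tf.nn.softmax(logits)
    return logits, probs

# Feed `logits` (not `probs`) to the stAdv-related losses, as in the demo notebook.
```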
First of all, thank you very much for your patience.
- The accuracy on clean images is 98.6%; the accuracy in Table 1 is 98.58%.
- The library version is 0.2.
- The failure cases come from convergence (not non-convergence), and the gradients are normal.
- Yes, I solved that problem by removing the softmax.

The picture shows the L-BFGS result info from the failure cases.
Thanks for the screenshot (indeed it looks sane) and glad you solved the problem! I am closing the issue.
No no, I have not solved this problem. The result info looks normal, but the ASR is only about 1/3, which is far too low. I really hope to get your help.
Ah, I thought that the issue was solved. Sorry I don't have time to look into it in more detail now but let me reopen the issue for later.
That's OK, thank you for your patience all the same.
I have tried to reproduce your results using the demo notebook (`demo/simple_mnist.ipynb`), running over randomly picked test samples (and randomly picked target classes) in a loop to estimate the attack success rate (ruling out samples misclassified even with no perturbation). In this case, with tau=0.05, I obtain a success rate of about 72%. This is much larger than what you observe (but you are using a different CNN), yet much smaller than what Xiao et al. report in arXiv:1801.02612 (they claim an ASR of 99.95% or more).
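For reference, the bookkeeping looks roughly like this (schematic: `classify` and `attack` are hypothetical stand-ins for the clean model and the demo's stAdv attack):

```python
import numpy as np

rng = np.random.RandomState(0)
tau = 0.05
n_trials = n_success = 0

for _ in range(1000):
    i = rng.randint(len(x_test))
    if classify(x_test[i]) != y_test[i]:
        continue                                   # rule out samples misclassified with no perturbation
    target = rng.choice([c for c in range(10) if c != y_test[i]])
    adv = attack(x_test[i], target, tau=tau)       # hypothetical: the demo's stAdv attack
    n_trials += 1
    n_success += int(classify(adv) == target)      # success = classified as the target

print("estimated ASR: %.1f%%" % (100.0 * n_success / max(n_trials, 1)))
```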
Here are possible reasons for this discrepancy:
- A difference in the computation of the ASR: I had another look at the Xiao et al. paper, and did not find a clear definition of it. I hope that they really compute the total fraction of attacks that succeed, and not the fraction of test images for which at least one targeted attack succeeds.
- A difference between what Xiao et al. used in their experiments and what they wrote in Section 3 of the paper. For example, you can notice that Eq. (4) p. 5 and Eq. (5) p. 14 are different (square root of sum vs. sum of square root; also the 1/n factor; see the schematic comparison at the end of this comment). What I have implemented in v0.2 of the library is Eq. (4) (what they claim they have used in the paper). Actually, in v0.1 of the library I had mistakenly implemented something closer to Eq. (5) for the flow loss and found a higher ASR.
- A subtle difference related to the optimization with L-BFGS, but I would find it very surprising that it makes such a large difference.
- A bug in my code.
- A bug in their code.
Since Xiao et al. have not provided a public implementation of their attack, it is difficult to pinpoint the origin of the discrepancy. Also, by simple visualization the generated adversarial examples look quite good to me. For practical purposes, you may try to decrease tau until you have the desired ASR. Or, if you are mainly concerned with reproducing the results from the Xiao et al. paper, I suggest that you contact the authors. In any case, if you have any update on that please let us know.
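To make the second point above concrete, here is a schematic comparison of the two aggregations (deliberately not the paper's exact equations, only an illustration that "square root of sum" and "(1/n) × sum of square roots" produce flow-loss values of very different magnitude, so the same tau regularizes very differently):

```python
import numpy as np

rng = np.random.RandomState(0)
# Toy flow differences to one neighbor for a 28x28 flow field.
du = 0.1 * rng.randn(28, 28)   # horizontal component
dv = 0.1 * rng.randn(28, 28)   # vertical component
n = du.size

eq4_style = np.sum(np.sqrt(du ** 2 + dv ** 2))                  # square root of sum
eq5_style = np.sum(np.sqrt(du ** 2) + np.sqrt(dv ** 2)) / n     # sum of square roots, with 1/n

print(eq4_style, eq5_style, eq4_style / eq5_style)
```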
Thank you so much! I calculate the ASR the same way as you do (ruling out samples misclassified even with no perturbation). I decreased tau to 0.005, and model A's ASR improved to about 76%, which is still much smaller than the 99.95% or more that Xiao et al. claim, and their tau is 0.05. I emailed the authors about this problem several days ago and they have not replied yet. I will keep trying to contact them; if I have any update, I will let you know.
Ok, thank you! On a more general note, it makes sense to have a scaling according to the number of pixels in the image. Otherwise, if you keep the same tau and go from MNIST / CIFAR-10 to ImageNet, the flow loss will typically be 100 times larger (while the scale of the adversarial loss will not change).
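Back-of-the-envelope numbers behind the "100 times larger" (the ImageNet input size below, 299x299, is just a typical choice, not taken from the paper):

```python
mnist, cifar10, imagenet = 28 * 28, 32 * 32, 299 * 299
print(imagenet / mnist)    # ~114x more pixels than MNIST
print(imagenet / cifar10)  # ~87x more pixels than CIFAR-10
```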
I would like to offer my results so far:
- Targeting the second-likeliest label I get an ASR of ~40%. However, if I include the images where stAdv converged to the decision boundary the ASR goes as high as ~80%. If I target all labels individually and count the fraction of test images for which at least one targeted attack succeeds then the ASR increases to ~90% (100% including decision boundary). As you mentioned @berangerd, Xiao et al. don't define how they measure their ASR, which is rather unfortunate. (All results are for model A)
- As Xiao et al. write here, I tried using the Adam solver to speed up the runtime. By tuning the learning rate I get an ASR of ~96% (all beyond the decision boundary), which is better but not the 100% they claim. This raises the question of whether the parameters you chose for L-BFGS-B are optimal, @berangerd? For example, is there a particular reason why you defined `default_extra_kwargs` the way you did? Also, Xiao et al. talk about a learning rate for L-BFGS, which `scipy` doesn't provide.
@hhhzzj did you hear back from Xiao et al.?
Thanks for the feedback @anianruoss. For the L-BFGS-B algorithm, the wrapper in `stadv.optimization.lbfgs` is using the default parameters from `scipy.optimize.fmin_l_bfgs_b` except for:
- `factr=10` (SciPy's default is 1e7)
- `m=20` (SciPy's default is 10)

I have not made extensive tests with the parameters of `scipy.optimize.fmin_l_bfgs_b`. The idea was simply to go for high accuracy (while being a bit slow).
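For reference, this is what those two non-default settings look like in a direct `scipy.optimize.fmin_l_bfgs_b` call (toy quadratic objective, not the stAdv loss):

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def objective(x):
    # Returns (value, gradient), as expected when no separate fprime is given.
    value = np.sum((x - 1.0) ** 2)
    grad = 2.0 * (x - 1.0)
    return value, grad

x0 = np.zeros(4)
x_opt, f_opt, info = fmin_l_bfgs_b(
    objective, x0,
    factr=10,  # much stricter than SciPy's default of 1e7: high accuracy, slower
    m=20,      # more stored correction pairs than the default of 10
)
print(x_opt, info['warnflag'])
```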
Also, I didn't know that Xiao et al. had tried using Adam. I haven't tried comparing with the first-order optimizers already implemented in TensorFlow (but it is straightforward to do, and it will be much simpler and faster than using `stadv.optimization.lbfgs`, as everything will be defined in the graph directly)!
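A rough sketch of what the in-graph Adam variant could look like (TF1-style; `build_model` is a hypothetical classifier graph returning logits, and the `stadv.layers.flow_st` call, the flow tensor shape, and the exact loss signatures are assumptions to be checked against the demo notebook):

```python
import tensorflow as tf
import stadv

H = W = 28
images = tf.placeholder(tf.float32, [1, H, W, 1])
targets = tf.placeholder(tf.int64, [1])
tau = 0.05

# The flow field becomes a trainable variable instead of an L-BFGS parameter
# vector (shape convention [batch, 2, H, W] assumed).
flows = tf.Variable(tf.zeros([1, 2, H, W]), name='flows')

perturbed = stadv.layers.flow_st(images, flows, 'NHWC')   # assumed signature
logits = build_model(perturbed)                           # hypothetical classifier, outputs logits
loss = stadv.losses.adv_loss(logits, targets) + tau * stadv.losses.flow_loss(flows)

# Optimize only the flow variable; the classifier weights stay fixed.
train_op = tf.train.AdamOptimizer(learning_rate=1e-2).minimize(loss, var_list=[flows])
```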
Thank you! What setting of tau do you use on ImageNet?