stAdv
About a problem with the demo
I replaced the model in your demo with model A from the paper's experiments and tested it on a single image, but the produced adversarial example looks very different from the original image. The tau is 0.05. What should I do? Thank you so much!
I don't think that it looks that bad. Have a look at Fig. 2 in arXiv:1801.02612 (the original paper), the digits they have generated do not look very different from this one. (Also it depends somewhat on luck: for some digits the perturbation is more noticeable than for others.) But in this case you can try increasing tau (and stop when it no longer predicts a 3) to improve spatial smoothness.
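A minimal sketch of such a tau sweep (the `run_stadv_attack` helper and the `image` variable are hypothetical shorthand for the attack as set up in the demo notebook):

```python
# Hypothetical helper standing in for the demo notebook's attack:
# it should return the adversarial image and the label predicted for it.
def run_stadv_attack(image, target_label, tau):
    raise NotImplementedError("wrap the demo's attack here")

target = 3                      # target class from the example above
best_adv, best_tau = None, None

for tau in [0.05, 0.1, 0.2, 0.5, 1.0]:
    adv_image, pred = run_stadv_attack(image, target, tau)
    if pred != target:
        break                   # attack no longer succeeds: stop increasing tau
    best_adv, best_tau = adv_image, tau

print("largest tau that still yields a successful attack:", best_tau)
```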
The experiment is not over yet, but the trend is already clear: out of 779 images, only 272 adversarial examples attack successfully.
OK, thank you for your answer. I have a new problem: I calculated the attack success rate of adversarial examples generated against model A, and the result is bad. The original paper reports an attack success rate of 99.95% (Table 1), which confused me for quite a while.
What do you mean by "bad"? Do you find a very different attack success rate? Because the example you show above would count as a successful attack (it gets classified as the target, 3).
If it's about the ASR: do you really have the exact same model as model A? Same architecture (including the dropout that is commented out in your code snippet)? Same training procedure? And the same procedure for calculating the ASR?
Yes, I have the same model as model A. The dropout layers were used during training; when calculating the ASR I comment them out. Although the experiment is still running, the result is bad: only 272 images were attacked successfully. The current ASR is 272/779, i.e. roughly one third of the images are attacked successfully.
Oh, I almost forgot: when I added softmax as an activation function in the FC layers (as in Table 5 of the original paper), I could not attack any image successfully at all. So I dropped the softmax, and that works.
There are a number of checks you can make:
- check that you reproduce the accuracy stated in arXiv:1801.02612 on clean images. If not, you do not have the same model as model A.
- make sure you have version 0.2 of the library. Version 0.1 was computing something different.
- investigate whether the failure cases come from non-convergence or from convergence that still predicts the initial label (1, in the example above). You can check the info dictionary returned by `stadv.optimization.lbfgs` (see the sketch below). If you have NaN gradients, try increasing the `epsilon` parameter of `stadv.losses.flow_loss` to e.g. 1e-4.
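As a rough illustration, here is how one might triage a failure case from that info dictionary (a sketch only: I assume the dictionary follows `scipy.optimize.fmin_l_bfgs_b` conventions, with keys like `warnflag` and `grad`):

```python
import numpy as np

def triage_failure(info, pred_label, initial_label):
    """Classify one failed attack from the L-BFGS info dictionary.

    `info` is assumed to follow scipy.optimize.fmin_l_bfgs_b conventions.
    """
    if np.any(np.isnan(info.get('grad', np.zeros(1)))):
        # NaN gradients: the suggested fix is on the loss side, e.g.
        # stadv.losses.flow_loss(flows, epsilon=1e-4)
        # (the `epsilon` keyword is mentioned above; the `flows` argument name is my assumption).
        return 'NaN gradients'
    if info.get('warnflag', 0) != 0:
        return 'non-convergence'
    if pred_label == initial_label:
        return 'converged but still predicts the initial label'
    return 'converged and attack succeeded'
```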
Yes, as done in https://github.com/rakutentech/stAdv/blob/master/demo/simple_mnist.ipynb you should feed the logits to the stAdv-related losses, and not the output of the softmax. But did that solve your problem or is it something you had already done?
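For concreteness, a minimal sketch of the relevant part of the classifier head (TF1-style, as in the demo; the point is only that the stAdv losses receive raw logits, never softmax outputs):

```python
import tensorflow as tf

def classifier_head(features, num_classes=10):
    # Final FC layer WITHOUT a softmax activation: these are the raw logits.
    logits = tf.layers.dense(features, num_classes, activation=None)
    # Softmax only for reading off probabilities / accuracy, never for the attack.
    probs = tf.nn.softmax(logits)
    return logits, probs

# Feed `logits` (not `probs`) to the stAdv-related losses, as in the demo notebook.
```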
First of all, thank you very much for your patience.
- The accuracy on clean images is 98.6%; the accuracy in Table 1 is 98.58%.
- The library version is 0.2.
- The failure cases come from convergence (not non-convergence), and the gradients are normal.
- Yes, I solved that problem by removing the softmax.

The picture shows the L-BFGS result info from the failure cases.
Thanks for the screenshot (indeed it looks sane) and glad you solved the problem! I am closing the issue.
No no, I have not solved this problem. The result info looks normal, but the ASR is only about 1/3, which is far too low. I really hope to get your help.
Ah, I thought that the issue was solved. Sorry I don't have time to look into it in more detail now but let me reopen the issue for later.
That's OK, thank you for your patience all the same.
I have tried to reproduce your results using the demo notebook (`demo/simple_mnist.ipynb`), running over randomly picked test samples (and randomly picked target classes) in a loop to estimate the attack success rate (ruling out samples misclassified even with no perturbation). In this case, with tau=0.05, I obtain a success rate of about 72%. This is much larger than what you observe (but you are using a different CNN), yet much smaller than what Xiao et al. report in arXiv:1801.02612 (they claim an ASR of 99.95% or more).
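For reference, the bookkeeping looks roughly like this (schematic: `classify` and `attack` are hypothetical stand-ins for the clean model and the demo's stAdv attack):

```python
import numpy as np

rng = np.random.RandomState(0)
tau = 0.05
n_trials = n_success = 0

for _ in range(1000):
    i = rng.randint(len(x_test))
    if classify(x_test[i]) != y_test[i]:
        continue                                   # rule out samples misclassified with no perturbation
    target = rng.choice([c for c in range(10) if c != y_test[i]])
    adv = attack(x_test[i], target, tau=tau)       # hypothetical: the demo's stAdv attack
    n_trials += 1
    n_success += int(classify(adv) == target)      # success = classified as the target

print("estimated ASR: %.1f%%" % (100.0 * n_success / max(n_trials, 1)))
```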
Here are possible reasons for this discrepancy:
- A difference in the computation of the ASR: I had another look at the Xiao et al. paper, and did not find a clear definition of it. I hope that they really compute the total fraction of attacks that succeed, and not the fraction of test images for which at least one targeted attack succeeds.
- A difference between what Xiao et al. used in their experiments and what they wrote in Section 3 of the paper. For example, you can notice that Eq. (4) p. 5 and Eq. (5) p. 14 are different (square root of sum vs. sum of square root; also the 1/n factor; see the schematic comparison at the end of this comment). What I have implemented in v0.2 of the library is Eq. (4) (what they claim they have used in the paper). Actually, in v0.1 of the library I had mistakenly implemented something closer to Eq. (5) for the flow loss and found a higher ASR.
- A subtle difference related to the optimization with L-BFGS, but I would find it very surprising that it makes such a large difference.
- A bug in my code.
- A bug in their code.
Since Xiao et al. have not provided a public implementation of their attack, it is difficult to pinpoint the origin of the discrepancy. Also, by simple visualization the generated adversarial examples look quite good to me. For practical purposes, you may try to decrease tau until you have the desired ASR. Or, if you are mainly concerned with reproducing the results from the Xiao et al. paper, I suggest that you contact the authors. In any case, if you have any update on that please let us know.
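To make the second point above concrete, here is a schematic comparison of the two aggregations (deliberately not the paper's exact equations, only an illustration that "square root of sum" and "(1/n) × sum of square roots" produce flow-loss values of very different magnitude, so the same tau regularizes very differently):

```python
import numpy as np

rng = np.random.RandomState(0)
# Toy flow differences to one neighbor for a 28x28 flow field.
du = 0.1 * rng.randn(28, 28)   # horizontal component
dv = 0.1 * rng.randn(28, 28)   # vertical component
n = du.size

eq4_style = np.sum(np.sqrt(du ** 2 + dv ** 2))                  # square root of sum
eq5_style = np.sum(np.sqrt(du ** 2) + np.sqrt(dv ** 2)) / n     # sum of square roots, with 1/n

print(eq4_style, eq5_style, eq4_style / eq5_style)
```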
Thank you so much! I calculate the ASR the same way as you do (ruling out samples misclassified even with no perturbation). I decreased tau to 0.005, and model A's ASR improved to about 76%, which is still much smaller than the 99.95% or more that Xiao et al. claim, and their tau is 0.05. I emailed the authors about this problem several days ago and they have not replied yet. I will keep trying to contact them; if I have any update, I will let you know.
Ok, thank you! On a more general note, it makes sense to have a scaling according to the number of pixels in the image. Otherwise, if you keep the same tau and go from MNIST / CIFAR-10 to ImageNet, the flow loss will typically be 100 times larger (while the scale of the adversarial loss will not change).
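Back-of-the-envelope numbers behind the "100 times larger" (the ImageNet input size below, 299x299, is just a typical choice, not taken from the paper):

```python
mnist, cifar10, imagenet = 28 * 28, 32 * 32, 299 * 299
print(imagenet / mnist)    # ~114x more pixels than MNIST
print(imagenet / cifar10)  # ~87x more pixels than CIFAR-10
```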
I would like to offer my results so far:
- Targeting the second-likeliest label I get an ASR of ~40%. However, if I include the images where stAdv converged to the decision boundary the ASR goes as high as ~80%. If I target all labels individually and count the fraction of test images for which at least one targeted attack succeeds then the ASR increases to ~90% (100% including decision boundary). As you mentioned @berangerd, Xiao et al. don't define how they measure their ASR, which is rather unfortunate. (All results are for model A)
- As Xiao et al. write here, I tried using the Adam solver to speed up the runtime. By tuning the learning rate I get an ASR of ~96% (all beyond the decision boundary), which is better but not the 100% they claim. This raises the question of whether the parameters you chose for L-BFGS-B are optimal, @berangerd? For example, is there a particular reason why you defined `default_extra_kwargs` the way you did? Also, Xiao et al. talk about a learning rate for L-BFGS, which `scipy` doesn't provide.
@hhhzzj did you hear back from Xiao et al.?
Thanks for the feedback @anianruoss. For the L-BFGS-B algorithm, the wrapper in `stadv.optimization.lbfgs` is using the default parameters from `scipy.optimize.fmin_l_bfgs_b` except for:
- `factr=10` (SciPy's default is 1e7)
- `m=20` (SciPy's default is 10)

I have not made extensive tests with the parameters of `scipy.optimize.fmin_l_bfgs_b`. The idea was simply to go for high accuracy (while being a bit slow).
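For reference, this is what those two non-default settings look like in a direct `scipy.optimize.fmin_l_bfgs_b` call (toy quadratic objective, not the stAdv loss):

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def objective(x):
    # Returns (value, gradient), as expected when no separate fprime is given.
    value = np.sum((x - 1.0) ** 2)
    grad = 2.0 * (x - 1.0)
    return value, grad

x0 = np.zeros(4)
x_opt, f_opt, info = fmin_l_bfgs_b(
    objective, x0,
    factr=10,  # much stricter than SciPy's default of 1e7: high accuracy, slower
    m=20,      # more stored correction pairs than the default of 10
)
print(x_opt, info['warnflag'])
```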
Also, I didn't know that Xiao et al. had tried using Adam. I haven't tried comparing with the first-order optimizers already implemented in TensorFlow (but it is straightforward to do, and it will be much simpler and faster than using `stadv.optimization.lbfgs`, as everything will be defined in the graph directly)!
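A rough sketch of what the in-graph Adam variant could look like (TF1-style; `build_model` is a hypothetical classifier graph returning logits, and the `stadv.layers.flow_st` call, the flow tensor shape, and the exact loss signatures are assumptions to be checked against the demo notebook):

```python
import tensorflow as tf
import stadv

H = W = 28
images = tf.placeholder(tf.float32, [1, H, W, 1])
targets = tf.placeholder(tf.int64, [1])
tau = 0.05

# The flow field becomes a trainable variable instead of an L-BFGS parameter
# vector (shape convention [batch, 2, H, W] assumed).
flows = tf.Variable(tf.zeros([1, 2, H, W]), name='flows')

perturbed = stadv.layers.flow_st(images, flows, 'NHWC')   # assumed signature
logits = build_model(perturbed)                           # hypothetical classifier, outputs logits
loss = stadv.losses.adv_loss(logits, targets) + tau * stadv.losses.flow_loss(flows)

# Optimize only the flow variable; the classifier weights stay fixed.
train_op = tf.train.AdamOptimizer(learning_rate=1e-2).minimize(loss, var_list=[flows])
```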
Thank you! What setting of tau do you use on ImageNet?