
Possibly a wrong loss function


Hello Takeru, thanks for your great work. I think there may be an error in your code, at line 46 in vat.py: "dist = L.kl_divergence_with_logit(logit_p, logit_m)". I think a negative sign should be added in front of the KL divergence, because here we want to maximize the distance in order to get the virtual adversarial direction. Am I right?

LaughingGo avatar May 06 '19 07:05 LaughingGo

Hi,

It's not wrong. The "positive" gradient of dist is the direction that maximizes dist (i.e. the KL divergence). I suspect you are confusing it with the gradient descent algorithm, in which we add the "negative" gradient to a variable.

takerum avatar May 08 '19 02:05 takerum
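For illustration, here is a minimal sketch of the sign convention being discussed. It is written in TF2 style rather than the repository's TF1 code; `model` is a placeholder classifier and `kl_divergence_with_logit` is an assumed stand-in for the helper referenced in vat.py, not the exact implementation.

```python
import tensorflow as tf

def kl_divergence_with_logit(logit_p, logit_q):
    # KL(p || q) averaged over the batch, with both distributions given as logits.
    p = tf.nn.softmax(logit_p)
    return tf.reduce_mean(tf.reduce_sum(
        p * (tf.nn.log_softmax(logit_p) - tf.nn.log_softmax(logit_q)), axis=1))

def ascend_divergence(model, x, d, xi=1e-6):
    # One gradient-ascent step on the divergence: moving ALONG the (positive)
    # gradient of `dist` with respect to `d` increases the KL divergence,
    # so no negative sign is needed on the loss itself.
    logit_p = tf.stop_gradient(model(x))        # predictions on the clean input
    with tf.GradientTape() as tape:
        tape.watch(d)
        logit_m = model(x + xi * d)             # predictions on the perturbed input
        dist = kl_divergence_with_logit(logit_p, logit_m)
    grad = tape.gradient(dist, d)
    return d + grad
```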

Hi, Takeru,

Thanks for your kind reply; I was indeed confusing it with the GD algorithm.

But I still have another question, as follows: according to the code, 'd' is randomly initialized first, the gradient 'grad' at the current 'd' is calculated, and then this gradient 'grad' is taken as 'r_vadv'. My point is that we should take 'd + grad' as 'r_vadv', because the sum of these two vectors is actually the adversarial direction against the current sample x. Do you think so?

Looking forward to your reply, thanks again!

LaughingGo avatar May 08 '19 08:05 LaughingGo

Right, that would be another option for estimating the adversarial perturbation and might improve the performance. The code implements the power iteration method, which we use to estimate the most vulnerable direction. See Section 3.3 in the paper: https://arxiv.org/pdf/1704.03976.pdf.

takerum avatar May 09 '19 05:05 takerum
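For reference, a rough sketch of that power-iteration estimate, under the same assumptions as the earlier snippet (TF2-style autodiff, a flat input of shape [batch, features], the same placeholder `kl_divergence_with_logit`, and illustrative values for epsilon, xi, and the iteration count):

```python
import tensorflow as tf

def generate_virtual_adversarial_perturbation(model, x, epsilon=8.0, xi=1e-6,
                                              num_power_iterations=1):
    # Power iteration (Sec. 3.3 of https://arxiv.org/pdf/1704.03976.pdf):
    # repeatedly replace d with the gradient of the divergence at x + xi * d.
    logit_p = tf.stop_gradient(model(x))
    d = tf.random.normal(tf.shape(x))            # random initial direction
    for _ in range(num_power_iterations):
        d = xi * tf.math.l2_normalize(d, axis=1)
        with tf.GradientTape() as tape:
            tape.watch(d)
            dist = kl_divergence_with_logit(logit_p, model(x + d))
        grad = tape.gradient(dist, d)
        # The gradient itself (not d + grad) becomes the next estimate of the
        # dominant eigenvector of the divergence's Hessian.
        d = tf.stop_gradient(grad)
    return epsilon * tf.math.l2_normalize(d, axis=1)
```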