
Adversarial loss

Open Yuliang-Zou opened this issue 7 years ago • 4 comments

Thanks for sharing the source code!

But I have a couple of questions:

  1. In adda/tools/train_adda.py, lines 88-103: it seems you set the source domain label to 1 and the target domain label to 0? But here the source distribution is fixed while the target distribution is updated. Following the original GAN setup, the distribution being updated should get the zero label, and equation (9) in the ADDA paper likewise shows that the ground-truth label for the source should be 1.

  2. How do you separate the encoder and the classifier in the base model? In the paper the encoder seems to contain only the CNN, but in the code the whole network appears to be treated as the encoder. Where, then, is the classifier?

Thanks for your time. Looking forward to your response.

Yuliang-Zou avatar Apr 06 '17 02:04 Yuliang-Zou

Hi, thanks for sharing the source code, too!

I ran into the same problem that @Yuliang-Zou described above, so I slightly modified the code to separate the encoder and the classifier: in adda/tools/train_adda.py, lines 96-97, I changed the feature representation that is fed into the discriminator for adversarial training from 'fc4' to 'fc3'. With that change everything works except the SVHN->MNIST experiment. If anyone has gotten that one to work, could you share your parameter settings?
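For anyone wanting to visualize the split being discussed, here is a rough PyTorch sketch of separating the encoder (everything up to the 'fc3'-style 500-unit layer, whose output feeds the discriminator) from the classifier (the final 'fc4' layer). The layer shapes follow the LeNet variant used in the thread, but the module layout and names are my own illustration, not the repo's code:

```python
import torch
import torch.nn as nn

# encoder: conv stack + first fully connected layer ("fc3"-style features)
encoder = nn.Sequential(
    nn.Conv2d(1, 20, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
    nn.Conv2d(20, 50, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(50 * 4 * 4, 500), nn.ReLU(),   # 28x28 input -> 4x4x50 after convs
)

# classifier: the final layer ("fc4"), kept out of the adversarial game
classifier = nn.Linear(500, 10)

x = torch.randn(2, 1, 28, 28)        # dummy MNIST-sized batch
features = encoder(x)                # these are what would feed the discriminator
logits = classifier(features)
print(tuple(features.shape), tuple(logits.shape))  # (2, 500) (2, 10)
```

The point is that only `encoder` gets adapted adversarially; `classifier` is trained on source labels and then reused unchanged on target features.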

Thanks.

BoChengTsai avatar Jun 15 '17 06:06 BoChengTsai

I am trying to implement the same experiment (SVHN->MNIST) in PyTorch and I cannot get the target model to converge, perhaps because of the same issue that @BoChengTsai and @Yuliang-Zou mentioned. Any suggestions?

aabbas90 avatar Jul 24 '17 19:07 aabbas90

@Yuliang-Zou afaik it does not make any difference: if you expand the loss expression, the two labelings turn out to be equivalent.
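To spell out that equivalence: with a 2-way softmax discriminator, swapping the domain labels (source=1/target=0 vs. source=0/target=1) is the same as permuting the discriminator's two output columns, so the objective is unchanged up to relabeling the outputs. A minimal numpy check (the values are arbitrary, not from the repo):

```python
import numpy as np

def softmax_xent(logits, labels):
    # mean softmax cross-entropy for integer class labels
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 2))       # discriminator outputs, shape (n, 2)
y = rng.integers(0, 2, size=8)         # domain labels

# flipping the labels while permuting the logit columns picks out exactly
# the same log-probabilities, so the loss (and its gradients, up to the
# same column permutation) is identical
assert np.isclose(softmax_xent(logits, y), softmax_xent(logits[:, ::-1], 1 - y))
```

So the discriminator simply learns the permuted mapping, and the adversarial game is the same either way.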

It also took me some time to implement in PyTorch; here is a checklist I wrote in the process, in case it is useful to folks here:

+ downscale everything to (28, 28)
+ reduce SVHN to 1 channel
+ same batch size (128)
+ shuffle=True
+ Adam for everything during adaptation: lr 0.0002, betas (0.5, 0.999)
+ LeNet with the same shapes/kernels/strides, valid padding, 500 hidden units, ReLU
+ weight_decay 2.5e-5 on everything
+ Xavier initialization on everything
+ initialize target_embedding from source_embedding
+ adversarial classifier with shape [500, 500] + leaky ReLU with 0.1 slope
+ adversarial logits shape (n, 2) # upd: a binary loss also works
+ min cr_ent(y, logits) and min cr_ent(1 - y, logits); y_i in {0, 1}
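For concreteness, the last two checklist items could be sketched like this in PyTorch: a discriminator step with source=1/target=0, then a target-encoder step against the same discriminator with the labels inverted. This is a toy sketch under my own assumptions (random stand-in features, hypothetical `encoder_t`/`disc` modules), not @MInner's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# hypothetical target encoder and [500, 500] adversarial classifier
encoder_t = nn.Sequential(nn.Linear(784, 500), nn.ReLU())
disc = nn.Sequential(
    nn.Linear(500, 500), nn.LeakyReLU(0.1),
    nn.Linear(500, 500), nn.LeakyReLU(0.1),
    nn.Linear(500, 2),
)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4,
                         betas=(0.5, 0.999), weight_decay=2.5e-5)
opt_g = torch.optim.Adam(encoder_t.parameters(), lr=2e-4,
                         betas=(0.5, 0.999), weight_decay=2.5e-5)

feat_s = torch.randn(128, 500)   # stand-in for frozen source-encoder features
x_t = torch.randn(128, 784)      # stand-in target batch

# 1) discriminator step: min cr_ent(y, logits) with source -> 1, target -> 0
feat_t = encoder_t(x_t).detach()           # don't update the encoder here
logits = disc(torch.cat([feat_s, feat_t]))
y = torch.cat([torch.ones(128), torch.zeros(128)]).long()
d_loss = F.cross_entropy(logits, y)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2) target-encoder step: min cr_ent(1 - y, logits), i.e. label target as source
logits_t = disc(encoder_t(x_t))
g_loss = F.cross_entropy(logits_t, torch.ones(128).long())
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In a real run the source features come from the frozen source encoder and `encoder_t` is initialized from it, per the checklist above.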

MInner avatar Jul 31 '17 18:07 MInner

@MInner could you share your PyTorch code for SVHN->MNIST? I tried these hyperparameters but am unable to replicate the paper's accuracies.

sushant21 avatar May 30 '18 14:05 sushant21