dwt-domain-adaptation
Code to replicate the paper results
Hi, first of all, nice work! I would like to obtain the source code to replicate the other experiments, such as MNIST <-> SVHN. Could you make it available? Thanks.
Hi, I am also struggling to replicate the results from the paper on the digit datasets, starting with the easiest one, the MNIST-USPS adaptation.
I think there are two things that may differ from the Office implementation in the repo:
- The network. The architecture is clearly detailed in the supplementary material, except for the location of the DWT layers. The paper states that BN layers should be replaced by DWT, but the network taken from the DANN paper doesn't contain such BN layers. I assumed the correct way is to add a DWT layer after each convolutional layer, like this: conv -> DWT -> relu -> pool. Is that correct?
- The transforms applied to the images. For the experiment using the plain entropy loss, I only resized USPS to 28x28 and normalized by (0.5, 0.5). To use the MEC loss, I applied a Gaussian blur and a random affine translation to the duplicate target, as I don't think a horizontal flip makes sense for digits (see the sketch below). Are these the transforms used for the experiments in the paper?
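For reference, this is roughly what my transform pipeline looks like. Only the resize to 28x28, the 0.5 normalization, the Gaussian blur and the translation come from the description above; the exact blur kernel size and translation range are values I picked myself.

```python
from torchvision import transforms

# Transforms for the plain entropy-loss runs: only the resize to 28x28 and
# the 0.5 normalization come from the description above.
base_transform = transforms.Compose([
    transforms.Resize((28, 28)),            # bring USPS up to MNIST size
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),   # single channel, mean/std = 0.5
])

# Perturbed duplicate of each target image for the MEC consensus loss:
# Gaussian blur + a small random translation, no horizontal flip for digits.
# Kernel size and translation range are placeholder values.
duplicate_transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.GaussianBlur(kernel_size=3),
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])
```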
Many thanks !
Hi, I will try to upload the code for MNIST->USPS by this weekend.
Thank you !
Hi @vcoyette ,
I have uploaded the code for USPS <-> MNIST. This implementation uses DWT with the entropy loss, but you can easily replace the entropy loss with the MEC loss by following the implementation for the Office-Home dataset.
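The entropy term itself is only a few lines. Here is a generic sketch of the standard target-entropy formulation (not necessarily line-for-line what the repo does):

```python
import torch.nn.functional as F

def entropy_loss(target_logits):
    """Mean Shannon entropy of the target predictions (standard formulation;
    not necessarily identical to the repo's implementation)."""
    log_p = F.log_softmax(target_logits, dim=1)
    return -(log_p.exp() * log_p).sum(dim=1).mean()

# Typical use: cross-entropy on source batches plus a weighted entropy term
# on target batches, with lam as the trade-off hyper-parameter.
# loss = F.cross_entropy(src_logits, src_labels) + lam * entropy_loss(tgt_logits)
```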
- Yes, you are right, that's the order of the operations. For the USPS <-> MNIST case we use DWT layers for the first two conv layers and then use BN-based alignment layers for the FC layers (see the sketch after this list). We may have missed reporting these details in the paper.
- Horizontal flip is not used for the digits experiments. The rest is the same.
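Roughly, the placement looks like this. This is just a sketch: the channel and unit sizes are placeholders rather than the exact ones from the supplementary material, and the DWT / BNAlign classes below only stand in for the actual whitening and alignment layers in the repo.

```python
import torch.nn as nn

class DWT(nn.BatchNorm2d):
    """Placeholder for the domain whitening transform; the real layer
    whitens activations per domain instead of only standardizing them."""
    pass

class BNAlign(nn.BatchNorm1d):
    """Placeholder for the BN-based alignment layer used after the FC layers."""
    pass

class DigitsNet(nn.Module):
    # Channel/unit sizes are illustrative, not the exact ones from the paper.
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5),   # conv -> DWT -> relu -> pool
            DWT(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 48, kernel_size=5),  # second conv block, also with DWT
            DWT(48),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(48 * 4 * 4, 100),        # 28x28 input -> 4x4 feature maps
            BNAlign(100),                      # BN-based alignment after the FC layer
            nn.ReLU(inplace=True),
            nn.Linear(100, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```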
Hi @roysubhankar,
That is indeed the BN alignment I was missing after the FC layers. Without it, the network rapidly gets stuck predicting the same class on the target dataset (e.g. always 4, or always 5); that's probably the network overfitting to minimize the entropy. I tried reducing the lambda on the entropy loss, which resolves the problem, but the results are not as good as with the BN alignment (~86-87%). I didn't really optimize the value of lambda, though.
Anyway, thanks for taking the time to answer, and congratulations, these are amazing results!