MSDA
Question about Unsupervised Training
Hi there, I'm glad to open the first issue in this repo. I've read your code and have a question about the training process. The original paper, "Multiple Source Domain Adaptation with Adversarial Learning", describes the method as unsupervised, which means the target-domain data should not come with labels. However, in your repo, half of the combined batch is target-domain data, and you appear to train on both source and target data to compute the cross-entropy loss. Is there a mistake in the code, or did I misunderstand?
Hi, thank you for your question. It is indeed unsupervised domain adaptation, and I don't use target labels in training. See model/MDAN.py, lines 69-74:

```python
source_labels = lambda: tf.concat([
    tf.slice(self.y, [0, 0], [batch_size // 6, -1]),
    tf.slice(self.y, [batch_size // 3, 0], [batch_size // 6, -1]),
    tf.slice(self.y, [2 * batch_size // 3, 0], [batch_size // 6, -1])
], 0)
self.classify_labels = tf.cond(self.train, source_labels, all_labels)
```

I slice the label tensor so that only the source labels are extracted during the training phase.
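To make the slicing concrete, here is a small NumPy stand-in for the tf.slice logic above (the batch size, class count, and label values are illustrative assumptions, not taken from the repo). The idea is that the combined batch is split into three (source, target) pairs, and within each third the first half holds source examples, so the three slices together pick out exactly the source rows:

```python
import numpy as np

batch_size = 12   # assumed toy batch size, divisible by 6
num_classes = 4   # assumed toy label width
# Row i of y is the one-row label [i, i, i, i], so rows are easy to trace.
y = np.arange(batch_size).repeat(num_classes).reshape(batch_size, num_classes)

third, sixth = batch_size // 3, batch_size // 6
source_labels = np.concatenate([
    y[0:sixth],                       # source half of the 1st pair
    y[third:third + sixth],           # source half of the 2nd pair
    y[2 * third:2 * third + sixth],   # source half of the 3rd pair
], axis=0)

# Only these batch_size // 2 source rows feed the cross-entropy loss;
# the target halves of the batch never contribute a classification label.
print(source_labels[:, 0])  # rows 0, 1, 4, 5, 8, 9 of the original batch
```

The same indices apply to the feature tensor, which is why the target examples influence only the adversarial (domain) loss, not the classification loss.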
Got it, thanks. Another question: after several epochs of training, why is the domain accuracy still high? The d_loss and pred_loss only change by a small amount.
You're welcome. I personally think that is a very interesting and open question. One possible reason is this: the goal of the architecture is to make the target features similar to the features of each source, but sometimes that is just too hard for the model and the losses because of the differences between the sources and the target. For example, we want the features of mnistm (target) to be similar to those of mnist, svhn and synth (sources), but the three sources and the one target are all quite different from each other. The result is that, with only one encoder, the target features can end up similar to some of the sources, but not necessarily all of them.