How are target domain class labels handled?
Hey!
Thanks a lot for posting the code! I have one thing that I'm trying to understand. During training, we show the model a mix of both source domain data that have class labels, and target domain data that don't have them. How are missing class labels for target domain data handled? Do they have some distinct value (like -1)?
Hi @moganesyan! It's been a while since I worked on this, but if I remember correctly the label doesn't matter, since the loss for those examples in the classifier branch does not contribute to the total loss (it's multiplied by an all-zero vector). In my case I did have the labels for the target domain, which was how the performance was evaluated in the end; but I "hid" them from the model in that way. If you don't have the labels, and you are not interested in the performance on the target domain (although how else are you gonna evaluate the model?), you can set them to whatever you want, as long as you mask them in the same way.
Hi @michetonu
Thanks for the response and explanation. I have one last question. Where in the code is the 'masking' of the loss function done for target domain samples? I'm referring to this bit that you mentioned: 'the loss for those examples in the classifier branch does not contribute to the total loss (it's multiplied by an all-zero vector)'.
I've looked at the code in this repo and could not spot anything that matches the above description. The loss function seems to be the standard categorical cross entropy with no modifications.
I would really appreciate it if you could point me in the right direction.
Once again, thanks for the help!
@moganesyan you are right, it's not in there! It must have gotten lost while I was merging a few files and cleaning up the code. You can assign the loss weights of the target domain observations in the same way that the time-decaying weights are assigned: just set the weights to zero for all the target domain observations. Don't forget to keep your batches 50/50 (or at least not too unbalanced).
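For anyone else landing here, a minimal NumPy sketch of the idea (this is not the repo's code, just an illustration): compute categorical cross-entropy with a per-sample weight, and give target-domain samples a weight of zero so their dummy labels never affect the classifier loss. The batch layout (first half source, second half target) is an assumption matching the 50/50 batches mentioned above; in Keras the same effect can be achieved by passing such a vector as `sample_weight` to `fit` / `train_on_batch`.

```python
import numpy as np

def masked_cross_entropy(y_true, y_pred, sample_weight):
    """Categorical cross-entropy where each sample's loss is scaled by its weight.

    Samples with weight 0 (unlabeled target-domain data) contribute nothing,
    so whatever dummy label they carry is irrelevant.
    """
    eps = 1e-7  # avoid log(0)
    per_sample = -np.sum(y_true * np.log(np.clip(y_pred, eps, 1.0)), axis=1)
    return np.sum(per_sample * sample_weight) / np.sum(sample_weight)

# Toy batch: 2 source samples (real labels) + 2 target samples (dummy labels).
y_true = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.5, 0.5]])

# Zero weight for the target-domain half of the batch.
weights = np.array([1.0, 1.0, 0.0, 0.0])
loss = masked_cross_entropy(y_true, y_pred, weights)

# Flipping the target samples' dummy labels leaves the loss unchanged.
y_true_flipped = y_true.copy()
y_true_flipped[2:] = y_true[2:][:, ::-1]
same_loss = masked_cross_entropy(y_true_flipped, y_pred, weights)
```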