
How to achieve high target accuracy given DeepJDOT limitations

Open offchan42 opened this issue 4 years ago • 2 comments

I want to improve the accuracy (or reduce the loss) on the target dataset, so I have a few questions about what might affect it.

  1. If I want to increase target accuracy, should I train with a more varied source domain? E.g. to increase accuracy on MNIST (for SVHN to MNIST adaptation), should I augment SVHN with variations such as a grayscale version of SVHN, differently colored SVHN, etc.?
  2. Which source and target dataset pairs would achieve high target accuracy? An educated guess is fine if you haven't experimented with this.
  3. Which source and target dataset pairs would achieve low target accuracy? An educated guess is fine if you haven't experimented with this.

offchan42 avatar Jul 06 '19 13:07 offchan42

DeepJDOT might require proper initialization of the target model. If you initialize the weights of the target model with those of the source model, DeepJDOT works pretty well in almost all cases.

bbdamodaran avatar Jul 16 '19 06:07 bbdamodaran

I always initialize the target model with the source model's weights, and before training I check the target model's accuracy to confirm that it matches the source model.
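For anyone reading along, here is a minimal sketch of that check. `TinyModel` is a hypothetical stand-in written for this example, not DeepJDOT code; it only mimics the `get_weights` / `set_weights` / `predict` methods that Keras models expose, so the same pattern should carry over to a real model under that assumption.

```python
import copy
import math


class TinyModel:
    """Hypothetical stand-in for a real network: one dense layer with sigmoid outputs."""

    def __init__(self, weights, biases):
        self.weights = weights  # one row of input weights per output unit
        self.biases = biases    # one bias per output unit

    def get_weights(self):
        # Return a copy so the caller cannot mutate this model by accident.
        return copy.deepcopy([self.weights, self.biases])

    def set_weights(self, params):
        self.weights, self.biases = copy.deepcopy(params)

    def predict(self, batch):
        # Sigmoid of the affine map, computed per sample and per output unit.
        return [
            [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, sample)) + b)))
             for row, b in zip(self.weights, self.biases)]
            for sample in batch
        ]


def max_abs_prediction_gap(model_a, model_b, batch):
    """Largest absolute difference between the two models' outputs on a batch."""
    preds_a = model_a.predict(batch)
    preds_b = model_b.predict(batch)
    return max(abs(a - b) for ra, rb in zip(preds_a, preds_b) for a, b in zip(ra, rb))


# Made-up weights and inputs, for illustration only.
source_model = TinyModel([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1])
target_model = TinyModel([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
batch = [[1.0, 2.0], [0.5, -1.0]]

gap_before = max_abs_prediction_gap(source_model, target_model, batch)

# Initialize the target model from the source model, then re-check.
target_model.set_weights(source_model.get_weights())
gap_after = max_abs_prediction_gap(source_model, target_model, batch)
```

After the copy, `gap_after` should be zero; a nonzero gap would mean the two models do not share the same starting point.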

I'm training DeepJDOT on my own dataset and I find it quite difficult to ensure that DeepJDOT will reduce the error. Sometimes the error just increases after a few hundred iterations. Maybe the regression problem is more difficult than classification? This also sometimes happens with the rotated SVHN->MNIST dataset. My source is a face image in normal lighting. My target is a face image with a flashlight shining from under the face. My outputs are 50 sigmoid units (how open an eye is, how open the mouth is, etc.).

I've also noticed that if I set the activation of the feature extraction layer to ReLU instead of sigmoid, DeepJDOT makes the target error increase instead of decrease. Is this expected? Sigmoid trains quite slowly, so I wanted to change it.

  1. How do I know when to stop training (so that DeepJDOT doesn't overfit) if I don't have target labels? The loss that DeepJDOT reports doesn't seem to be correlated with, or proportional to, the mean absolute error, and the error seems to increase after some number of iterations.
  2. Does DeepJDOT require a lot of target training data? What is the rule of thumb for this?
  3. Should the feature extraction layer always be the last layer before the final output layer? How would increasing or decreasing the number of units (128) affect the outcome? Can I use the output layer as the feature extraction layer?
  4. How do batch size and learning rate affect the outcome? Do you think they are critical?
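To make the first point concrete, this is roughly how the mismatch between the loss and the mean absolute error can be quantified. It is a sketch only: it assumes a small labeled held-out set exists for diagnosis (which the truly unsupervised case does not have), and the logged values below are made-up placeholders, not real results.

```python
import math


def mean_abs_error(predictions, labels):
    """Mean absolute error averaged over all samples and all output units."""
    total = 0.0
    count = 0
    for pred_row, label_row in zip(predictions, labels):
        for p, y in zip(pred_row, label_row):
            total += abs(p - y)
            count += 1
    return total / count


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values logged at five checkpoints (placeholders, not measurements):
# the training loss keeps falling while the held-out error falls and then rises.
logged_loss = [2.0, 1.6, 1.3, 1.1, 1.0]
logged_mae = [0.20, 0.17, 0.16, 0.18, 0.21]

r = pearson(logged_loss, logged_mae)
```

A value of `r` near 1 would mean the loss is a usable proxy for the error; a value near 0 or below matches what I'm seeing, where the loss alone says little about when to stop.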

If I don't have the target labels (the truly unsupervised case), training with DeepJDOT can be quite scary and unreliable, because I won't know whether the error will increase or decrease. That's why I want to improve my chances of getting it right.

Thank you for your help! This is quite critical for my work.

offchan42 avatar Jul 16 '19 10:07 offchan42