Teacher model for inference

Open kimkj38 opened this issue 9 months ago • 1 comments

Hi. I have a question about inference. As I understand it, the teacher model is more robust than the student model because of the EMA update, and it is used for supervision (pseudo-labels). Then why do you use the student model for inference? What would be the problem with using the teacher model for inference?

kimkj38, May 03 '24 08:05

Hi @kimkj38,

I did not evaluate the teacher performance for MIC. However, in previous UDA studies, we found that the student and teacher have similar performance at the end of training. The EMA teacher is particularly important at the beginning of training, when the network is learning quickly and its predictions change rapidly, because it provides temporally stable pseudo-labels for stable self-training. Once training converges and the learning rate has decayed, there is less instability, and the student and teacher converge to similar predictions.
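To illustrate the argument, here is a minimal toy sketch in plain Python. It is not the MIC training code: the quadratic loss, the linear learning-rate decay, `alpha=0.99`, and the names `ema_update` and `simulate` are all assumptions chosen for illustration. It only shows the general EMA rule (teacher = alpha * teacher + (1 - alpha) * student) and how the gap between student and teacher tends to shrink once the student stops changing quickly.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def simulate(steps=2000, alpha=0.99, lr0=0.1):
    """Toy run (hypothetical setup, not MIC's).

    The student does gradient descent on f(w) = 0.5 * w**2 with a linearly
    decayed learning rate; the teacher follows it via EMA.
    Returns the list of |student - teacher| gaps, one per step.
    """
    student = [5.0]
    teacher = list(student)  # teacher starts as a copy of the student
    gaps = []
    for step in range(steps):
        lr = lr0 * (1.0 - step / steps)  # linear learning-rate decay
        grad = student[0]                # gradient of 0.5 * w**2 is w
        student = [student[0] - lr * grad]
        teacher = ema_update(teacher, student, alpha)
        gaps.append(abs(student[0] - teacher[0]))
    return gaps


if __name__ == "__main__":
    gaps = simulate()
    print("largest gap in first 100 steps:", max(gaps[:100]))
    print("gap at the last step:", gaps[-1])
```

In this toy setting the teacher lags the student noticeably while the student is moving fast, and the two become nearly identical by the end of the run, which matches the intuition above. Whether the same holds for a particular MIC checkpoint would have to be checked by evaluating both models.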

Best, Lukas

lhoyer, Jun 09 '24 18:06