
distillation loss does not decrease during training

Open by2101 opened this issue 6 years ago • 0 comments

Dear all,

I am training an ASR model with knowledge distillation in lingvo. However, the distillation_loss (the cross entropy between the teacher and student outputs) increases during training rather than decreasing. I am confused by this. Has anyone else observed this behavior? When I use PyTorch, the loss decreases.
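For reference, here is a minimal sketch of the quantity described above, assuming the distillation loss is the soft-target cross entropy between the teacher and student distributions for a single frame. The function names (`softmax`, `log_softmax`, `distillation_loss`, `entropy`) and the example logits are hypothetical and written in plain Python; this is not lingvo's or PyTorch's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    """Numerically stable log-softmax via the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def distillation_loss(teacher_logits, student_logits):
    """Cross entropy H(p, q) = -sum_k p_k * log q_k.

    p is the teacher distribution, q is the student distribution.
    """
    p = softmax(teacher_logits)
    log_q = log_softmax(student_logits)
    return -sum(pk * lqk for pk, lqk in zip(p, log_q))


def entropy(logits):
    """Entropy H(p) of the distribution defined by the logits."""
    p = softmax(logits)
    log_p = log_softmax(logits)
    return -sum(pk * lpk for pk, lpk in zip(p, log_p))


# Hypothetical logits for one frame with three output classes.
teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, teacher)
mismatched = distillation_loss(teacher, [-1.0, 0.5, 2.0])
print(matched, mismatched)
```

Under this definition the loss is smallest when the student matches the teacher, and that minimum equals the teacher's entropy rather than zero, so a student that drifts away from the teacher shows up as a rising value.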

by2101, Aug 19 '19 14:08