Teacher-free-Knowledge-Distillation Question about the loss function of Tf-reg KD

Open HowieMa opened this issue 3 years ago • 1 comments

Hi, thank you for sharing such an awesome project. For the TF-reg KD loss, in line 47 of my_loss_function.py, should we also divide the `outputs` variable by the temperature T, like this: `loss_soft_regu = nn.KLDivLoss()(F.log_softmax(outputs / T, dim=1), F.softmax(teacher_soft/T, dim=1))*params.multiplier`?

As in Eq (9) of your paper, the loss function is $$D_{KL}(p^d_\tau, p_\tau)$$.
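
To make the question concrete, here is a minimal pure-Python sketch (not the repository's code) that evaluates $D_{KL}$ for a single example with and without the temperature applied to the student side. The helper names `softmax_with_temperature` and `kl_divergence`, and all numeric values, are assumptions chosen only for illustration.

```python
import math

def softmax_with_temperature(scores, T=1.0):
    """Softmax of scores / T, stabilized by subtracting the maximum."""
    scaled = [s / T for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * (log p_i - log q_i) for one example."""
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical values, for illustration only.
T = 20.0
outputs = [2.0, 0.5, -1.0]            # student logits for one example
teacher_soft = [0.99, 0.005, 0.005]   # hand-set target scores

# Target distribution: softened with T in both variants.
target = softmax_with_temperature(teacher_soft, T)

# Variant asked about in this issue: student also softened with T.
loss_student_T = kl_divergence(target, softmax_with_temperature(outputs, T))

# Variant without the division on the student side (T = 1).
loss_student_no_T = kl_divergence(target, softmax_with_temperature(outputs, 1.0))

print(loss_student_T, loss_student_no_T)
```

The two values differ only in whether the student distribution is softened. Note that this sketch sums over classes for one example, so its scale will not match `nn.KLDivLoss()` with its default reduction.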

I would really appreciate it if you could help me. Look forward to your reply, thanks!

Mar 07 '21 23:03 HowieMa

I am also very confused about this issue and look forward to the author's answer. Issue #19 may answer your question.

Apr 07 '22 07:04 DLoveS1314
