knowledge-distillation-keras

correction for soft targets loss may be needed

Open arsenyinfo opened this issue 6 years ago • 1 comments

Quoting the original paper (section 2):

Since the magnitudes of the gradients produced by the soft targets scale as 1/T² it is important to multiply them by T² when using both hard and soft targets.

It looks like this correction is not included in your knowledge_distillation_loss.
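
For context, here is a short sketch of where the 1/T² factor comes from, following the derivation in the distillation paper as I understand it. The symbols are assumptions of this sketch: $z_i$ are the student logits, $v_i$ the teacher logits, $q_i$ and $p_i$ the corresponding softened probabilities at temperature $T$, $N$ the number of classes, and $C$ the soft-target cross-entropy.

$$
\frac{\partial C}{\partial z_i} = \frac{1}{T}\,(q_i - p_i)
= \frac{1}{T}\left(\frac{e^{z_i/T}}{\sum_j e^{z_j/T}} - \frac{e^{v_i/T}}{\sum_j e^{v_j/T}}\right)
$$

When $T$ is large compared with the logits, and the logits are assumed to be zero-mean for each example, a first-order expansion of the exponentials gives approximately

$$
\frac{\partial C}{\partial z_i} \approx \frac{1}{N T^2}\,(z_i - v_i)
$$

so the soft-target gradient shrinks roughly as 1/T², while the hard-target gradient does not depend on $T$. Multiplying the soft term by $T^2$ keeps the relative weight of the two terms roughly stable when the temperature changes.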

arsenyinfo Jun 11 '18 12:06

@arsenyinfo Do you know where the correction is needed?

sub1120 May 22 '21 10:05
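
As a rough illustration of where the correction would go, here is a minimal pure-Python sketch of a combined hard/soft loss with the T² factor applied to the soft-target term. This is not the repository's `knowledge_distillation_loss`; the function names, the `alpha` weighting, and the default values are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))

def distillation_loss(y_true, student_logits, teacher_logits,
                      temperature=5.0, alpha=0.1):
    """Hypothetical combined loss; `alpha` weights the hard-target term."""
    # Hard-target term: ordinary cross-entropy at temperature 1.
    hard = cross_entropy(y_true, softmax(student_logits))
    # Soft-target term: both distributions softened with the same temperature.
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # The correction from the quoted passage: scale the soft term by T^2 so
    # its gradients stay comparable to the hard term's as T changes.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft

# Example call with a one-hot label and three classes.
loss = distillation_loss([1.0, 0.0, 0.0], [1.0, 0.0, -1.0], [2.0, 0.0, -2.0])
```

In a Keras loss the same idea would amount to multiplying the soft-target log-loss by `temperature ** 2` before adding it to the hard-target term, assuming the loss is structured as a sum of those two terms.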