DefangChen
Thanks for your contribution! I will check these two papers later (perhaps next week).
I have skimmed through these two papers. Here are my brief comments: 1. I like the concept of weight-inherited distillation. The weights in a neural network should also embody another...
“You mean that the fact that FKL considers the head part as the prior while RKL considers the tail part as the prior in KD is widely known and easily accessible in standard machine...
I did not mean to challenge the value of your paper. I was just pointing out some facts (see the sketch below).
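For readers following the thread, here is a minimal numerical sketch of the forward-KL vs reverse-KL asymmetry being referred to, where FKL = KL(teacher‖student) and RKL = KL(student‖teacher). The distributions and class counts below are made-up illustrations, not taken from either paper under discussion.

```python
# Minimal sketch (illustrative numbers, not from either paper) of the
# forward-KL vs reverse-KL asymmetry in distillation.
import numpy as np

def kl(p, q, eps=1e-12):
    """Discrete KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Hypothetical bimodal teacher distribution over 4 classes:
# two "head" classes carry almost all the mass, two "tail" classes almost none.
teacher = np.array([0.49, 0.01, 0.01, 0.49])

# Two hypothetical students:
# - "mode-seeking": collapses onto one head class, near-zero mass elsewhere
# - "mass-covering": spreads probability so no teacher-supported class is starved
student_mode  = np.array([0.94, 0.02, 0.02, 0.02])
student_cover = np.array([0.30, 0.20, 0.20, 0.30])

for name, s in [("mode-seeking", student_mode), ("mass-covering", student_cover)]:
    # FKL = KL(teacher || student): each term is weighted by the teacher's
    # probability, so mismatches on the head dominate and dropping a teacher
    # mode is heavily penalized.
    fkl = kl(teacher, s)
    # RKL = KL(student || teacher): the log-ratio blows up when the student
    # keeps mass on classes the teacher assigns ~0 probability, i.e. it
    # polices the tail.
    rkl = kl(s, teacher)
    print(f"{name:13s}  FKL={fkl:.3f}  RKL={rkl:.3f}")

# Expected pattern: FKL favours the mass-covering student, while RKL favours
# the mode-seeking one -- the textbook zero-avoiding vs zero-forcing behaviour.
```

This only illustrates the standard asymmetry of the two KL directions; how exactly it plays out during KD training (e.g., which regions dominate the gradient at different stages) is the point the two papers discuss in detail.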