Distilling-Object-Detectors
Training the feature adaptation conv2d
Hi, I saw in the paper that, to match the student's feature map size to the teacher's, a conv2d is used to upscale the student's feature map. However, I am confused about whether this conv2d block is trained together with the student. If it is, I think the conv2d contributes nothing to the student, since it is not used when we take the student for prediction. If it is not, the conv2d may not extract the traits of the student's feature map well enough for the best KD. Can you explain this to me, please?
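To make the question concrete, here is a toy sketch of the first option as I understand it: the adaptation layer is trained jointly with the student and then dropped at inference. It is only an assumption about the setup, not code from the repo or the paper. A 1x1 convolution is written in pure Python as a per-pixel channel mix, and all names (`conv1x1`, `train_step`, `student_w`, `adapter_w`) and numbers are made up for illustration.

```python
import copy

def conv1x1(weight, feat):
    """1x1 convolution: mix channels independently at each pixel.

    weight: [out_ch][in_ch]; feat: list of pixels, each a list of in_ch values.
    """
    return [[sum(w * v for w, v in zip(row, px)) for row in weight] for px in feat]

def imitation_loss(adapted, teacher):
    """Mean squared error between adapted student features and teacher features."""
    n = len(adapted) * len(adapted[0])
    return sum((a - t) ** 2
               for pa, pt in zip(adapted, teacher)
               for a, t in zip(pa, pt)) / n

def train_step(x, teacher, student_w, adapter_w, lr=0.05, train_adapter=True):
    """One gradient-descent step; returns the loss measured before the update."""
    s = conv1x1(student_w, x)    # student features (2 channels)
    a = conv1x1(adapter_w, s)    # adapted features (3 channels, like the teacher)
    loss = imitation_loss(a, teacher)
    n = len(a) * len(a[0])
    grad_student = [[0.0] * len(student_w[0]) for _ in student_w]
    grad_adapter = [[0.0] * len(adapter_w[0]) for _ in adapter_w]
    for px, ps, pa, pt in zip(x, s, a, teacher):
        g = [2.0 * (ai - ti) / n for ai, ti in zip(pa, pt)]  # dLoss/d(adapted)
        for i, gi in enumerate(g):
            for j, sj in enumerate(ps):
                grad_adapter[i][j] += gi * sj
        # Back-propagate through the adapter into the student features.
        ds = [sum(g[i] * adapter_w[i][j] for i in range(len(g)))
              for j in range(len(ps))]
        for j, dsj in enumerate(ds):
            for k, xk in enumerate(px):
                grad_student[j][k] += dsj * xk
    for j, row in enumerate(student_w):
        for k in range(len(row)):
            row[k] -= lr * grad_student[j][k]
    if train_adapter:  # set to False to model a frozen adaptation layer
        for i, row in enumerate(adapter_w):
            for j in range(len(row)):
                row[j] -= lr * grad_adapter[i][j]
    return loss

# Made-up data: 4 "pixels" with 2 input channels; the teacher has 3 channels.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
teacher = conv1x1([[1.0, 0.5], [-0.5, 1.0], [0.3, -0.7]], x)
student_w = [[0.1, -0.2], [0.3, 0.1]]               # student "backbone"
adapter_w = [[0.2, 0.1], [-0.1, 0.3], [0.1, 0.1]]   # adaptation layer, 2 -> 3 channels
initial_student_w = copy.deepcopy(student_w)

losses = [train_step(x, teacher, student_w, adapter_w) for _ in range(300)]

# Inference: the adapter is discarded and only the student weights are used.
student_features = conv1x1(student_w, x)
```

In this toy version the student's weights do receive gradients through the adapter, even though the adapter itself is thrown away afterwards. What I would like to confirm is whether the real training works the same way.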