FGD
Focal and Global Knowledge Distillation for Detectors (CVPR 2022)
I wrote down an implementation of FGD for ultralytics in this Medium post. Even though I could not surpass the student's performance, this method is really impressive to me. Thanks. If...
Hello author, thank you for open-sourcing this work.

In the code, area is computed as:

area = 1.0//(hmax[i].view(1,-1)+1-hmin[i].view(1,-1))//(wmax[i].view(1,-1)+1-wmin[i].view(1,-1))

Here, (hmax[i].view(1,-1)+1-hmin[i].view(1,-1)) and (wmax[i].view(1,-1)+1-wmin[i].view(1,-1)) are the height and width of the GT box on the feature map. Since both are greater than 1 and the code uses floor division (//), area should theoretically always be 0. I printed the value of area in fgd.py and it shows:

area: tensor([[0., 0., 0., 0., 0.]], device='cuda:1')

Then, based on the following code:

Mask_fg[i][hmin[i][j]:hmax[i][j]+1, wmin[i][j]:wmax[i][j]+1] = \
    torch.max(Mask_fg[i][hmin[i][j]:hmax[i][j] + 1, wmin[i][j]:wmax[i][j] + 1], area[0][j].float())

it follows that Mask_fg is also all zeros, and by Eq. (9) in the paper, Lfea should theoretically be 0.

Does this mean that focal distillation actually does not participate in training?
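The arithmetic the issue describes can be checked with plain Python. This is a minimal sketch with hypothetical GT sizes (the variable names h and w are illustrative, not taken from fgd.py): floor division (//) truncates 1.0 divided by any height or width greater than 1 down to 0, whereas true division (/) gives the reciprocal-area weight the formula presumably intends.

```python
# Hypothetical GT box size on the feature map (example values).
h = 7  # height: hmax + 1 - hmin
w = 5  # width:  wmax + 1 - wmin

# Floor division, as in the reported line of code: 1.0 // 7 == 0.0,
# and 0.0 // 5 == 0.0, so the weight collapses to zero.
area_buggy = 1.0 // h // w

# True division, the apparently intended computation: 1 / (h * w).
area_fixed = 1.0 / h / w

print(area_buggy)  # 0.0  -> Mask_fg would stay all zeros
print(area_fixed)  # 0.02857142857142857 (= 1/35)
```

The same truncation happens element-wise on PyTorch tensors, which is consistent with the all-zero tensor printed above.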