Mesopotamia


They are used to balance the losses, and FGD is not very sensitive to them. You can adjust the hyper-parameters according to the loss scale.

At the smallest, scaling the distillation loss so that it is similar to the original classification or regression loss may be fine, as in the sketch below.
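
As a rough illustration, here is a minimal PyTorch sketch of weighting the distillation term so it lands on the same order of magnitude as the task loss. The names (`task_loss`, `distill_loss`, `alpha`) and values are placeholders, not FGD's actual variables; the real weights live in the repo's config files.

```python
import torch

# Placeholder losses; in FGD these would come from the detector head
# and the feature-distillation module respectively.
task_loss = torch.tensor(1.2)      # classification + regression loss
distill_loss = torch.tensor(30.0)  # raw distillation loss, much larger

# Choose alpha so the scaled distillation term is comparable to the
# task loss, roughly alpha ~ task_loss / distill_loss.
alpha = 0.04
total_loss = task_loss + alpha * distill_loss
print(float(total_loss))  # ~2.4: the two terms now contribute similarly
```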

Yes, 0.1-2 may be fine too.

You can keep the ratio between the different losses and scale them together to balance the loss. You may also check whether the teacher loads its weights successfully.
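
As a quick check, here is a minimal sketch of one way to verify that a teacher checkpoint actually loads, assuming a plain PyTorch model. The function name is hypothetical, and the `'state_dict'` nesting is an assumption based on the usual mmdetection checkpoint layout.

```python
import torch
from torch import nn

def check_teacher_loaded(teacher: nn.Module, ckpt_path: str) -> None:
    """Load a checkpoint and report any keys that failed to match."""
    ckpt = torch.load(ckpt_path, map_location='cpu')
    # mmdetection-style checkpoints usually nest weights under 'state_dict'.
    state_dict = ckpt.get('state_dict', ckpt)
    # strict=False reports mismatches instead of raising; two empty lists
    # mean every checkpoint weight found its parameter.
    missing, unexpected = teacher.load_state_dict(state_dict, strict=False)
    print('missing keys:', missing)
    print('unexpected keys:', unexpected)
```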

There might be something wrong in your distillation environment. You can test your environment on COCO to compare the results.

> I mean the distillation config file from the yolox-l to yolox-m

Please refer to the README. It is in the yolox branch.

Could you share your log? We haven't tried such a setting yet.

It seems the gradient explodes. Maybe the gap between the teacher and the student is too large. If the loss stays too large, you can try decreasing grad_norm to avoid the explosion.
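
For reference, a minimal, self-contained sketch of gradient-norm clipping in plain PyTorch; the model and the `max_norm` value are illustrative stand-ins, not FGD's settings.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)  # stand-in for the student detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

loss = model(torch.randn(8, 4)).pow(2).mean()  # dummy loss
loss.backward()

# Rescale gradients so their total L2 norm is at most max_norm;
# lowering max_norm clamps harder when the loss (and grads) blow up.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
optimizer.step()
optimizer.zero_grad()
```

In an mmdetection-based config this is typically controlled through something like `optimizer_config = dict(grad_clip=dict(max_norm=10, norm_type=2))`.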

Fine, I don't know the reason either. However, RepPoints can be trained with DCN using the [config](https://github.com/yzd-v/FGD/blob/master/configs/distillers/fgd/fgd_reppoints_rx101_64x4d_distill_reppoints_r50_fpn_2x_coco.py).

The original task loss when training with distillation should be much smaller than the original loss when training the student directly.