TorchDistiller: Is it better to combine CWD loss with other losses than just CWD loss?

Open jialeli1 opened this issue 3 years ago • 0 comments

Hi!

The README describes how to train a model with channel-wise distillation (CWD), GAN loss, and pixel-wise distillation together.

Is combining CWD loss with other losses better than using CWD loss alone?

Dec 03 '21 06:12 jialeli1