[Docs] Reproducing the Distillation Process in RTMW
📚 The doc issue
In Section 3.1.3 of the RTMW paper, the authors (@Tau-J) mention:
“We adopted the two-stage distillation technique used by DWPose during the model training process to further enhance the model’s performance.”
However, after reviewing both the RTMW and DWPose configuration files, I still can't see how this distillation is implemented for RTMW. Specifically, I couldn't find a training config that corresponds to the two-stage distillation.
If I want to reproduce RTMW's results through my own training, should I first train a teacher model using the RTMW configs, and then modify the DWPose distillation configs (which currently target RTMPose) to train the student model?
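To make the question concrete, below is a rough sketch of the kind of stage-1 (teacher → student) config I imagine, modelled on how DWPose wires up MMRazor's `SingleTeacherDistill` / `ConfigurableDistiller`. All file paths, the teacher checkpoint, the recorder sources, and the loss choices/weights are placeholders I made up to show the structure; they are not the actual DWPose or RTMW settings.

```python
# Sketch only, NOT an official config. Stage-1 distillation for RTMW,
# following the DWPose/MMRazor SingleTeacherDistill pattern.
_base_ = ['mmpose::wholebody_2d_keypoint/rtmpose/cocktail14/'
          'rtmw-m_8xb1024-270e_cocktail14-256x192.py']  # student config (assumed path)

# teacher trained beforehand with the plain RTMW config (placeholder path)
teacher_ckpt = 'work_dirs/rtmw-x_teacher/best.pth'

model = dict(
    _scope_='mmrazor',
    _delete_=True,
    type='SingleTeacherDistill',
    architecture=dict(
        cfg_path='mmpose::wholebody_2d_keypoint/rtmpose/cocktail14/'
                 'rtmw-m_8xb1024-270e_cocktail14-256x192.py',  # student (assumed path)
        pretrained=False),
    teacher=dict(
        cfg_path='mmpose::wholebody_2d_keypoint/rtmpose/cocktail14/'
                 'rtmw-x_8xb320-270e_cocktail14-384x288.py',  # teacher (assumed path)
        pretrained=False),
    teacher_ckpt=teacher_ckpt,
    distiller=dict(
        type='ConfigurableDistiller',
        # record intermediate features and head outputs of both models
        student_recorders=dict(
            feat=dict(type='ModuleOutputs', source='neck'),    # placeholder module name
            logits=dict(type='ModuleOutputs', source='head')),
        teacher_recorders=dict(
            feat=dict(type='ModuleOutputs', source='neck'),
            logits=dict(type='ModuleOutputs', source='head')),
        # DWPose combines a feature loss with a logit loss; the loss modules
        # and weights here are generic stand-ins, not the ones DWPose uses
        distill_losses=dict(
            loss_feat=dict(type='PKDLoss', loss_weight=1.0),
            loss_logit=dict(type='KLDivergence', tau=1.0, loss_weight=1.0)),
        loss_forward_mappings=dict(
            loss_feat=dict(
                preds_S=dict(from_student=True, recorder='feat'),
                preds_T=dict(from_student=False, recorder='feat')),
            loss_logit=dict(
                preds_S=dict(from_student=True, recorder='logits'),
                preds_T=dict(from_student=False, recorder='logits')))))

find_unused_parameters = True
```

Is this roughly the right shape, or does RTMW's distillation (including the second, self-distillation stage) require something different from the DWPose setup?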
I also checked the RTMW-related README files but didn’t find further documentation on this process. If there are any additional instructions or references for implementing the distillation training pipeline, I would really appreciate it.
Suggest a potential alternative/fix
I'm happy to contribute the documentation and distillation configs once I've figured out the exact steps needed to reproduce the RTMW training and distillation pipeline.