fixmatch
Why not compute consistency on the raw features or predictions directly?
Hi All,
Thanks for the nice work.
I have a question regarding the depiction in Figure 1. Why do you compute the consistency loss after sharpening the predictions? Why not minimize a form of KL divergence between the model features or the raw predictions instead? Did you observe that the sharpened form led to better training, or what was the rationale?
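To make the question concrete, here is a rough sketch of the two variants I have in mind. The function names, the temperature value, and the soft-target cross-entropy are my own assumptions for illustration, not necessarily what the paper or this repo implements:

```python
import torch
import torch.nn.functional as F

def sharpen(probs, temperature=0.5):
    # Temperature sharpening as I understand it from Figure 1 (temperature is my guess).
    p = probs ** (1.0 / temperature)
    return p / p.sum(dim=-1, keepdim=True)

def consistency_with_sharpening(logits_weak, logits_strong):
    # Variant I read from Figure 1: sharpen the weak-augmentation prediction,
    # then use it as a (detached) soft target for the strong-augmentation prediction.
    with torch.no_grad():
        target = sharpen(F.softmax(logits_weak, dim=-1))
    log_p_strong = F.log_softmax(logits_strong, dim=-1)
    return -(target * log_p_strong).sum(dim=-1).mean()

def consistency_raw_kl(logits_weak, logits_strong):
    # Alternative I am asking about: KL divergence between the raw
    # (unsharpened) predicted distributions.
    p_weak = F.softmax(logits_weak, dim=-1).detach()
    log_p_strong = F.log_softmax(logits_strong, dim=-1)
    return F.kl_div(log_p_strong, p_weak, reduction="batchmean")

# Example with dummy logits, just to show the shapes:
logits_weak = torch.randn(8, 10)
logits_strong = torch.randn(8, 10)
print(consistency_with_sharpening(logits_weak, logits_strong))
print(consistency_raw_kl(logits_weak, logits_strong))
```

In other words, was the extra sharpening step important in practice, or would the plain KL-style consistency above have worked comparably?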
Thanks!