SimMIM
Performance using the cosine distance
Hi @caoyue10, thanks for your insightful work.
The experiments and discussion in your paper state that different distance types (e.g. l2, l1) perform equally well when used to compute the loss. Does this also hold for cosine distance?
Cosine distance has been prevalent in previous contrastive learning (CL) work, and it involves an l2-normalization, so I think an experiment with it could be helpful. Could you shed some light on this?
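To make the question concrete, here is a minimal sketch of how a cosine-distance loss could sit next to an l1 loss on the masked patches. This is not the SimMIM code: the function names, the `(N, L, D)` patch layout, and the mask convention (1 = masked patch) are assumptions made for illustration only.

```python
import numpy as np


def masked_l1_loss(pred, target, mask):
    """Mean absolute error, averaged over masked patches only.

    pred, target: arrays of shape (N, L, D), assumed layout of
                  N images, L patches per image, D values per patch.
    mask:         array of shape (N, L), 1 for masked patches, 0 otherwise.
    """
    per_patch = np.abs(pred - target).mean(axis=-1)  # shape (N, L)
    return (per_patch * mask).sum() / max(mask.sum(), 1e-8)


def masked_cosine_loss(pred, target, mask, eps=1e-8):
    """Cosine distance (1 - cosine similarity), averaged over masked patches.

    Each patch vector is l2-normalized first, so the loss ignores the
    magnitude of the prediction and only compares directions.
    """
    pred_n = pred / (np.linalg.norm(pred, axis=-1, keepdims=True) + eps)
    target_n = target / (np.linalg.norm(target, axis=-1, keepdims=True) + eps)
    per_patch = 1.0 - (pred_n * target_n).sum(axis=-1)  # shape (N, L)
    return (per_patch * mask).sum() / max(mask.sum(), eps)


# Hypothetical toy inputs, only to show the call pattern.
rng = np.random.default_rng(0)
pred = rng.normal(size=(2, 4, 8))
target = rng.normal(size=(2, 4, 8))
mask = np.array([[1.0, 0.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0, 0.0]])

print("l1 loss:    ", masked_l1_loss(pred, target, mask))
print("cosine loss:", masked_cosine_loss(pred, target, mask))
```

One difference worth noting: because of the l2-normalization, the cosine variant is invariant to rescaling each predicted patch, whereas l1 and l2 penalize magnitude errors directly. That is why I am unsure whether the "all distances perform equally well" observation carries over.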
Best.