CCT
Why use MSE instead of CE or KL divergence?
Hello, and thank you for your excellent work. I have a question about the unsupervised loss in Equation 2 of the paper: why is MSE (mean squared error) used to compute it, rather than CE (cross-entropy) or KL divergence? I look forward to your reply.
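To make the question concrete, here is a minimal NumPy sketch of the three candidates as I understand them, each measuring the disagreement between two softmax outputs. The function name `consistency_losses`, the toy shapes, and the choice to treat the main prediction as a fixed target are my own assumptions for illustration; they are not taken from the paper or from this repository's code.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def consistency_losses(main_logits, aux_logits, eps=1e-8):
    """Compare three consistency measures between two predictions.

    Assumption: the main prediction acts as a fixed target and the
    auxiliary prediction is the one being trained towards it.
    """
    target = softmax(main_logits)
    pred = softmax(aux_logits)

    # Mean squared error between the two probability vectors.
    mse = np.mean((pred - target) ** 2)
    # Cross-entropy of the auxiliary prediction against the soft target.
    ce = np.mean(-np.sum(target * np.log(pred + eps), axis=-1))
    # KL(target || pred); equals CE minus the entropy of the target.
    kl = np.mean(
        np.sum(target * (np.log(target + eps) - np.log(pred + eps)), axis=-1)
    )
    return {"mse": mse, "ce": ce, "kl": kl}


# Toy example: 4 samples, 3 classes, auxiliary logits slightly perturbed.
rng = np.random.default_rng(0)
main_logits = rng.normal(size=(4, 3))
aux_logits = main_logits + 0.1 * rng.normal(size=(4, 3))
losses = consistency_losses(main_logits, aux_logits)
print(losses)
```

One difference visible in this sketch is that MSE on probabilities is bounded, while CE and KL grow without bound as the auxiliary prediction assigns near-zero probability to a class the target favours. I would like to know whether that, or something else, motivated the choice.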