barlowtwins
Will BarlowTwins overfit on the training data?
Hi BarlowTwins team, I have a question about the framework of this algorithm.
As the output dimension of the projection layers grows, the number of trainable parameters grows substantially. Does this lead to overfitting?
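To give a rough sense of the scale I am asking about, here is a minimal sketch that counts projector parameters for different widths. It assumes a 2048-dimensional backbone output, a three-layer projector with equal widths, bias-free linear layers, and batch norm after every layer except the last. The function name `projector_param_count` and these settings are my own assumptions for illustration, not taken from the repository.

```python
def projector_param_count(in_dim, layer_dims, use_bias=False, batch_norm=True):
    """Count trainable parameters in an MLP projector (hypothetical helper).

    Assumed layout: Linear -> BatchNorm -> ReLU for every layer except the
    last, which is a plain Linear layer.
    """
    dims = [in_dim] + list(layer_dims)
    total = 0
    for i in range(len(dims) - 1):
        # Weight matrix of the linear layer.
        total += dims[i] * dims[i + 1]
        if use_bias:
            total += dims[i + 1]
        is_last = i == len(dims) - 2
        if batch_norm and not is_last:
            # Batch norm learns one scale and one shift per feature.
            total += 2 * dims[i + 1]
    return total


if __name__ == "__main__":
    # Assumed backbone output dimension of 2048 and three equal-width layers.
    for width in (256, 1024, 4096, 8192):
        n = projector_param_count(2048, [width] * 3)
        print(f"width={width:>5}: {n / 1e6:.1f}M projector parameters")
```

Because the hidden-to-hidden weight matrices scale with the square of the width, the count grows much faster than the output dimension itself, which is what prompted the question.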
Also, in the paper, why are the transfer learning results not as good as those of supervised transfer learning? Could it be that BarlowTwins focuses too heavily on features of the training data?
I look forward to your reply. Thank you.