What pre-trained models do you use to retrieve negative embeddings?
Hello, thank you for releasing your code. When I try to reproduce your results, I only get 78.6 for RoBERTa_large. I noticed that you do not mention which pre-trained model you use to retrieve the negative embeddings. I used the unsupervised-SimCSE-RoBERTa; other settings are unchanged. Could you please tell me which pre-trained model you used?
Sorry for the late reply.
We used a pretrained RoBERTa-large to compute the embeddings of all samples, which were then used to compute the similarities for negative sampling. This step was done BEFORE using SimCSE to continue training the RoBERTa-large, so the embeddings and similarities remained static throughout training.
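A minimal sketch of that precomputation step, assuming a Hugging Face checkpoint and [CLS] pooling (the exact pooling and sampling strategy in the released code may differ):

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Vanilla pretrained RoBERTa-large, not the SimCSE checkpoint.
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModel.from_pretrained("roberta-large").eval()

@torch.no_grad()
def embed(sentences, batch_size=64):
    """Encode sentences with the frozen RoBERTa-large ([CLS] pooling assumed)."""
    chunks = []
    for i in range(0, len(sentences), batch_size):
        batch = tokenizer(sentences[i:i + batch_size], padding=True,
                          truncation=True, return_tensors="pt")
        chunks.append(model(**batch).last_hidden_state[:, 0])  # [CLS] embedding
    return torch.cat(chunks)

# Precompute once, BEFORE SimCSE training starts; these stay fixed afterwards.
sentences = ["example sentence one.", "example sentence two."]
embeddings = torch.nn.functional.normalize(embed(sentences), dim=-1)
similarities = embeddings @ embeddings.T  # static similarity matrix for negative sampling
```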