Can this model find cosine similarity between two paragraphs?

Open • desis123 opened this issue 1 year ago • 1 comment

I was just wondering whether the model at https://huggingface.co/dennlinger/roberta-cls-consec can be used to compute cosine or dot-product similarity between two paragraphs of text, the way sentenceBert can compute cosine similarity between two sentences?

desis123, Nov 11 '22 13:11

Hi @desis123, By default, I would say it cannot. Our models were trained with a combined input setting (i.e., two paragraphs fed into the same forward pass, separated by a [SEP] token).
In comparison, late interaction models (or more generally, dual encoders) process one paragraph at a time rather than two. Therefore, I would argue that our model is not particularly suited to producing meaningful embeddings for a single paragraph.
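
For reference, here is a minimal sketch of the cosine similarity computation the question refers to, assuming each paragraph has already been encoded separately into a fixed-size vector by some dual encoder. The function name `cosine_similarity` and the vectors `emb_a`, `emb_b`, and `emb_c` are hypothetical placeholders for illustration only; they are not outputs of roberta-cls-consec or of any real model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical stand-ins for paragraph embeddings (assumption: in practice
# these would come from a dual encoder that embeds one paragraph at a time).
emb_a = [1.0, 2.0, 3.0]
emb_b = [2.0, 4.0, 6.0]   # points in the same direction as emb_a
emb_c = [-3.0, 0.0, 1.0]  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))
print(cosine_similarity(emb_a, emb_c))
```

This only works when each paragraph has its own vector. A model trained on combined inputs sees both paragraphs in one forward pass, so it has no separate per-paragraph embedding to plug into a function like this.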

Best, Dennis

dennlinger, Nov 11 '22 13:11