unilm
Can I use L2 distance to compare two embeddings created by e5-base-v2?
Describe: I am using the e5-base-v2 model. I have read the documentation at https://huggingface.co/intfloat/e5-base-v2, which says the cosine similarity scores are distributed in the range 0.7 to 1.0.
How I use the e5-base-v2 model:

- Get 2 embeddings from e5-base-v2
- Use `torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)` to normalize the embeddings
- Compare the 2 embeddings using L2 distance
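The steps above can be sketched as follows. The embeddings here are random placeholders standing in for real e5-base-v2 outputs (actually producing them requires tokenization with the model card's `"query: "`/`"passage: "` prefixes and average pooling, which is omitted for brevity):

```python
import torch
import torch.nn.functional as F

# Placeholder tensors standing in for two e5-base-v2 sentence embeddings
# (the real model produces 768-dimensional vectors).
torch.manual_seed(0)
sentence_embeddings = torch.randn(2, 768)

# Step 2: L2-normalize each embedding to unit length
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)

# Step 3: compare with L2 distance
l2_distance = torch.dist(sentence_embeddings[0], sentence_embeddings[1], p=2)

# On unit vectors, L2 distance and cosine similarity are linked by
# ||a - b||^2 = 2 - 2 * cos(a, b), so ranking by either is equivalent.
cosine = torch.dot(sentence_embeddings[0], sentence_embeddings[1])
print(l2_distance.item(), cosine.item())
```

Because of that identity, sorting neighbors by smallest L2 distance on normalized embeddings gives the same ordering as sorting by largest cosine similarity.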
My questions:

- Is this the right way? Can I use L2 distance to compare the two embeddings created by e5-base-v2?
- If we use cosine similarity, do I need to normalize the embeddings first?
- If the score range of e5-base-v2 overall is [0.7, 1.0], is there a suitable sub-range for texts that are only relatively similar?
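As a worked example for the threshold question: the documented scores are cosine similarities, so assuming unit-normalized embeddings, the range [0.7, 1.0] maps to L2 distances via ||a − b|| = √(2 − 2·cos):

```python
import math

def cos_to_l2(cos_sim: float) -> float:
    """Map a cosine similarity to an L2 distance, valid for unit-norm vectors."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * cos_sim))

print(cos_to_l2(1.0))  # 0.0, identical embeddings
print(cos_to_l2(0.7))  # ~0.7746, the lower end of the documented score range
```

So a cosine threshold chosen anywhere in [0.7, 1.0] has a direct L2 equivalent in [0, √0.6], and tuning one is equivalent to tuning the other.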
@intfloat Can you help me?