fastembed
"sentence-transformers/all-MiniLM-L6-v2" - incorrect embeddings and rather slow speedup.
I wrote a small unit test. Your models seem to have a couple of issues:
- Embeddings are inconsistent with Sentence Transformers (the resulting vectors differ) - possibly a wrong model conversion?
- No ONNX GPU support
sentence_transformers=2.22
fastembed=0.5.0
torch=2.0.0
import timeit

import numpy as np
from sentence_transformers import SentenceTransformer
from fastembed.embedding import FlagEmbedding

model_name_or_path = "sentence-transformers/all-MiniLM-L6-v2"
model_fast = FlagEmbedding(model_name_or_path)
model_st = SentenceTransformer(model_name_or_path)

# 64 synthetic sentences of increasing length.
sample_sentence = [f"{list(range(i))} " for i in range(64)]

got = np.stack(list(model_fast.embed(sample_sentence)))
want = model_st.encode(sample_sentence, normalize_embeddings=True)

# FAILS here: Mismatched elements: 24384 / 24576 (99.2%)
np.testing.assert_almost_equal(got, want)

# 2.0177175840362906 vs 2.4251126241870224
print(
    timeit.timeit(lambda: list(model_fast.embed(sample_sentence)), number=10), "vs",
    timeit.timeit(lambda: model_st.encode(sample_sentence, normalize_embeddings=True), number=10))
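Side note on the comparison itself: `assert_almost_equal` without a `decimal` argument is strict for float32 model outputs, so an explicit tolerance states the intent more clearly. A minimal numpy-only sketch with toy arrays standing in for the two embedding matrices (not the real model outputs):

```python
import numpy as np

# Toy stand-ins for the two embedding matrices (assumed shapes, not real model outputs).
rng = np.random.default_rng(0)
want = rng.normal(size=(4, 8)).astype(np.float32)
got = want + rng.normal(scale=1e-6, size=want.shape).astype(np.float32)

# assert_allclose makes the accepted tolerance explicit instead of relying
# on assert_almost_equal's default of 7 decimal places.
np.testing.assert_allclose(got, want, atol=1e-4, rtol=0)
print("arrays match within atol=1e-4")
```

With the genuinely mismatched MiniLM outputs above, even a loose tolerance like this would still fail, which is the point of the report.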
Thanks for flagging this. Will investigate.
My first guess is that these are two different models, since FlagEmbedding is different from Sentence Transformers.
fastembed=0.5.0
Please try v0.1.1.
On my system the code above still fails with v0.1.1. Have you tried the above code?
@NirantK For models, I use "sentence-transformers/all-MiniLM-L6-v2" on both sides.
@NirantK
sentence-transformers=2.22 fastembed=0.1.1
sentence = ["This is a test sentence."]
arrays are not almost equal to 1 decimals
Mismatched elements: 2 / 384 (0.521%)
Max absolute difference: 0.81547204
Max relative difference: 2334.82220783
x: array([ 1.4e-02, -1.9e-02, 6.3e-03, 3.0e-02, 1.8e-02, -1.5e-02,
-8.6e-03, 1.3e-02, 1.1e-02, -4.0e-03, -6.7e-04, 7.2e-03,
5.4e-03, 1.2e-02, 1.5e-03, -4.8e-03, 1.8e-02, -1.6e-02,...
y: array([ 8.4e-02, 5.8e-02, 4.5e-03, 1.1e-01, 7.1e-03, -1.8e-02,
-1.7e-02, -1.5e-02, 4.0e-02, 3.3e-02, 1.0e-01, -4.7e-02,
6.9e-03, 4.1e-02, 1.9e-02, -4.1e-02, 2.4e-02, -5.7e-02,...
FYI: for "BAAI/bge-base-en" I get a cosine similarity of ~0.999. For "sentence-transformers/all-MiniLM-L6-v2" it's around 0.223.
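A per-row cosine similarity like the one quoted above can be computed with plain numpy; here with toy vectors standing in for the two models' outputs:

```python
import numpy as np

def rowwise_cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between corresponding rows of a and b."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a_n * b_n, axis=1)

# Toy example: identical rows give 1.0, an orthogonal pair gives 0.0.
a = np.array([[1.0, 0.0], [1.0, 1.0]])
b = np.array([[1.0, 0.0], [1.0, -1.0]])
print(rowwise_cosine(a, b))  # → [1. 0.]
```

A mean row-wise cosine of ~0.22 between two encoders that should be identical is far outside float32 rounding noise, which supports the wrong-conversion hypothesis.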
Hey!
I haven't done a thorough analysis, but I've also had some quirky results with the fastembed embeddings. Some similarity scores don't make much sense at all, so I suspect the embeddings are incorrect.
Hey, I can confirm that the sentence-transformers quantization isn't perfect. The cosine similarity is lower than we'd like. Retrieval performance doesn't degrade too much in a small test that I ran, but yes, this is an important issue.
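One way to quantify "retrieval degradation" independently of absolute cosine values is top-k neighbour overlap between the two embedding sets. A numpy-only sketch with random stand-in embeddings (hypothetical shapes and data, not the test mentioned above):

```python
import numpy as np

def topk_overlap(corpus_a, corpus_b, queries_a, queries_b, k=5):
    """Mean fraction of shared top-k neighbours when the same corpus and
    queries are embedded by two different models (a and b)."""
    # Normalize rows so that a dot product equals cosine similarity.
    ca = corpus_a / np.linalg.norm(corpus_a, axis=1, keepdims=True)
    cb = corpus_b / np.linalg.norm(corpus_b, axis=1, keepdims=True)
    qa = queries_a / np.linalg.norm(queries_a, axis=1, keepdims=True)
    qb = queries_b / np.linalg.norm(queries_b, axis=1, keepdims=True)
    overlaps = []
    for va, vb in zip(qa, qb):
        top_a = set(np.argsort(ca @ va)[-k:])  # indices of k nearest docs, model a
        top_b = set(np.argsort(cb @ vb)[-k:])  # indices of k nearest docs, model b
        overlaps.append(len(top_a & top_b) / k)
    return float(np.mean(overlaps))

rng = np.random.default_rng(0)
corpus = rng.normal(size=(100, 16))
queries = rng.normal(size=(10, 16))
# Identical embeddings on both sides -> perfect overlap of 1.0.
print(topk_overlap(corpus, corpus, queries, queries))  # → 1.0
```

Rankings can survive moderate embedding drift because they only depend on the relative order of similarities, which would explain mismatched vectors alongside mostly intact retrieval.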
Thanks for flagging this.
Is this still present? I want to use fastembed in my Docker containers, but I'm not sure that's feasible with these mismatches.
This fix should resolve the issue.