fastembed
"sentence-transformers/all-MiniLM-L6-v2" - incorrect embeddings and rather slow speedup.
I wrote a small unit test. Your models seem to have a couple of issues:
- Embeddings are inconsistent with Sentence Transformers (the resulting vectors differ) - possibly a wrong model conversion?
- No ONNX GPU support
sentence_transformers=2.22
fastembed=0.5.0
torch=2.0.0
import timeit

import numpy as np
from sentence_transformers import SentenceTransformer
from fastembed.embedding import FlagEmbedding

model_name_or_path = "sentence-transformers/all-MiniLM-L6-v2"
model_fast = FlagEmbedding(model_name_or_path)
model_st = SentenceTransformer(model_name_or_path)

# 64 synthetic sentences of increasing length.
sample_sentence = [f"{list(range(i))} " for i in range(64)]

got = np.stack(list(model_fast.embed(sample_sentence)))
want = model_st.encode(sample_sentence, normalize_embeddings=True)

# FAILS here: Mismatched elements: 24384 / 24576 (99.2%)
np.testing.assert_almost_equal(got, want)

# 2.0177175840362906 vs 2.4251126241870224
print(
    timeit.timeit(lambda: list(model_fast.embed(sample_sentence)), number=10), "vs",
    timeit.timeit(lambda: model_st.encode(sample_sentence, normalize_embeddings=True), number=10))
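Side note on the comparison itself: `assert_almost_equal` without a `decimal` argument is strict for float32 model outputs, so an explicit tolerance states the intent more clearly. A minimal numpy-only sketch with toy arrays standing in for the two embedding matrices (not the real model outputs):

```python
import numpy as np

# Toy stand-ins for the two embedding matrices (assumed shapes, not real model outputs).
rng = np.random.default_rng(0)
want = rng.normal(size=(4, 8)).astype(np.float32)
got = want + rng.normal(scale=1e-6, size=want.shape).astype(np.float32)

# assert_allclose makes the accepted tolerance explicit instead of relying
# on assert_almost_equal's default of 7 decimal places.
np.testing.assert_allclose(got, want, atol=1e-4, rtol=0)
print("arrays match within atol=1e-4")
```

With the genuinely mismatched MiniLM outputs above, even a loose tolerance like this would still fail, which is the point of the report.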
Thanks for flagging this. Will investigate.
My first guess is that these are two different models, since FlagEmbedding is different from Sentence Transformers.
fastembed=0.5.0
Please try v0.1.1.
On my system the code above still fails with v0.1.1. Have you tried the above code?
@NirantK For models, I use "sentence-transformers/all-MiniLM-L6-v2" on both sides.
@NirantK
sentence-transformers=2.22 fastembed=0.1.1
sentence = ["This is a test sentence."]
arrays are not almost equal to 1 decimals
Mismatched elements: 2 / 384 (0.521%)
Max absolute difference: 0.81547204
Max relative difference: 2334.82220783
x: array([ 1.4e-02, -1.9e-02, 6.3e-03, 3.0e-02, 1.8e-02, -1.5e-02,
-8.6e-03, 1.3e-02, 1.1e-02, -4.0e-03, -6.7e-04, 7.2e-03,
5.4e-03, 1.2e-02, 1.5e-03, -4.8e-03, 1.8e-02, -1.6e-02,...
y: array([ 8.4e-02, 5.8e-02, 4.5e-03, 1.1e-01, 7.1e-03, -1.8e-02,
-1.7e-02, -1.5e-02, 4.0e-02, 3.3e-02, 1.0e-01, -4.7e-02,
6.9e-03, 4.1e-02, 1.9e-02, -4.1e-02, 2.4e-02, -5.7e-02,...
FYI: for "BAAI/bge-base-en" I get a cosine similarity of ~0.999. For "sentence-transformers/all-MiniLM-L6-v2" it's around 0.223.
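A per-row cosine similarity like the one quoted above can be computed with plain numpy; here with toy vectors standing in for the two models' outputs:

```python
import numpy as np

def rowwise_cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between corresponding rows of a and b."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a_n * b_n, axis=1)

# Toy example: identical rows give 1.0, an orthogonal pair gives 0.0.
a = np.array([[1.0, 0.0], [1.0, 1.0]])
b = np.array([[1.0, 0.0], [1.0, -1.0]])
print(rowwise_cosine(a, b))  # → [1. 0.]
```

A mean row-wise cosine of ~0.22 between two encoders that should be identical is far outside float32 rounding noise, which supports the wrong-conversion hypothesis.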
Hey!
I haven't done a thorough analysis, but I've also had some quirky results with the fastembed embeddings. Some similarity scores don't make much sense at all, so I suspect the embeddings are incorrect.
Hey, I can confirm that the sentence-transformers quantization isn't perfect. The cosine similarity is lower than we'd like. Retrieval performance doesn't degrade too much in a small test that I ran, but yes, this is an important issue.
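One way to quantify "retrieval degradation" independently of absolute cosine values is top-k neighbour overlap between the two embedding sets. A numpy-only sketch with random stand-in embeddings (hypothetical shapes and data, not the test mentioned above):

```python
import numpy as np

def topk_overlap(corpus_a, corpus_b, queries_a, queries_b, k=5):
    """Mean fraction of shared top-k neighbours when the same corpus and
    queries are embedded by two different models (a and b)."""
    # Normalize rows so that a dot product equals cosine similarity.
    ca = corpus_a / np.linalg.norm(corpus_a, axis=1, keepdims=True)
    cb = corpus_b / np.linalg.norm(corpus_b, axis=1, keepdims=True)
    qa = queries_a / np.linalg.norm(queries_a, axis=1, keepdims=True)
    qb = queries_b / np.linalg.norm(queries_b, axis=1, keepdims=True)
    overlaps = []
    for va, vb in zip(qa, qb):
        top_a = set(np.argsort(ca @ va)[-k:])  # indices of k nearest docs, model a
        top_b = set(np.argsort(cb @ vb)[-k:])  # indices of k nearest docs, model b
        overlaps.append(len(top_a & top_b) / k)
    return float(np.mean(overlaps))

rng = np.random.default_rng(0)
corpus = rng.normal(size=(100, 16))
queries = rng.normal(size=(10, 16))
# Identical embeddings on both sides -> perfect overlap of 1.0.
print(topk_overlap(corpus, corpus, queries, queries))  # → 1.0
```

Rankings can survive moderate embedding drift because they only depend on the relative order of similarities, which would explain mismatched vectors alongside mostly intact retrieval.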
Thanks for flagging this.
Is this still present? I want to use fastembed in my Docker containers, but I'm not sure that's feasible with these mismatches.
This fix should resolve the issue.