
"sentence-transformers/all-MiniLM-L6-v2" - incorrect embeddings and rather slow speedup.

Open michaelfeil opened this issue 2 years ago • 7 comments

I wrote a small unit test. Your models seem to have a couple of issues:

  • The produced embeddings are inconsistent with Sentence Transformers (the vectors differ substantially) - possibly a wrong model conversion?
  • No onnx-gpu support (a quick provider check is sketched after the repro code below)
sentence_transformers=2.22
fastembed=0.5.0
torch=2.0.0
import timeit

import numpy as np
from sentence_transformers import SentenceTransformer
from fastembed.embedding import FlagEmbedding

model_name_or_path = "sentence-transformers/all-MiniLM-L6-v2"

model_fast = FlagEmbedding(model_name_or_path)
model_st = SentenceTransformer(model_name_or_path)

# 64 synthetic sentences of increasing length: "[] ", "[0] ", "[0, 1] ", ...
sample_sentence = [f"{list(range(i))} " for i in range(64)]

got = np.stack(list(model_fast.embed(sample_sentence)))
want = model_st.encode(sample_sentence, normalize_embeddings=True)

# FAILS here: Mismatched elements: 24384 / 24576 (99.2%)
np.testing.assert_almost_equal(got, want)

# Timing over 10 runs each: 2.0177175840362906 vs 2.4251126241870224
print(
    timeit.timeit(lambda: list(model_fast.embed(sample_sentence)), number=10), "vs",
    timeit.timeit(lambda: model_st.encode(sample_sentence, normalize_embeddings=True), number=10),
)
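
On the "No onnx-gpu" point: one way to see which execution providers the installed ONNX Runtime actually exposes (not part of the original repro, just a sanity check) is:

import onnxruntime as ort

# With the CPU-only onnxruntime wheel this prints ['CPUExecutionProvider'];
# the onnxruntime-gpu wheel additionally lists 'CUDAExecutionProvider'.
print(ort.get_available_providers())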

michaelfeil avatar Oct 12 '23 21:10 michaelfeil

Thanks for flagging this. Will investigate.

My first guess is that these might be two different models, since FlagEmbedding is not the same as Sentence Transformers.

NirantK avatar Oct 16 '23 12:10 NirantK

fastembed=0.5.0

Please try v0.1.1.
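
To verify which fastembed release is actually installed before re-running the repro, a quick stdlib check (not from the original thread) is:

from importlib.metadata import version

# Prints the installed fastembed release, e.g. "0.1.1" after upgrading.
print(version("fastembed"))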

generall avatar Oct 19 '23 15:10 generall

On my system, the code above still fails with v0.1.1. Have you tried running it?

@NirantK For the model, I use "sentence-transformers/all-MiniLM-L6-v2" on both sides.

michaelfeil avatar Oct 22 '23 11:10 michaelfeil

@NirantK

sentence-transformers=2.22 fastembed=0.1.1

sentence = ["This is a test sentence."]


arrays are not almost equal to 1 decimals

Mismatched elements: 2 / 384 (0.521%)
Max absolute difference: 0.81547204
Max relative difference: 2334.82220783
 x: array([ 1.4e-02, -1.9e-02,  6.3e-03,  3.0e-02,  1.8e-02, -1.5e-02,
       -8.6e-03,  1.3e-02,  1.1e-02, -4.0e-03, -6.7e-04,  7.2e-03,
        5.4e-03,  1.2e-02,  1.5e-03, -4.8e-03,  1.8e-02, -1.6e-02,...
 y: array([ 8.4e-02,  5.8e-02,  4.5e-03,  1.1e-01,  7.1e-03, -1.8e-02,
       -1.7e-02, -1.5e-02,  4.0e-02,  3.3e-02,  1.0e-01, -4.7e-02,
        6.9e-03,  4.1e-02,  1.9e-02, -4.1e-02,  2.4e-02, -5.7e-02,...

michaelfeil avatar Oct 31 '23 13:10 michaelfeil

FYI for "BAAI/bge-base-en" i get a cosine_sim of ~0.999. For "sentence-transformers/all-MiniLM-L6-v2" its around 0.223

michaelfeil avatar Oct 31 '23 14:10 michaelfeil

Hey!

I haven't done a thorough analysis, but I've also had some really quirky results with the fastembed embeddings. Some similarity scores don't make much sense at all, so I suspect the embeddings are incorrect.

nleroy917 avatar Nov 13 '23 21:11 nleroy917

Hey, I can confirm that the sentence-transformers quantization isn't perfect. The cosine similarity is lower than we'd like. Retrieval performance didn't degrade much in a small test that I ran, but yes, this is an important issue.

Thanks for flagging this.
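
One rough way to quantify that kind of retrieval impact (not from this thread; topk_overlap is a hypothetical helper) is to compare the nearest-neighbour lists the two embedding backends produce on the same corpus:

import numpy as np

def topk_overlap(a: np.ndarray, b: np.ndarray, k: int = 10) -> float:
    """Mean overlap of top-k neighbour sets for two (n, d) L2-normalized embedding matrices."""
    sim_a, sim_b = a @ a.T, b @ b.T
    np.fill_diagonal(sim_a, -np.inf)  # ignore self-matches
    np.fill_diagonal(sim_b, -np.inf)
    top_a = np.argsort(-sim_a, axis=1)[:, :k]  # k nearest neighbours per row
    top_b = np.argsort(-sim_b, axis=1)[:, :k]
    return float(np.mean([len(set(ra) & set(rb)) / k for ra, rb in zip(top_a, top_b)]))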

NirantK avatar Nov 15 '23 08:11 NirantK

I'm curious whether this is still present. I want to use fastembed in my Docker containers, but I'm not sure that's feasible with these mismatches.

nleroy917 avatar May 27 '24 17:05 nleroy917

This fix should resolve the issue.

I8dNLo avatar Jun 24 '24 13:06 I8dNLo