fastembed-js

Different embedding vectors results than with Python version

Open rklf opened this issue 7 months ago • 6 comments

Hi! First, thank you for your work. However, I'm getting different vector results when embedding the exact same query with the exact same model (and the same max_length, which defaults to 512 on both) than when using the FastEmbed Python version. Is this intended? Am I doing something wrong?

For example, for the query hello using fast-multilingual-e5-large (EmbeddingModel.MLE5Large):

fastembed-js (Node.js):

import { EmbeddingModel, FlagEmbedding } from "fastembed";
...
this._embedder = await FlagEmbedding.init({
  model: EmbeddingModel.MLE5Large,
  maxLength: 512,
  cacheDir: "./local_cache",
  showDownloadProgress: false,
});
results = await this.embedder.queryEmbed("hello");

// results
{
    "0": 0.04752354323863983,
    "1": -0.0058815134689211845,
    "2": 0.0006205342360772192,
    "3": -0.03057778812944889,
    "4": 0.038045115768909454,
    "5": -0.02664419822394848,
    "6": -0.018038146197795868,
    "7": 0.03404559567570686,
    ...
}

(Unrelated, but queryEmbed() doesn't return number[] as the TS types say; it returns a structure like the one above, so I use Object.values() to get a number[].)
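The Object.values() workaround mentioned above can be made a little more defensive. The sketch below is only an illustration: `toNumberArray` is a hypothetical helper, not part of fastembed, and it assumes the returned value is either array-like (for example a typed array such as Float32Array, which serializes to JSON as an index-keyed object) or a plain object keyed by index.

```typescript
// Hypothetical helper (not part of fastembed): normalize an embedding to number[].
// Assumption: the value is array-like (plain array, Float32Array, ...) or an
// index-keyed object such as {"0": 0.04, "1": -0.005, ...}.
function toNumberArray(
  embedding: ArrayLike<number> | Record<string, number>
): number[] {
  // Plain arrays and typed arrays both expose a numeric length.
  if (typeof (embedding as ArrayLike<number>).length === "number") {
    return Array.from(embedding as ArrayLike<number>);
  }
  // Otherwise read the values in numeric key order.
  const record = embedding as Record<string, number>;
  return Object.keys(record)
    .sort((a, b) => Number(a) - Number(b))
    .map((key) => Number(record[key]));
}
```

With a helper like this, the result can be passed to code that expects a number[] regardless of which of the two shapes queryEmbed() actually produces.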

fastembed (Python):

from langchain_community.embeddings import FastEmbedEmbeddings
...
embedding_function = FastEmbedEmbeddings(
  model_name="intfloat/multilingual-e5-large",
  cache_dir=".cache",
)
results = embedding_function.embed_query("hello")

# results
[
  0.6659826636314392,
  -0.13676010817289352,
  -0.41865870729088783,
  -1.7572288811206818,
  0.4101845696568489,
  -1.075161725282669,
  -0.4985295459628105,
  0.9173809438943863,
  ...
]

This results in a lower score when searching through .search() using QdrantClient (@qdrant/js-client-rest). Also, I'm using a microservice that adds documents to the Qdrant vector database, and it uses embed_query() on the Python side. Results are currently more accurate when the query is embedded on the Python side (since it uses the same function), but I'd like to drop the Python dependency and embed user queries on the JS side. Unfortunately, right now the embedded vectors differ for the exact same query.

export class QdrantService extends QdrantClient implements OnModuleInit {
  constructor() {
    super({
      url: process.env.QDRANT_CLUSTER_URL,
      apiKey: process.env.QDRANT_API_KEY,
    });
  }
  ...

  async searchQdrant(...) {
    ...
    return await this.search(collectionName, {
      vector: {
        name: vectorName,
        vector: embedding,
      },
      limit: limit,
      filter: filter,
      with_payload: true,
      with_vector: false,
    })
  }

[Image: Python left | JS right]
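Judging only by the eight components listed above, the two outputs do not look like scaled copies of each other: the ratio of the first components is roughly 14, while the ratio of the second components is roughly 23. A scale-invariant comparison over the full vectors would confirm this. The following is a minimal sketch of such a check; `l2Norm` and `cosineSimilarity` are illustrative helper names, not fastembed or Qdrant APIs.

```typescript
// Illustrative helpers (assumed names) for comparing two embeddings.

// Euclidean length of a vector; about 1 if the vector is L2-normalized.
function l2Norm(v: ArrayLike<number>): number {
  let sum = 0;
  for (let i = 0; i < v.length; i++) {
    sum += v[i] * v[i];
  }
  return Math.sqrt(sum);
}

// Cosine similarity ignores scale: a value near 1 means the vectors
// point in the same direction even if their magnitudes differ.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
  }
  const denom = l2Norm(a) * l2Norm(b);
  if (denom === 0) {
    throw new Error("cannot compare a zero vector");
  }
  return dot / denom;
}
```

If the cosine similarity between the full Python and JS vectors were close to 1, the difference would come down to normalization only; a clearly lower value would point at something earlier in the pipeline.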

rklf avatar Apr 30 '25 15:04 rklf

Hey @rklf. Can you try with BAAI/bge-base-en-v1.5?

Anush008 avatar Apr 30 '25 16:04 Anush008

@Anush008 Just tried now, it happens too.

For the query hello:

Python: https://pastebin.com/raw/WHirXXpU

embedding_function = FastEmbedEmbeddings(
  model_name="BAAI/bge-base-en-v1.5",
  cache_dir=".cache",
)

JS: https://pastebin.com/raw/Ui7kiKkU

this._embedder = await FlagEmbedding.init({
  model: EmbeddingModel.BGEBaseENV15,
  maxLength: 512,
  cacheDir: "./local_cache",
  showDownloadProgress: false,
});
await this.embedder.queryEmbed(text)

rklf avatar Apr 30 '25 16:04 rklf

That's unexpected. We do have tests that check against the same canonical values as the Python implementation.

PYTHON: https://github.com/qdrant/fastembed/blob/b785640bd5081d83f76b1aee633ef2ae5bdd8f3c/tests/test_text_onnx_embeddings.py#L22-L24

NODE: https://github.com/Anush008/fastembed-js/blob/e4e2fdcaa41a60b337a3c7cab514e334f2f718cb/tests/fastembed_bgebase_v15.test.ts#L94
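A canonical-value check of this kind can be sketched as follows. This is not the code from either linked test file; `allClose` is a hypothetical helper, and the default tolerance of 1e-3 is an assumption chosen for illustration.

```typescript
// Hypothetical helper: compare the leading components of an embedding
// against a short list of expected (canonical) values.
// Assumption: an absolute tolerance of 1e-3 is acceptable.
function allClose(
  actual: ArrayLike<number>,
  expected: ArrayLike<number>,
  atol: number = 1e-3
): boolean {
  // The embedding must be at least as long as the reference slice.
  if (actual.length < expected.length) {
    return false;
  }
  for (let i = 0; i < expected.length; i++) {
    if (Math.abs(actual[i] - expected[i]) > atol) {
      return false;
    }
  }
  return true;
}
```

Running the same check with the same reference values against both the Python and the JS output would show which side deviates from the canonical values.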

Anush008 avatar Apr 30 '25 16:04 Anush008

There might be an issue then. Can you try it on your end?

rklf avatar Apr 30 '25 16:04 rklf

@Anush008 Did you get time to look into this?

rklf avatar May 03 '25 23:05 rklf

I am OOO currently.

Anush008 avatar May 04 '25 02:05 Anush008