fastembed-js
Different embedding vector results than with the Python version
Hi!
First, thank you for your work. However, I'm getting different vector results when embedding the exact same query with the exact same model (and the same max_length, defaulting to 512 on both) than when using the FastEmbed Python version. Is this intended? Am I doing something wrong?
For example, for the query hello using fast-multilingual-e5-large (EmbeddingModel.MLE5Large):
fastembed-js (Node.js):
import { EmbeddingModel, FlagEmbedding } from "fastembed";
...
this._embedder = await FlagEmbedding.init({
  model: EmbeddingModel.MLE5Large,
  maxLength: 512,
  cacheDir: "./local_cache",
  showDownloadProgress: false,
});
const results = await this._embedder.queryEmbed("hello");
// results
{
  "0": 0.04752354323863983,
  "1": -0.0058815134689211845,
  "2": 0.0006205342360772192,
  "3": -0.03057778812944889,
  "4": 0.038045115768909454,
  "5": -0.02664419822394848,
  "6": -0.018038146197795868,
  "7": 0.03404559567570686,
  ...
}
(Also, not related, but queryEmbed() doesn't return a number[] as the TS types say; it returns a structure like the one above, so I use Object.values() to get a number[].)
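For reference, a minimal sketch of that workaround (the helper name is mine; it just assumes the runtime value is an index-keyed object like the dump above):

import { EmbeddingModel, FlagEmbedding } from "fastembed";

// Flatten the object-like runtime value into a plain number[].
// queryEmbed() is typed as returning number[], but the value I get back looks
// like { "0": 0.047..., "1": -0.005..., ... }, so Object.values() recovers the numbers.
function toNumberArray(raw: unknown): number[] {
  return Array.isArray(raw) ? raw : Object.values(raw as Record<string, number>);
}

const embedder = await FlagEmbedding.init({ model: EmbeddingModel.MLE5Large });
const embedding: number[] = toNumberArray(await embedder.queryEmbed("hello"));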
fastembed (Python):
from langchain_community.embeddings import FastEmbedEmbeddings
...
embedding_function = FastEmbedEmbeddings(
    model_name="intfloat/multilingual-e5-large",
    cache_dir=".cache",
)
results = embedding_function.embed_query("hello")
# results
[
0.6659826636314392,
-0.13676010817289352,
-0.41865870729088783,
-1.7572288811206818,
0.4101845696568489,
-1.075161725282669,
-0.4985295459628105,
0.9173809438943863,
...
]
This results in lower scores when searching via .search() with QdrantClient (@qdrant/js-client-rest). I'm also using a micro-service that adds documents to the Qdrant vector database and uses embed_query() on the Python side. Output is currently more accurate when the query is embedded on the Python side (since it uses the same function), but I'd like to be free of the Python side and embed user queries on the JS side; unfortunately, right now the embedded vectors differ for the exact same query.
export class QdrantService extends QdrantClient implements OnModuleInit {
  constructor() {
    super({
      url: process.env.QDRANT_CLUSTER_URL,
      apiKey: process.env.QDRANT_API_KEY,
    });
  }
  ...
  async searchQdrant(...) {
    ...
    return await this.search(collectionName, {
      vector: {
        name: vectorName,
        vector: embedding,
      },
      limit: limit,
      filter: filter,
      with_payload: true,
      with_vector: false,
    });
  }
}
(Screenshot: Python output on the left, JS output on the right.)
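One way to check whether the two outputs differ only by a scale factor (e.g. normalization) or in direction as well is their cosine similarity. A minimal diagnostic sketch, assuming jsVec and pyVec are filled with the full vectors from the dumps above (only the first two components are shown here):

// Cosine similarity between the JS and Python vectors for the same query.
// A value near 1.0 would mean the vectors differ only by scale (normalization);
// anything noticeably lower means the directions differ as well.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const jsVec = [0.04752354323863983, -0.0058815134689211845 /* , ... full vector */];
const pyVec = [0.6659826636314392, -0.13676010817289352 /* , ... full vector */];
console.log("cosine similarity:", cosineSimilarity(jsVec, pyVec));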
Hey @rklf. Can you try with BAAI/bge-base-en-v1.5?
@Anush008 Just tried it now; it happens too.
For the query hello:
Python: https://pastebin.com/raw/WHirXXpU
embedding_function = FastEmbedEmbeddings(
    model_name="BAAI/bge-base-en-v1.5",
    cache_dir=".cache",
)
JS: https://pastebin.com/raw/Ui7kiKkU
this._embedder = await FlagEmbedding.init({
  model: EmbeddingModel.BGEBaseENV15,
  maxLength: 512,
  cacheDir: "./local_cache",
  showDownloadProgress: false,
});
await this._embedder.queryEmbed(text);
That's unexpected. We do have tests with canonical values, the same as the Python implementation:
PYTHON: https://github.com/qdrant/fastembed/blob/b785640bd5081d83f76b1aee633ef2ae5bdd8f3c/tests/test_text_onnx_embeddings.py#L22-L24
NODE: https://github.com/Anush008/fastembed-js/blob/e4e2fdcaa41a60b337a3c7cab514e334f2f718cb/tests/fastembed_bgebase_v15.test.ts#L94
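Those tests are essentially canonical-value checks; a rough TypeScript sketch of their shape (the actual expected numbers, query text, and tolerance live in the files linked above):

// Sketch of a canonical-value check (assumed shape; see the linked test files
// for the real expected values and tolerance).
import { EmbeddingModel, FlagEmbedding } from "fastembed";

const expectedFirstValues: number[] = [/* canonical values from the linked test file */];

const embedder = await FlagEmbedding.init({ model: EmbeddingModel.BGEBaseENV15 });
const embedding = await embedder.queryEmbed("hello world"); // query text is an assumption
expectedFirstValues.forEach((expected, i) => {
  if (Math.abs(embedding[i] - expected) > 1e-3) {
    throw new Error(`Mismatch at index ${i}: got ${embedding[i]}, expected ${expected}`);
  }
});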
There might be an issue then. Can you try reproducing it on your end?
@Anush008 Did you get time to look into this?
I am OOO currently.