
Model Request: intfloat/multilingual-e5-small

Open bm777 opened this issue 1 year ago • 3 comments

What happened?

A bug happened!

Traceback (most recent call last):
  File "~/space/garbage/_fastembed.py", line 8, in <module>
    embedding_model = TextEmbedding(model_name="intfloat/multilingual-e5-base")
  File "~/Library/Python/3.9/lib/python/site-packages/fastembed/text/text_embedding.py", line 77, in __init__
    raise ValueError(
ValueError: Model intfloat/multilingual-e5-base is not supported in TextEmbedding.Please check the supported models using `TextEmbedding.list_supported_models()`
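The error itself points at `TextEmbedding.list_supported_models()`. A minimal check along these lines (a sketch; the dict key names are assumed and may vary between fastembed versions) shows whether a model name is registered before constructing `TextEmbedding`:

from fastembed import TextEmbedding

# Each entry returned by list_supported_models() is a dict describing one model;
# the "model" key (assumed key name) holds the identifier accepted by TextEmbedding.
supported = {m["model"] for m in TextEmbedding.list_supported_models()}
print("intfloat/multilingual-e5-small" in supported)  # False while the model is unsupported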

What Python version are you on? e.g. python --version

Python 3.9

Version

0.2.7 (Latest)

What os are you seeing the problem on?

MacOS

Relevant stack traces and/or logs

The list of FastEmbed versions in this bug report form's Version option is out of date; the version I'm actually using is 0.3.1 (latest).

bm777 avatar Jul 05 '24 05:07 bm777

@NirantK can I open a PR for this model?

bm777 avatar Jul 13 '24 16:07 bm777

@bm777 always open to contributions.

Anush008 avatar Jul 17 '24 12:07 Anush008

As of fastembed 0.6.0 it is possible to add this model at runtime via the following code snippet:

from fastembed import TextEmbedding
from fastembed.common.model_description import PoolingType, ModelSource

# Register the model at runtime so TextEmbedding can resolve it by name
TextEmbedding.add_custom_model(
    model="intfloat/multilingual-e5-small",
    pooling=PoolingType.MEAN,
    normalization=True,
    sources=ModelSource(hf="intfloat/multilingual-e5-small"),  # can be used with an `url` to load files from a private storage
    dim=384,
    model_file="onnx/model.onnx",  # can be used to load an already supported model with another optimization or quantization, e.g. onnx/model_O4.onnx
)

documents = ["example document one", "example document two"]  # example input; replace with your own texts
model = TextEmbedding(model_name="intfloat/multilingual-e5-small")
embeddings = list(model.embed(documents))
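Each vector produced this way should have the dimensionality declared in `dim` above; a quick sanity check (a sketch, assuming the default numpy array output of `embed`):

# One embedding per input document, each a 384-dimensional numpy array
assert len(embeddings) == len(documents)
print(embeddings[0].shape)  # expected: (384,)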

joein avatar Mar 02 '25 15:03 joein