
[Model Request] FastEmbed new model request: dunzhang/stella_en_1.5B_v5

Open richard-deetlefs opened this issue 1 year ago • 5 comments

Good day,

Would you please add dunzhang/stella_en_1.5B_v5 to FastEmbed? It is not thaaat big (5.75 GB), but it has an excellent MTEB retrieval score and an MIT license.

I am also a paid qdrant cloud user if that helps with the argument ;)

Thanks!

richard-deetlefs, Nov 21 '24 16:11

@joein Any possibility of getting this one added?

PylotLight, Jan 28 '25 08:01

Hey @richard-deetlefs @PylotLight, sorry for the late response.

We might not be able to add this model in the near future. However, as of fastembed v0.6.0, we support adding custom models to fastembed at runtime.

As far as I can see, the model providers have already converted it to ONNX, so it should not be a problem, as long as the model follows the typical preprocessing/postprocessing steps (just pooling / normalization).

An example of adding a custom model

In this particular case, you would also need to set additional_files=["onnx/model.onnx_data"]
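
A minimal sketch of what the registration could look like for this model, assuming fastembed >= 0.6.0 is installed. The `onnx/model.onnx` path is inferred from the `onnx/model.onnx_data` file name, and mean pooling with normalization is an assumption to verify against the model card. `register_stella` is a hypothetical helper name, and `dim` is left as a parameter because the output size of the exported ONNX model is not stated in this thread.

```python
# Sketch: register dunzhang/stella_en_1.5B_v5 as a custom FastEmbed text model.
MODEL_NAME = "dunzhang/stella_en_1.5B_v5"
MODEL_FILE = "onnx/model.onnx"  # assumption: inferred from the .onnx_data path
ADDITIONAL_FILES = ["onnx/model.onnx_data"]  # external weights file named above


def register_stella(dim: int) -> None:
    """Register the model; `dim` must match the ONNX model's output size."""
    # Imported lazily so this module loads even where fastembed is missing.
    from fastembed import TextEmbedding
    from fastembed.common.model_description import ModelSource, PoolingType

    TextEmbedding.add_custom_model(
        model=MODEL_NAME,
        pooling=PoolingType.MEAN,  # assumption: check the model card
        normalization=True,  # assumption: check the model card
        sources=ModelSource(hf=MODEL_NAME),
        dim=dim,
        model_file=MODEL_FILE,
        additional_files=ADDITIONAL_FILES,
    )
```

After `register_stella(dim=...)` has been called once, `TextEmbedding(model_name=MODEL_NAME)` should load the model like any built-in one, and `model.embed(documents)` yields the embeddings.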

joein, Mar 02 '25 15:03

Got it. Thanks.

PylotLight, Mar 02 '25 21:03

Great! Thanks!

Do you have plans to do the same for sparse text embeddings, late interaction models, and the others?

richard-deetlefs, Mar 03 '25 05:03

Eventually, yes, we'd like to add this feature to the other model types as well.

However, it might not be reasonable for some of the models, because their preprocessing/postprocessing steps are too specific to be reused.

Image models are the most likely candidates to get custom model support next.

joein, Mar 03 '25 09:03