
[Model Request]: Model2Vec models

Open Pringled opened this issue 1 year ago • 5 comments

What happened?

It would be great if Model2Vec models could be supported, e.g. https://huggingface.co/minishlab/potion-base-8M. They include ONNX files and are extremely fast (since the embeddings are static), small, and highly performant.

What Python version are you on? e.g. python --version

Python 3.10

Version

0.2.7 (Latest)

What os are you seeing the problem on?

No response

Relevant stack traces and/or logs

No response

Pringled avatar Nov 07 '24 09:11 Pringled

Hey @Pringled

Sounds good. Do you think you could submit a PR with a couple of the most promising models?

joein avatar Nov 07 '24 12:11 joein

Hey @joein , great!

I just had a look and sadly the ONNX models don't work out of the box, since they do not take the inputs the tokenizer is expected to produce. However, ONNX in general doesn't add any benefit for these models, as they are static embeddings. I was wondering if you are open to the idea of a static_embedding.py option that follows the same structure as the normal TextEmbedding type but uses model2vec instead of ONNX. I think it would be a good fit for fastembed, as these embeddings are extremely fast (e.g. ~50x faster than MiniLM and ~500x faster than bge-base) while still having good performance. The package also has very few dependencies (numpy is the biggest one).

However, I understand if this is not something you want to introduce. In any case, if you are interested in it, I would be willing to write some integration code and make a PR, let me know!
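To make the proposal concrete, here is a rough sketch of what such a class could look like. Everything in it is hypothetical: the class name `StaticTextEmbedding`, the toy whitespace tokenizer, and the tiny hand-written table are placeholders, not fastembed or model2vec code. It also assumes mean pooling over token vectors and a generator-style `embed` method.

```python
from typing import Iterable, Iterator

import numpy as np


class StaticTextEmbedding:
    """Hypothetical static-embedding class (not part of fastembed or model2vec)."""

    def __init__(self, vocab: dict[str, int], table: np.ndarray) -> None:
        self.vocab = vocab  # token -> row index in the embedding table
        self.table = table  # shape: (vocab_size, dim)

    def _tokenize(self, text: str) -> list[int]:
        # Toy whitespace tokenizer; a real model would ship its own tokenizer.
        return [self.vocab[t] for t in text.lower().split() if t in self.vocab]

    def embed(self, documents: Iterable[str]) -> Iterator[np.ndarray]:
        # No forward pass through a network: look up rows and average them.
        for doc in documents:
            ids = self._tokenize(doc)
            if ids:
                yield self.table[ids].mean(axis=0)
            else:
                # No known tokens: fall back to a zero vector.
                yield np.zeros(self.table.shape[1], dtype=self.table.dtype)


# Toy data, made up for illustration only.
vocab = {"fast": 0, "static": 1, "embeddings": 2}
table = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
model = StaticTextEmbedding(vocab, table)
vectors = list(model.embed(["fast static", "unknown words"]))
```

Since embedding is only a table lookup plus an average, there is no model graph for a runtime to optimize, which is why ONNX would not be expected to help here.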

Pringled avatar Nov 07 '24 13:11 Pringled

We would prefer to keep it ONNX-only. What inputs does it have? Maybe we could adjust the interface or introduce a separate class.

joein avatar Nov 07 '24 19:11 joein

That would be great if that's possible! The inputs are input_ids and offsets (as the forward pass is simply an EmbeddingBag). These are the forward pass and tokenize functions for the ONNX models. So I guess to make it work, the onnx_embed function would have to be altered slightly, or a variant for static embeddings would have to be introduced?
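For anyone unfamiliar with the input_ids/offsets format, here is a minimal numpy sketch of the idea. It assumes mean pooling and uses a made-up table; `embedding_bag_mean` is a hypothetical helper, not a function from the ONNX export or from fastembed. All token ids for a batch are concatenated into one flat array, and `offsets[i]` marks where document `i` starts.

```python
import numpy as np


def embedding_bag_mean(table: np.ndarray, input_ids, offsets) -> np.ndarray:
    """Mean-pool rows of `table` per bag.

    Bag i covers input_ids[offsets[i]:offsets[i + 1]]; the last bag runs
    to the end of input_ids. Empty bags yield a zero vector.
    """
    input_ids = np.asarray(input_ids, dtype=np.int64)
    offsets = np.asarray(offsets, dtype=np.int64)
    ends = np.append(offsets[1:], len(input_ids))
    out = np.zeros((len(offsets), table.shape[1]), dtype=table.dtype)
    for i, (start, end) in enumerate(zip(offsets, ends)):
        if end > start:
            out[i] = table[input_ids[start:end]].mean(axis=0)
    return out


# Two documents flattened into one array: doc 0 = [0, 1], doc 1 = [2, 2].
table = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
pooled = embedding_bag_mean(table, input_ids=[0, 1, 2, 2], offsets=[0, 2])
```

This differs from the usual transformer inputs (a padded `input_ids` matrix plus `attention_mask`), which is presumably why the existing tokenizer output does not fit these models directly.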

Pringled avatar Nov 07 '24 21:11 Pringled

Hello @joein ,

We're still interested in contributing this to the library. Let me know if you'd like a PR to add them. FYI: I think we can just add model2vec as a direct (optional) dependency, because ONNX also doesn't give us additional speedups.

The only real dependencies model2vec has are numpy and tokenizers, the other ones are just for hf integration. Let me know if this is something you're still interested in.

Stéphan

stephantul avatar Jul 14 '25 10:07 stephantul