fastembed
[Model Request]: Model2Vec models
What happened?
It would be great if Model2Vec models could be supported, e.g. https://huggingface.co/minishlab/potion-base-8M. They include ONNX files and are extremely fast (since the embeddings are static), small, and highly performant.
What Python version are you on? e.g. python --version
Python3.10
Version
0.2.7 (Latest)
What os are you seeing the problem on?
No response
Relevant stack traces and/or logs
No response
Hey @Pringled
Sounds good, do you think you could submit a PR with a couple of the most promising models?
Hey @joein , great!
I just had a look and sadly the ONNX models don't work out of the box, since they expect different inputs than the ones the tokenizer currently provides. However, ONNX in general doesn't add any benefit for these models, as they are static embeddings. I was wondering if you are open to the idea of a static_embedding.py option that follows the same structure as the normal TextEmbedding type but uses model2vec instead of ONNX. I think it would be a good fit for fastembed as these embeddings are extremely fast (e.g. ~50x faster than MiniLM and ~500x faster than bge-base), while still having good performance. There are also very few dependencies in the package (numpy is the biggest one).
However, I understand if this is not something you want to introduce. In any case, if you are interested in it, I would be willing to write some integration code and make a PR, let me know!
We would rather keep it ONNX-only. What inputs does it have? Maybe we could adjust the interface or introduce a separate class.
That would be great if that's possible! The inputs are input_ids and offsets (as the forward pass is simply an EmbeddingBag). These are the forward pass and tokenize functions for the ONNX models. So I guess to make it work, the onnx_embed function would have to be altered slightly, or a variant for static embeddings would have to be introduced?
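To make the input_ids / offsets layout concrete, here is a minimal NumPy sketch of an embedding-bag forward pass. It assumes mean pooling (the default mode of PyTorch's EmbeddingBag); the function name `embedding_bag_mean` and the toy table are made up for illustration and are not taken from model2vec or fastembed.

```python
import numpy as np


def embedding_bag_mean(table, input_ids, offsets):
    """Mean-pool rows of `table` per bag (hypothetical helper).

    All token ids of a batch are concatenated into one flat `input_ids`
    array; `offsets[i]` is the index where sentence i starts.
    """
    input_ids = np.asarray(input_ids, dtype=np.int64)
    offsets = np.asarray(offsets, dtype=np.int64)
    # Each bag ends where the next one starts; the last ends at len(input_ids).
    ends = np.append(offsets[1:], len(input_ids))
    out = np.zeros((len(offsets), table.shape[1]), dtype=table.dtype)
    for i, (start, end) in enumerate(zip(offsets, ends)):
        if end > start:  # an empty bag keeps its zero vector
            out[i] = table[input_ids[start:end]].mean(axis=0)
    return out


# Toy vocabulary of 6 tokens with 2-dimensional static embeddings.
table = np.arange(12, dtype=np.float32).reshape(6, 2)

# Two sentences: tokens [1, 3, 5] and tokens [0, 2], flattened into one array.
input_ids = [1, 3, 5, 0, 2]
offsets = [0, 3]

embeddings = embedding_bag_mean(table, input_ids, offsets)
print(embeddings)  # one row per sentence
```

The point is that there is no attention mask or padded 2D batch here, which is why a tokenizer pipeline that produces padded input_ids plus attention_mask does not match these models without some adaptation.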
Hello @joein ,
We're still interested in contributing this to the library. Let me know if you'd like a PR to add them. FYI: I think we can just add model2vec as a direct (optional) dependency, because ONNX also doesn't give us additional speedups.
The only real dependencies model2vec has are numpy and tokenizers, the other ones are just for hf integration. Let me know if this is something you're still interested in.
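As a rough sketch of what "optional dependency" could mean in practice: the library would only take the model2vec path when the package is importable, and fall back otherwise. The helper names and the backend strings below are hypothetical, not existing fastembed code.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be imported (without importing it)."""
    return importlib.util.find_spec(package) is not None


def pick_backend() -> str:
    """Hypothetical selector: prefer model2vec when available, else ONNX."""
    return "model2vec" if is_installed("model2vec") else "onnx"


print(pick_backend())
```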
Stéphan