Michael Feil
These two projects:
- https://github.com/michaelfeil/infinity (disclaimer: I am maintaining it)
- Huggingface/text-embeddings-inference (an alternative with no torch dependency, API only)
@stephen-youn Did you manage to solve this? Got a similar issue.
@pommedeterresautee FYI, unit tests seem to pass. What do you think about this PR?
@pommedeterresautee friendly reminder!
Yeah, the batching happens with multiple async requests at once. This is also used when the batch size is larger than what can fit at once. If there is no...
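For illustration, a minimal sketch of this kind of dynamic batching, where concurrent async callers are drained into one model call (all names here, e.g. `embed_batch` and `BATCH_SIZE`, are hypothetical and not infinity's actual API):

```python
import asyncio

BATCH_SIZE = 32
queue: asyncio.Queue = asyncio.Queue()

def embed_batch(sentences: list[str]) -> list[list[float]]:
    # Placeholder for the actual model forward pass.
    return [[0.0] for _ in sentences]

async def batch_worker() -> None:
    while True:
        items = [await queue.get()]  # block until at least one request arrives
        # Drain whatever else is already queued, up to BATCH_SIZE.
        while not queue.empty() and len(items) < BATCH_SIZE:
            items.append(queue.get_nowait())
        sentences, futures = zip(*items)
        for fut, emb in zip(futures, embed_batch(list(sentences))):
            fut.set_result(emb)

async def embed(sentence: str) -> list[float]:
    # Each caller enqueues its sentence and awaits the batched result.
    fut = asyncio.get_running_loop().create_future()
    await queue.put((sentence, fut))
    return await fut

async def main() -> None:
    asyncio.get_running_loop().create_task(batch_worker())
    results = await asyncio.gather(*(embed(f"sentence {i}") for i in range(100)))
    print(len(results))

asyncio.run(main())
```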
On my system the code above still fails with v0.1.1. Have you tried the above code? @NirantK For models, I use "sentence-transformers/all-MiniLM-L6-v2" on both sides.
@NirantK sentence-transformers=2.22 fastembed=0.1.1

```python
sentence = ["This is a test sentence."]
```

```
Arrays are not almost equal to 1 decimals
Mismatched elements: 2 / 384 (0.521%)
Max absolute difference: 0.81547204
Max...
```
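The repro above is truncated, so here is a hedged reconstruction of what such a comparison might look like, assuming fastembed 0.1.x's `FlagEmbedding` interface (the exact original script is not shown):

```python
import numpy as np
from fastembed.embedding import FlagEmbedding
from sentence_transformers import SentenceTransformer

sentence = ["This is a test sentence."]
model_name = "sentence-transformers/all-MiniLM-L6-v2"

# Embed the same sentence with both libraries.
st_emb = SentenceTransformer(model_name).encode(sentence)
fe_emb = np.array(list(FlagEmbedding(model_name=model_name).embed(sentence)))

# Raises with a mismatch report like the one quoted above if the outputs diverge.
np.testing.assert_array_almost_equal(st_emb, fe_emb, decimal=1)
```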
FYI for "BAAI/bge-base-en" i get a cosine_sim of `~0.999`. For "sentence-transformers/all-MiniLM-L6-v2" its around `0.223`
@casper-hansen I saw that the outputs of the model really differ in embedding space.
- Do I need to quantize all layers? I saw that all layers are replaced with...
Thanks for the hint, I have not tried out `modules_to_not_convert` - are you referring to this example? https://github.com/casper-hansen/AutoAWQ/blob/29ee66d9e77f3e443d48a17b4838d00a76bc6f5e/examples/mixtral_quant.py#L6 I am trying to directly use cosine similarity between query and paragraph as...
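For anyone following along, a sketch of how `modules_to_not_convert` appears in the linked mixtral_quant.py example; the model path and the exact module names to skip are assumptions here, not verified settings:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed, per the example
quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM",
    # Leave these modules in full precision instead of quantizing them.
    "modules_to_not_convert": ["gate"],
}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model.quantize(tokenizer, quant_config=quant_config)
```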