
[Bug/Model Request]: Is slower than sentence transformer for all-minilm-l6-v2

Open 0110G opened this issue 7 months ago • 10 comments

What happened?

I benchmarked synchronous embedding-generation times for the same model:

  1. Using sentence transformers: ~1300 msgs per sec

    import random
    import time

    from sentence_transformers import SentenceTransformer

    # `sentences`, `iter_count`, and `batch_size` are defined earlier in my script
    model_standard = SentenceTransformer("all-MiniLM-L6-v2")

    start_time = time.time()
    for i in range(iter_count):
        model_standard.encode(random.sample(sentences, 1)[0])
    time_standard = time.time() - start_time
    print("Standard requires: {}s".format(time_standard))
    print("{} processed per sec".format(batch_size * iter_count / time_standard))

VS

  2. Using FastEmbed (synchronously): ~800 msgs per sec

    from fastembed import TextEmbedding

    fast_model = TextEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

    start_time = time.time()
    for i in range(iter_count):
        # embed() returns a generator, so list() forces evaluation
        list(fast_model.embed(random.sample(sentences, 1)[0]))
    time_standard = time.time() - start_time
    print("Fast requires: {}s".format(time_standard))
    print("{} processed per sec".format(batch_size * iter_count / time_standard))

I am using fastembed 0.3.3

pip show fastembed
Name: fastembed
Version: 0.3.3
Summary: Fast, light, accurate library built for retrieval embedding generation
Home-page: https://github.com/qdrant/fastembed
Author: Qdrant Team
Author-email: [email protected]
License: Apache License
Location: /Users/<>/PycharmProjects/Voyager/venv/lib/python3.9/site-packages
Requires: tqdm, PyStemmer, numpy, mmh3, onnxruntime, pillow, onnx, loguru, tokenizers, huggingface-hub, snowballstemmer, requests
Required-by: 

Why is this so much slower than the original implementation? What can I do to improve performance?

What Python version are you on? e.g. python --version

3.9.16

Version

0.2.7 (Latest)

What os are you seeing the problem on?

MacOS

Relevant stack traces and/or logs

No response

0110G · Jul 09 '24 13:07