
Setting up SentenceTransformerEmbeddingModel for GPU use.

Open Snikch63200 opened this issue 1 year ago • 3 comments

Hello,

I finally managed to use a local embedding model (mxbai-embed-large-v1) with the new SentenceTransformerEmbeddingModel class (thanks to the developer team for this work!!! ;-)).

sparse_embedding_model = SparseEmbeddingModel(ndim=4096)
sentence_transformer = SentenceTransformerEmbeddingModel(
    name="mixedbread-ai/mxbai-embed-large-v1",
)
hybrid_model = HybridEmbeddingModel(models=[sparse_embedding_model, sentence_transformer])

It works fine, but it's very slow because it doesn't use the GPU... Is there a way to force the model to use the GPU (via the config options, I guess...)?

Thanks for help.

@Dosu.

Snikch63200 avatar Oct 25 '24 06:10 Snikch63200

To configure the SentenceTransformerEmbeddingModel to use the GPU for the mxbai-embed-large-v1 model, you need to set the device parameter in the config dictionary to "cuda". Here's how you can modify your code:

sparse_embedding_model = SparseEmbeddingModel(ndim=4096)
sentence_transformer = SentenceTransformerEmbeddingModel(
    name="mixedbread-ai/mxbai-embed-large-v1",
    config={"device": "cuda"}  # Set the device to GPU
)
hybrid_model = HybridEmbeddingModel(models=[sparse_embedding_model, sentence_transformer])

This configuration will ensure that the model uses the GPU for computations [1].
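If the same script has to run on machines without a GPU, hard-coding "cuda" will fail there. A minimal sketch that picks the device at runtime (assumes PyTorch is installed, which sentence-transformers already requires, and falls back to CPU otherwise):

```python
# Pick "cuda" only when a CUDA-capable GPU is actually visible;
# fall back to "cpu" so the same script runs on GPU-less machines.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:  # torch missing entirely
    device = "cpu"

embedding_config = {"device": device}
# Then pass it along, e.g.:
# SentenceTransformerEmbeddingModel(
#     name="mixedbread-ai/mxbai-embed-large-v1",
#     config=embedding_config,
# )
```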

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Oct 25 '24 06:10 dosubot[bot]

Works fine!! Thanks.

The GPU is MUCH faster for this task!

Where could I find a complete list of available options for the 'config' dict (config={"device": "cuda"})?

@Dosu.

Snikch63200 avatar Oct 25 '24 07:10 Snikch63200

The available configuration options for the SentenceTransformerEmbeddingModel in the codebase are:

  • batch_size: Specifies the batch size for processing the texts. The default value is 32.
  • device: Specifies the device on which the model should run, such as "cpu" or "cuda". The default value is "cpu" [1][2].
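Putting the two documented options together, a config dict might look like this (the batch size of 64 is just an illustrative value, not a recommendation):

```python
# Both documented options for SentenceTransformerEmbeddingModel's config dict.
# A larger batch_size trades GPU memory for throughput.
config = {
    "batch_size": 64,  # default is 32
    "device": "cuda",  # default is "cpu"
}
# Usage sketch:
# SentenceTransformerEmbeddingModel(
#     name="mixedbread-ai/mxbai-embed-large-v1",
#     config=config,
# )
```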

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Oct 25 '24 07:10 dosubot[bot]