Support for computing embeddings using ONNX

Open MalteHB opened this issue 1 year ago • 5 comments

  • Description: This PR adds OnnxEmbeddings to the LangChain library, letting users compute embeddings for documents and queries with ONNX models. It adds a new file, onnx.py, that houses the OnnxEmbeddings class, and updates __init__.py to include OnnxEmbeddings in the __all__ list so that it is part of the module's public API.

  • Issue: N/A (New Feature)

  • Dependencies: This change requires the onnx, onnxruntime, optimum, and transformers packages. Users who wish to utilize OnnxEmbeddings should ensure that these packages are installed. If not, they can install them using the following command:

    pip install optimum transformers onnxruntime onnx

  • Tag Maintainer: @baskaryan, @eyurtsev, @hwchase17

  • Twitter handle: https://twitter.com/malteH_B

  • LinkedIn handle: https://www.linkedin.com/in/maltehb/

Implementation Details

The OnnxEmbeddings class inherits from BaseModel and Embeddings, and it overrides the embed_documents and embed_query methods to compute embeddings using ONNX models. It utilizes the AutoTokenizer from the transformers library for tokenization and ORTModelForFeatureExtraction from the optimum.onnxruntime package for feature extraction.
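
To make the flow above concrete, here is a minimal sketch of the tokenize, feature-extract, pool sequence. It is not the code from this PR: the class name OnnxEmbeddingsSketch, the from_model_name helper, and the choice of first-token (CLS) pooling are assumptions made for illustration, and the BaseModel/Embeddings inheritance is left out so the sketch can be read and run without LangChain installed. The optional dependencies are imported lazily inside from_model_name.

```python
from typing import Any, Callable, List


class OnnxEmbeddingsSketch:
    """Illustrative stand-in for the PR's OnnxEmbeddings (hypothetical, not the actual code)."""

    def __init__(self, tokenizer: Callable[..., Any], model: Callable[..., Any]) -> None:
        # Both collaborators are injected, so tests can pass lightweight fakes.
        self.tokenizer = tokenizer
        self.model = model

    @classmethod
    def from_model_name(cls, model_name: str) -> "OnnxEmbeddingsSketch":
        # Imported lazily so the class can be defined without the optional packages.
        from optimum.onnxruntime import ORTModelForFeatureExtraction
        from transformers import AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(model_name)
        # export=True converts the Hub checkpoint to ONNX at load time.
        model = ORTModelForFeatureExtraction.from_pretrained(model_name, export=True)
        return cls(tokenizer, model)

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        inputs = self.tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        outputs = self.model(**inputs)
        hidden = outputs.last_hidden_state  # shape: (batch, tokens, hidden_size)
        if hasattr(hidden, "tolist"):
            hidden = hidden.tolist()  # tensor or array -> nested Python lists
        # Assumption: use the first token's vector as the text embedding (CLS pooling).
        return [[float(x) for x in sequence[0]] for sequence in hidden]

    def embed_query(self, text: str) -> List[float]:
        # A query is embedded as a one-element batch.
        return self.embed_documents([text])[0]
```

With the optional packages installed, OnnxEmbeddingsSketch.from_model_name("<model id>") would build an instance from a Hub checkpoint; mean pooling or vector normalization may be more appropriate than CLS pooling for some models.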

Currently, the model loaded from the Hugging Face Hub is simply exported to ONNX format in-memory. A possible follow-up is a flag that writes the exported ONNX model to disk and then loads it from there.
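
One way the write-to-disk idea could look is sketched below. This is an assumption about a possible design, not something this PR implements: the helper names has_exported_model and load_or_export and the onnx_dir parameter are hypothetical. It relies on from_pretrained and save_pretrained from optimum's ORT model classes, imported lazily.

```python
from pathlib import Path
from typing import Any, Optional, Union


def has_exported_model(directory: Union[str, Path]) -> bool:
    """Return True if the directory already contains at least one .onnx file."""
    path = Path(directory)
    return path.is_dir() and any(path.glob("*.onnx"))


def load_or_export(model_name: str, onnx_dir: Optional[str] = None) -> Any:
    """Hypothetical helper: reuse an exported model from disk, else export and save it."""
    # Imported lazily so has_exported_model stays usable without optimum installed.
    from optimum.onnxruntime import ORTModelForFeatureExtraction

    if onnx_dir is not None and has_exported_model(onnx_dir):
        # A previous run already exported the model; load it directly.
        return ORTModelForFeatureExtraction.from_pretrained(onnx_dir)

    # Export the Hub checkpoint to ONNX at load time.
    model = ORTModelForFeatureExtraction.from_pretrained(model_name, export=True)
    if onnx_dir is not None:
        # Persist the exported model so later runs can skip the export step.
        model.save_pretrained(onnx_dir)
    return model
```

Under this design, passing onnx_dir=None keeps the current export-on-load behavior, so the flag would be opt-in. A real implementation would probably also save the tokenizer next to the model.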

Users can pass keyword arguments to the model and to the tokenizer through model_kwargs and encode_kwargs, respectively. They can also set a query_instruction, which is prepended to the query text when embedding queries. This is important when using models such as BGE or Instruct.
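
As an illustration of the two options described above, the sketch below shows how a query instruction might be prepended and how user-supplied encode_kwargs might override tokenizer defaults. The helper names and the default values are assumptions, not taken from the PR's code.

```python
from typing import Any, Dict, Optional

# Assumed tokenizer defaults; the PR may choose different ones.
DEFAULT_ENCODE_KWARGS: Dict[str, Any] = {"padding": True, "truncation": True}


def build_query_text(query: str, query_instruction: str = "") -> str:
    """Prepend the instruction (if any) to the query before tokenization."""
    return f"{query_instruction}{query}"


def merge_encode_kwargs(encode_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the defaults with user-supplied values taking precedence."""
    merged = dict(DEFAULT_ENCODE_KWARGS)
    merged.update(encode_kwargs or {})
    return merged


# Example: the query instruction used by the English BGE models.
bge_instruction = "Represent this sentence for searching relevant passages: "
print(build_query_text("what is onnx?", bge_instruction))
print(merge_encode_kwargs({"max_length": 128}))
```

Only queries get the instruction in this sketch; documents are embedded as-is, which matches how BGE-style models are typically used for retrieval.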

Testing

Since this is a new feature, it is crucial to include tests for this integration. Tests should focus on the functionality of OnnxEmbeddings, ensuring that it computes the embeddings correctly and handles edge cases well. Preferably, unit tests that do not rely on network access should be written to maintain the reliability of the test suite.
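
A network-free test along these lines could be sketched as a reusable contract check. Everything here is illustrative: check_embeddings_contract and FakeEmbeddings are hypothetical names, and a real test would run the same check against OnnxEmbeddings with a stubbed tokenizer and model instead of the fake.

```python
import unittest
from typing import List


def check_embeddings_contract(embedder, texts: List[str]) -> None:
    """Assert basic properties any embed_documents/embed_query pair should satisfy."""
    vectors = embedder.embed_documents(texts)
    assert len(vectors) == len(texts), "expected one vector per input text"
    assert len({len(v) for v in vectors}) <= 1, "vectors must share one dimensionality"
    assert all(isinstance(x, float) for v in vectors for x in v), "values must be floats"
    if texts:
        query_vector = embedder.embed_query(texts[0])
        assert len(query_vector) == len(vectors[0]), "query and document dims must match"


class FakeEmbeddings:
    """Deterministic stand-in that needs no model download."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t)), 1.0, 0.0] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]


class EmbeddingsContractTest(unittest.TestCase):
    def test_regular_input(self) -> None:
        check_embeddings_contract(FakeEmbeddings(), ["hello", "onnx"])

    def test_empty_input(self) -> None:
        # Edge case: an empty batch should yield an empty result, not an error.
        check_embeddings_contract(FakeEmbeddings(), [])
```

Keeping the check separate from the fake means the same assertions can later be pointed at the real class in an integration test that is allowed network access.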

Examples

An example notebook showcasing the usage of OnnxEmbeddings should be included in the docs/extras directory to aid users in understanding how to use this new feature.

Final Notes

This feature is an enhancement for LangChain, giving users an option for computing embeddings that can be significantly faster than SentenceTransformers and similar libraries. The actual speedup depends on the model, hardware, and batch size.
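
For anyone who wants to check the speed claim on their own hardware, a small timing helper like the following could be used. It is a generic sketch, not part of the PR; mean_seconds_per_call is a hypothetical name, and embed_fn stands for any callable such as an embeddings object's embed_documents method.

```python
import time
from typing import Callable, List, Sequence


def mean_seconds_per_call(
    embed_fn: Callable[[Sequence[str]], List[List[float]]],
    texts: Sequence[str],
    repeats: int = 5,
) -> float:
    """Return the average wall-clock seconds of embed_fn(texts) over several runs."""
    embed_fn(texts)  # warm-up call, excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        embed_fn(texts)
    return (time.perf_counter() - start) / repeats
```

Running it once with the ONNX-backed embed_documents and once with a SentenceTransformers-backed one, on the same texts, gives a rough like-for-like comparison.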

Checklist

  • I have run make format and make lint locally.
  • The make test command did not work, due to another issue, which should be fixed in another PR.

I would appreciate it if @baskaryan, @eyurtsev, or @hwchase17 could review this PR and provide their valuable feedback.

MalteHB avatar Sep 23 '23 13:09 MalteHB

LlamaIndex also just got support for this: https://github.com/jerryjliu/llama_index/blob/c81a0761defc2f9aac3ec3009e8bde4b2436d055/llama_index/embeddings/huggingface_optimum.py#L8

Would be great if this could be added to LangChain! :-)

MalteHB avatar Sep 25 '23 06:09 MalteHB

awesome! could we add a simple notebook to docs/extras/integrations/text_embedding

baskaryan avatar Sep 29 '23 02:09 baskaryan

Should be done now @baskaryan. Let me know what you think!

MalteHB avatar Sep 29 '23 21:09 MalteHB

@baskaryan any chance this could get reviewed again?

MalteHB avatar Oct 09 '23 06:10 MalteHB

@MalteHB Can you, please, resolve the conflicts? Thanks!

leo-gan avatar Aug 09 '24 01:08 leo-gan

Yes - done!

MalteHB avatar Aug 09 '24 06:08 MalteHB

@MalteHB Thanks! @ccurme Can you, please, review and merge it? Thanks!

leo-gan avatar Aug 09 '24 15:08 leo-gan

Thank you for the PR. This PR is marked Needs Support and has not yet received the 5 upvotes required by maintainers for review. It has been open for at least 25 days. Per the LangChain review process, this PR will be closed in 5 days if it does not reach the required threshold.

The Needs Support status is intended to prioritize review time on features that have demonstrated community support. If you feel this status was assigned in error or need more time to gather the required upvotes, please ping (at)ccurme and (at)efriis.

langcarl[bot] avatar Nov 04 '24 19:11 langcarl[bot]