Support for computing embeddings using ONNX
- **Description:** This PR introduces `OnnxEmbeddings` in the LangChain library, allowing users to compute embeddings for documents and queries using ONNX models. The change adds a new file `onnx.py` to house the `OnnxEmbeddings` class and modifies the `__init__.py` file to include `OnnxEmbeddings` in the `__all__` list, making it accessible as part of the module's public API.
- **Issue:** N/A (new feature)
- **Dependencies:** This change requires the `onnx`, `onnxruntime`, `optimum`, and `transformers` packages. Users who wish to use `OnnxEmbeddings` should ensure these packages are installed. If not, they can install them with: `pip install optimum transformers onnxruntime onnx`
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
- **Twitter handle:** https://twitter.com/malteH_B
- **LinkedIn handle:** https://www.linkedin.com/in/maltehb/
Implementation Details
The `OnnxEmbeddings` class inherits from `BaseModel` and `Embeddings`, and overrides the `embed_documents` and `embed_query` methods to compute embeddings using ONNX models. It uses `AutoTokenizer` from the `transformers` library for tokenization and `ORTModelForFeatureExtraction` from the `optimum.onnxruntime` package for feature extraction.
Currently, the model loaded from the `transformers` hub is exported to ONNX format in memory only. A future improvement could be a flag for writing the ONNX model to disk and then loading it from there.
Users have the flexibility to pass keyword arguments to the model and to the tokenizer through `model_kwargs` and `encode_kwargs`, respectively. They can also set a `query_instruction`, which is prepended to the query text when embedding queries. This is, of course, important when using models such as BGE or Instruct.
Testing
Since this is a new feature, it is crucial to include tests for this integration. Tests should focus on the functionality of `OnnxEmbeddings`, ensuring that it computes embeddings correctly and handles edge cases well. Preferably, the unit tests should not rely on network access, to keep the test suite reliable.
Examples
An example notebook showcasing the usage of `OnnxEmbeddings` should be included in the `docs/extras` directory to help users understand how to use this new feature.
Final Notes
This feature is an enhancement for LangChain, giving users a significantly faster option for computing embeddings compared to `SentenceTransformers` and other similar libraries.
Checklist
- I have run `make format` and `make lint` locally.
- The `make test` command did not work due to an unrelated issue, which should be fixed in a separate PR.
I would appreciate it if @baskaryan, @eyurtsev, or @hwchase17 could review this PR and provide their valuable feedback.
The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
|---|---|---|---|---|
| langchain | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Aug 9, 2024 6:09am |
LlamaIndex also just got support for this: https://github.com/jerryjliu/llama_index/blob/c81a0761defc2f9aac3ec3009e8bde4b2436d055/llama_index/embeddings/huggingface_optimum.py#L8

Would be great if this could be added to LangChain! :-)
awesome! could we add a simple notebook to `docs/extras/integrations/text_embedding`?
Should be done now @baskaryan. Let me know what you think!
@baskaryan any chance this could get reviewed again?
@MalteHB Can you please resolve the conflicts? Thanks!
Yes - done!
@MalteHB Thanks! @ccurme Can you please review and merge it? Thanks!
Thank you for the PR. This PR is marked `Needs Support` and has not yet received the 5 upvotes required by maintainers for review. It has been open for at least 25 days. Per the LangChain review process, this PR will be closed in 5 days if it does not reach the required threshold.

The `Needs Support` status is intended to prioritize review time on features that have demonstrated community support. If you feel this status was assigned in error or need more time to gather the required upvotes, please ping (at)ccurme and (at)efriis.