Support for computing embeddings using ONNX
- **Description:** This PR introduces `OnnxEmbeddings` in the LangChain library, allowing users to compute embeddings for documents and queries using ONNX models. The change adds a new file `onnx.py` to house the `OnnxEmbeddings` class and modifies the `__init__.py` file to include `OnnxEmbeddings` in the `__all__` list, making it accessible as part of the module's public API.
- **Issue:** N/A (new feature)
- **Dependencies:** This change requires the `onnx`, `onnxruntime`, `optimum`, and `transformers` packages. Users who wish to use `OnnxEmbeddings` should ensure these packages are installed. If not, they can install them with: `pip install optimum transformers onnxruntime onnx`
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
- **Twitter handle:** https://twitter.com/malteH_B
- **LinkedIn handle:** https://www.linkedin.com/in/maltehb/
Implementation Details
The `OnnxEmbeddings` class inherits from `BaseModel` and `Embeddings`, and overrides the `embed_documents` and `embed_query` methods to compute embeddings using ONNX models. It uses `AutoTokenizer` from the `transformers` library for tokenization and `ORTModelForFeatureExtraction` from the `optimum.onnxruntime` package for feature extraction.
Currently, the model loaded from the `transformers` hub is exported to ONNX format in memory only. A future improvement could be a flag for writing the ONNX model to disk and then loading it from there.
Users have the flexibility to pass keyword arguments to the model and to the tokenizer through `model_kwargs` and `encode_kwargs`, respectively. They can also set a `query_instruction`, which is prepended to the query text when embedding queries. This is, of course, important when using models such as BGE or Instruct.
Testing
Since this is a new feature, it is crucial to include tests for this integration. Tests should focus on the functionality of `OnnxEmbeddings`, ensuring that it computes embeddings correctly and handles edge cases well. Preferably, the unit tests should not rely on network access, to keep the test suite reliable.
Examples
An example notebook showcasing the usage of `OnnxEmbeddings` should be included in the `docs/extras` directory to help users understand how to use this new feature.
Final Notes
This feature is an enhancement for LangChain, giving users a significantly faster option for computing embeddings compared to `SentenceTransformers` and other similar libraries.
Checklist
- I have run `make format` and `make lint` locally.
- The `make test` command did not work due to an unrelated issue, which should be fixed in a separate PR.
I would appreciate it if @baskaryan, @eyurtsev, or @hwchase17 could review this PR and provide their valuable feedback.
The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
|---|---|---|---|---|
| langchain | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Aug 9, 2024 6:09am |
LlamaIndex also just got support for this: https://github.com/jerryjliu/llama_index/blob/c81a0761defc2f9aac3ec3009e8bde4b2436d055/llama_index/embeddings/huggingface_optimum.py#L8

Would be great if this could be added to LangChain! :-)
awesome! could we add a simple notebook to `docs/extras/integrations/text_embedding`?
Should be done now @baskaryan. Let me know what you think!
@baskaryan any chance this could get reviewed again?
@MalteHB Can you please resolve the conflicts? Thanks!
Yes - done!
@MalteHB Thanks! @ccurme Can you please review and merge it? Thanks!
Thank you for the PR. This PR is marked `Needs Support` and has not yet received the 5 upvotes required by maintainers for review. It has been open for at least 25 days. Per the LangChain review process, this PR will be closed in 5 days if it does not reach the required threshold.

The `Needs Support` status is intended to prioritize review time on features that have demonstrated community support. If you feel this status was assigned in error or need more time to gather the required upvotes, please ping (at)ccurme and (at)efriis.