Google Universal Sentence Encoder Support for Embeddings

Open zestor opened this issue 2 years ago • 3 comments

Added support for embeddings with Google Universal Sentence Encoder v5

zestor, Mar 22 '23 20:03

thanks!! will take a look soon

jerryjliu, Mar 22 '23 20:03

It's an error. The file should not have been modified, and I'm not sure how it was.

On Thu, Mar 23, 2023 at 3:08 PM Jerry Liu wrote:

thanks for doing this! a few comments:

  • why is scripts/create_llama_package.sh modified?
  • there's some linter issues, i can help fix if you'd like

In gpt_index/embeddings/google_use.py (https://github.com/jerryjliu/llama_index/pull/842#discussion_r1146700830):

@@ -0,0 +1,28 @@
+"""Google Universal Sentence Encoder Embedding Wrapper Module."""
+
+from typing import List
+
+import tensorflow_hub as hub
+
+from gpt_index.embeddings.base import BaseEmbedding
+
+# Google Universal Sentence Encode v5
+google_use = hub.load("https://tfhub.dev/google/universal-sentence-encoder-large/5")

could this be a part of the init file, so it's lazily downloaded? I'd love to expose this module in gpt_index/embeddings/__init__.py (so users can do from gpt_index.embeddings import GoogleUnivSentEncoderEmbeddings)
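For illustration, the lazy download requested in the review comment could look roughly like the sketch below. It is not the code from the PR: the class name LazyUSEEmbedding, the loader parameter, and the helper _load_from_tfhub are hypothetical names chosen for this example, and the class does not subclass BaseEmbedding. The sketch defers the download until the first embedding is requested, which goes one step further than loading in the constructor. The point is only that nothing is downloaded (and TensorFlow is not even imported) when the module itself is imported.

```python
from typing import Any, Callable, List, Optional

# Model URL taken from the diff quoted above.
USE_URL = "https://tfhub.dev/google/universal-sentence-encoder-large/5"


def _load_from_tfhub(url: str) -> Any:
    """Hypothetical helper: load a TF Hub model on demand."""
    # Imported here, not at module level, so importing this module
    # never pulls in TensorFlow or triggers a download.
    import tensorflow_hub as hub

    return hub.load(url)


class LazyUSEEmbedding:
    """Hypothetical wrapper that loads the encoder on first use."""

    def __init__(
        self,
        url: str = USE_URL,
        loader: Callable[[str], Any] = _load_from_tfhub,
    ) -> None:
        self._url = url
        self._loader = loader  # injectable, so tests can avoid the download
        self._model: Optional[Any] = None

    def _get_model(self) -> Any:
        # Load once, then reuse the cached model.
        if self._model is None:
            self._model = self._loader(self._url)
        return self._model

    def get_text_embedding(self, text: str) -> List[float]:
        # The encoder is called with a batch of strings and returns one
        # vector per string; take the first (and only) row.
        vectors = self._get_model()([text])
        return [float(x) for x in vectors[0]]
```

With this shape, a package-level import of the class stays cheap, and the model is fetched only when get_text_embedding is first called.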

zestor, Mar 23 '23 19:03

I am struggling to get my existing fork into a state where only one file is changed. Feel free to take the code and incorporate it.

zestor, Mar 25 '23 16:03

Re-opening as https://github.com/jerryjliu/llama_index/pull/1755

Disiok, May 01 '23 17:05