[Feature Request]: Is TensorFlow Text supported?

Open RockNHawk opened this issue 2 years ago • 1 comment

Background and Feature Description

The universal-sentence-encoder model can generate text embeddings, and it depends on TensorFlow Text. Is TensorFlow Text supported?

https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3

API Definition and Usage


  import tensorflow_hub as hub
  import numpy as np
  import tensorflow_text  # Not referenced directly; importing it registers the ops the model needs.
  
  # Some texts of different lengths.
  english_sentences = ["dog", "Puppies are nice.", "I enjoy taking long walks along the beach with my dog."]
  italian_sentences = ["cane", "I cuccioli sono carini.", "Mi piace fare lunghe passeggiate lungo la spiaggia con il mio cane."]
  japanese_sentences = ["犬", "子犬はいいです", "私は犬と一緒にビーチを散歩するのが好きです"]
  chinese_sentences = ["狗", "小狗很好，我喜欢和我的狗一起沿着海滩散步"]

  embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3")
  
  # Compute embeddings.
  en_result = embed(english_sentences)
  it_result = embed(italian_sentences)
  ja_result = embed(japanese_sentences)
  
  # Compute similarity matrix. Higher score indicates greater similarity.
  similarity_matrix_it = np.inner(en_result, it_result)
  similarity_matrix_ja = np.inner(en_result, ja_result)
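
To make the expected shape of the result concrete, here is a minimal sketch of the similarity step with no model download. The toy 2-D vectors and the names `en_toy`, `it_toy`, and `best_match` are made up for illustration; real model embeddings would take their place. It assumes unit-length vectors, in which case the inner product equals cosine similarity.

```python
import numpy as np

# Hypothetical stand-ins for model output: one row per sentence.
# These are unit-length toy vectors, not real embeddings.
en_toy = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
it_toy = np.array([[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]])

# Entry [i, j] is the inner product of English row i and Italian row j,
# so the result has shape (len(en_toy), len(it_toy)).
similarity = np.inner(en_toy, it_toy)

# For each English row, the index of the most similar Italian row.
best_match = similarity.argmax(axis=1)

print(similarity.shape)
print(best_match)
```

If the translations line up, as they are meant to in the lists above, the largest value in each row should fall on the diagonal.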

Alternatives

No response

Risks

No response

Jun 05 '23 08:06 RockNHawk

TensorFlow Text is not supported yet. Support will be added before v1.0.0, in about 2 months. In the meantime, LLamaSharp is an alternative: it supports using an LLM to generate embeddings.

Jun 05 '23 09:06 SanftMonster