SBERT-WK-Sentence-Embedding: How can I get the sentence representations from SBERT-WK?


Open boscoj2008 opened this issue 3 years ago • 3 comments

I would like to use the representations of my custom data for downstream clustering, but I don't see how to obtain the sentence representations using your method. Any help would be appreciated. Thanks in advance.

Apr 16 '21 01:04 boscoj2008

Hi John, you can simply run ./example.sh to see an example of extracting the sentence representation for an input sentence. It should be easy to adapt to your specific task.

Apr 16 '21 02:04 BinWang28

@BinWang28 thank you for responding. I have already tried running the example.sh file. By default, it asks for 2 sentences and returns a similarity score. However, I have 5000 records (tuples) and cannot enter each sentence/record manually. Moreover, what I am looking for are the sentence representations themselves, i.e., the sentence vectors of my records, not the similarity score. Could you describe how the code could be modified to do this? Thanks.

Apr 16 '21 05:04 boscoj2008

Hi John, if you can get the example code working, that's a good starting point. You do not need to manually input all the sentences. What you need to do (in a Python script) is:

1. Read all 5000 of your records/sentences.
2. Write a for loop to extract the embedding of each record/sentence one by one (a simple modification of the example code).
3. Use the sentence embeddings in your application.

Apr 17 '21 16:04 BinWang28
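
The three steps described above could be sketched roughly as follows. This is only an illustration of the loop structure, not the repository's code: `read_records`, `embed_records`, and `placeholder_embed` are hypothetical names, the one-record-per-line input format is an assumption, and `placeholder_embed` is a dummy hashing function that exists only so the sketch runs without the model. In practice it would be replaced by the per-sentence embedding logic from the example code.

```python
import hashlib
from typing import Callable, Iterable, List

Vector = List[float]


def placeholder_embed(sentence: str, dim: int = 8) -> Vector:
    """Hypothetical stand-in for the SBERT-WK embedding step.

    It hashes each token into one of `dim` buckets so the sketch runs
    without the model. Replace it with the embedding logic taken from
    the repository's example code.
    """
    vec = [0.0] * dim
    for token in sentence.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec


def read_records(path: str) -> List[str]:
    """Step 1: read one record per line (assumed format), skipping blanks."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def embed_records(records: Iterable[str],
                  embed: Callable[[str], Vector]) -> List[Vector]:
    """Step 2: embed each record one by one, preserving input order."""
    embeddings = []
    for record in records:
        embeddings.append(embed(record))
    return embeddings


# Step 3: the resulting vectors can be passed to a downstream task.
# A small in-memory list is used here instead of a 5000-line file.
records = ["the cat sat on the mat", "a dog barked", "the cat sat on the mat"]
embeddings = embed_records(records, placeholder_embed)
print(len(embeddings), len(embeddings[0]))
```

For a real file, `embed_records(read_records("records.txt"), my_embed)` would follow the same pattern, where `records.txt` and `my_embed` are placeholders for your data file and the actual embedding function. The returned list of vectors (one per record, in input order) is what a clustering routine would consume.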
