How can I make word embeddings using BERT?
Hi,
I want to build feature vectors from my documents using BERT. I would like to get a vector for each word in my texts, average those word vectors for each document, and add the result as one of the features to my classifier. I have read the extract_features.py script, but I couldn't figure out how to use BERT to produce the word embeddings and extract features from my text documents. Could you please help me understand the step-by-step process for building this vector representation? Do I need to customize BERT? If so, could you please point me to the files that need to be changed?
Many thanks!
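For later readers: the averaging step itself is independent of BERT. A minimal numpy sketch, assuming you already have one vector per word for each document (the names `doc_token_vectors` and `document_features` are hypothetical):

```python
# Minimal sketch of the averaging step, assuming each document already
# has a (num_tokens, hidden_size) array of per-word BERT vectors
# (doc_token_vectors and document_features are hypothetical names).
import numpy as np

def document_features(doc_token_vectors):
    # One fixed-size feature vector per document: the mean of its
    # word vectors along the token axis.
    return np.stack([tokens.mean(axis=0) for tokens in doc_token_vectors])

# Example: two documents with 3 and 5 words, 768-dim embeddings.
docs = [np.random.rand(3, 768), np.random.rand(5, 768)]
features = document_features(docs)  # shape (2, 768)
```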
https://github.com/hanxiao/bert-as-service/blob/master/README.md
I think this repo may help you.
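For anyone landing here, a minimal usage sketch based on my reading of that repo's README (the model path and texts are placeholders); the server runs as a separate process:

```python
# Start the server first, in a separate shell (per the bert-as-service
# README), e.g.:
#   bert-serving-start -model_dir /path/to/uncased_L-12_H-768_A-12 -num_worker=1
from bert_serving.client import BertClient

bc = BertClient()
# With the default pooling strategy, encode() returns one fixed-size
# vector per input string: an array of shape (num_docs, 768).
doc_vectors = bc.encode(['first document ...', 'second document ...'])
```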
Many thanks! I want to make sure I got the idea right. I have a question about the BERT word-embedding output: I am going to make a list of all my documents and then call bc.encode([doc_1, doc_2, ...]). Does BERT return an array of weights for each document rather than for each individual word?
Thanks again for your help!
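If I read the bert-as-service README correctly, the default bc.encode() pools over the tokens, so you get one vector per document. For per-word vectors the server has to be started with pooling disabled; a sketch (the model path and sequence length are placeholders):

```python
# Restart the server with pooling disabled so encode() returns
# token-level embeddings, e.g.:
#   bert-serving-start -model_dir /path/to/model -pooling_strategy NONE -max_seq_len 64
from bert_serving.client import BertClient

bc = BertClient()
# encode() now returns shape (num_docs, max_seq_len, 768): one vector per
# token position, zero-padded up to max_seq_len.
token_vectors = bc.encode(['first document ...'])
# Crude per-document average; in practice mask out the zero padding
# before averaging so it does not dilute the mean.
per_doc_mean = token_vectors.mean(axis=1)
```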
@saeideh-sh Here is another way to get word embeddings from BERT. Please check it out! https://github.com/imgarylai/bert-embedding
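In case it helps, a rough sketch of how that package could feed the averaging step from the original question (interface as I understand its README; the texts are placeholders):

```python
# Rough sketch using the bert-embedding package linked above; as I
# understand its README, each result item is a (tokens, vectors) pair.
import numpy as np
from bert_embedding import BertEmbedding

bert_embedding = BertEmbedding()
docs = ['first document ...', 'second document ...']
results = bert_embedding(docs)

# Average the per-word vectors to get one feature vector per document.
features = np.stack([np.mean(vectors, axis=0) for tokens, vectors in results])
```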
Hi @saeideh-sh, did you ever find out whether each word gets its own embedding? :)
Hi. Have you found a solution to this?
Excuse me, does anyone have a solution for this, please?
Hi, how can I make word embeddings using DarijaBert?