How can I create word embeddings using BERT?

Open saeideh-sh opened this issue 6 years ago • 7 comments

Hi,

I want to build feature vectors from my documents using BERT. I would like to get a vector for each word in my texts, average the word vectors for each document, and add the result as one of the features for my classifier. I have read the extract_features.py script, but I couldn't work out how to use BERT to produce word embeddings and extract features from my text documents. Would you please help me understand the step-by-step process for building this vector representation? Do I need to customize BERT? If so, would you please point me to the files that need to be changed?

Many thanks!

saeideh-sh avatar Jan 11 '19 00:01 saeideh-sh
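
The averaging step described in the question can be sketched roughly as follows. This is only an illustration. It assumes the JSON-lines layout that extract_features.py appears to write, where each line has a "features" list and each token carries "layers" entries with "index" and "values" fields. The helper names token_vectors and document_vector are hypothetical. Note that BERT tokens are WordPiece pieces, so a single word may be split into several vectors.

```python
import json

# Marker tokens that BERT adds around every input; usually left out of the average.
SPECIAL_TOKENS = {"[CLS]", "[SEP]"}


def token_vectors(json_line, layer_index=-1):
    """Return (token, vector) pairs for one output line, taken from one layer.

    Assumes the layout {"features": [{"token": ..., "layers":
    [{"index": ..., "values": [...]}]}]}.
    """
    record = json.loads(json_line)
    pairs = []
    for feature in record["features"]:
        for layer in feature["layers"]:
            if layer["index"] == layer_index:
                pairs.append((feature["token"], layer["values"]))
    return pairs


def document_vector(json_line, layer_index=-1):
    """Average the token vectors of one document, skipping [CLS] and [SEP]."""
    vectors = [
        vec
        for tok, vec in token_vectors(json_line, layer_index)
        if tok not in SPECIAL_TOKENS
    ]
    if not vectors:
        raise ValueError("no token vectors found for the requested layer")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Toy input with 2-dimensional vectors (real BERT-base vectors have 768 values).
example_line = json.dumps({
    "linex_index": 0,
    "features": [
        {"token": "[CLS]", "layers": [{"index": -1, "values": [9.0, 9.0]}]},
        {"token": "hello", "layers": [{"index": -1, "values": [1.0, 2.0]}]},
        {"token": "world", "layers": [{"index": -1, "values": [3.0, 4.0]}]},
        {"token": "[SEP]", "layers": [{"index": -1, "values": [9.0, 9.0]}]},
    ],
})

doc_vec = document_vector(example_line)
print(doc_vec)  # [2.0, 3.0]
```

The resulting list can be appended to the other features of a classifier. Whether to keep or drop [CLS] and [SEP], and which layer to read, are design choices rather than requirements.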

https://github.com/hanxiao/bert-as-service/blob/master/README.md

I think this repo may help you.

dsindex avatar Jan 11 '19 01:01 dsindex

Many thanks! I want to make sure I understood the idea, so I have a question about the BERT word embedding result. I am going to make a list of all my documents and then use bc.encode([doc_1,doc_2,...]). Does BERT return one array of weights per document rather than one per individual word?

Thanks again for your help!

saeideh-sh avatar Jan 11 '19 17:01 saeideh-sh
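
As far as the bert-as-service README describes it, bc.encode with the default server settings returns one fixed-length vector per input string, so a list of documents gives one vector per document. Per-token vectors are only returned when the server is started with its pooling strategy set to NONE. The sketch below shows the per-document case. FakeClient is a hypothetical stand-in so the code runs without a server, and encode_documents is an illustrative helper, not part of the library.

```python
def encode_documents(client, docs):
    """Encode documents with any client exposing encode(list_of_str).

    Returns one plain-Python vector per document, in input order.
    """
    vectors = client.encode(docs)
    return [[float(x) for x in vec] for vec in vectors]


class FakeClient:
    """Hypothetical stand-in for a real client; returns tiny 3-dimensional vectors."""

    def encode(self, docs):
        # One vector per document. A real BERT-base server would return 768 values each.
        return [[float(len(doc)), 0.0, 1.0] for doc in docs]


docs = ["first document", "second one"]
features = encode_documents(FakeClient(), docs)

# With a running bert-as-service server, the assumed real call would be:
#   from bert_serving.client import BertClient
#   features = encode_documents(BertClient(), docs)
```

Each entry of features then corresponds to one whole document, which can be used directly as a document-level feature.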

@saeideh-sh Here is another way to get word embedding from BERT. Please check it out! https://github.com/imgarylai/bert-embedding

imgarylai avatar Feb 10 '19 19:02 imgarylai

Hi @saeideh-sh, did you ever find out whether each word gets its own embedding? :)

annikamarie avatar Jun 24 '19 13:06 annikamarie

Hi. Have you found a solution to this?

marianalucut7 avatar Sep 03 '19 20:09 marianalucut7

Excuse me, does anyone have a solution for this, please?

mathshangw avatar Jan 09 '22 08:01 mathshangw

Hi, how can I create word embeddings using DarijaBert?

Ismail-Ifakir avatar May 10 '24 18:05 Ismail-Ifakir