LoveSJ
Since BERT is mainly focused on generating an embedding for the input sequence, you can just feed the question to the model and then take its embedding.
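A minimal sketch of that idea, using the Hugging Face `transformers` API as a stand-in (this repo ships its own BERT implementation, so the class names below are an assumption, not this project's code):

```python
import torch
from transformers import BertModel, BertTokenizer

# Load a pre-trained BERT encoder and its tokenizer (illustrative checkpoint).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

question = "How do I get a sentence embedding from BERT?"
inputs = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the token vectors (masking padding) to get one vector for the question.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embedding = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)  # e.g. torch.Size([1, 768])
```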
Exactly! Since the main focus of BERT is pre-training, the model is essentially sentence-level. However, you can build your own model on top of BERT to perform your task-specific...
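For instance, a task-specific head stacked on the pre-trained encoder could look like the rough sketch below; the encoder interface (a module returning per-token hidden states) and all names are assumptions for illustration, not this repo's API:

```python
import torch.nn as nn

class QuestionClassifier(nn.Module):
    """Hypothetical task head: BERT encoder + linear classifier."""
    def __init__(self, bert_encoder, hidden_size=768, num_labels=2):
        super().__init__()
        self.bert = bert_encoder              # pre-trained encoder, fine-tuned or frozen
        self.head = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        cls_vec = hidden[:, 0]                # first ([CLS]-style) token as sentence summary
        return self.head(cls_vec)
```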
> Could you explain why you add self.vocab_size between question id and answer id?

The self.vocab_size is just a padding symbol used to separate the question...
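In other words, the id equal to the vocabulary size (one past the largest real word id) can never collide with an actual token, so it works as a separator. A tiny illustration with made-up ids:

```python
vocab_size = 57777                  # size of the sentencepiece vocabulary
question_ids = [12, 845, 9031]      # hypothetical encoded question
answer_ids = [77, 4521]             # hypothetical encoded answer

sep_id = vocab_size                 # reserved id, never produced by word2id
pair_ids = question_ids + [sep_id] + answer_ids
print(pair_ids)                     # [12, 845, 9031, 57777, 77, 4521]
```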
Definitely! A different word2id would map the same word to a different id, so you should use my word2id.obj. BTW, a vocabulary of 57777 words is not that small, since we use the sentencepiece...
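A hedged sketch of why the shipped mapping matters: ids are only meaningful under the exact word2id used at training time. This assumes word2id.obj is a pickled dict of subword-to-id, which is an assumption about the file format, not something stated in the repo:

```python
import pickle

with open("word2id.obj", "rb") as f:
    word2id = pickle.load(f)        # mapping produced when the model was trained

tokens = ["▁how", "▁are", "▁you"]   # illustrative sentencepiece-style subwords
ids = [word2id.get(tok, word2id.get("<unk>", 0)) for tok in tokens]
print(ids)                          # a different word2id would give different ids here
```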
Oh, that's too bad; if you have your own vocab, this application may not be suitable for you. Nevertheless, you can use my code to train your own BERT.