Combining BERT and other types of embeddings
A while ago I was struggling with the process of merging embeddings, and then this amazing library popped up. You are great!
My question: the flair model can produce a representation for any word (it handles the OOV problem), while the BERT model splits an unknown word into several subwords.
For example, the word "hjik" is represented by a single vector in flair, while BERT divides it into several subwords (because it is OOV), so we get one vector per subword. In short, flair gives us one vector, while BERT may give us two or more.
My question is: how do you handle this case?
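To make the mismatch concrete, here is a minimal sketch using Hugging Face's transformers tokenizer (not part of flair itself; the exact split may differ by model):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# An OOV word is split into multiple WordPiece subtokens, so BERT
# produces one vector per piece rather than one per word.
print(tokenizer.tokenize("hjik"))  # e.g. something like ['h', '##ji', '##k']
```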
Hello @AliHaiderAhmad001, thanks for using Flair! The TransformerWordEmbeddings
class has default handling for words that are split into multiple subwords, which you can control with the subtoken_pooling
parameter (your choices are "first", "last", "first_last" and "mean"); see the info here: https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/TRANSFORMER_EMBEDDINGS.md#pooling-operation
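A minimal sketch of how the pooling fits into the combining use case from the title. The subtoken_pooling parameter is confirmed by the docs linked above; the FlairEmbeddings and StackedEmbeddings classes are assumed from flair's standard API and are not mentioned in this thread:

```python
from flair.data import Sentence
from flair.embeddings import (
    FlairEmbeddings,
    StackedEmbeddings,
    TransformerWordEmbeddings,
)

# "mean" averages all subword vectors into one vector per token;
# "first", "last" and "first_last" are the other pooling choices.
bert = TransformerWordEmbeddings("bert-base-uncased", subtoken_pooling="mean")
flair_fw = FlairEmbeddings("news-forward")

# Concatenate both representations so every token ends up with
# exactly one fixed-size vector, even for OOV words like "hjik".
stacked = StackedEmbeddings([bert, flair_fw])

sentence = Sentence("hjik is out of vocabulary")
stacked.embed(sentence)
for token in sentence:
    print(token.text, token.embedding.shape)  # one vector per token
```

Because the pooling collapses subwords before stacking, the flair vector and the pooled BERT vector align token by token, which is what makes the concatenation well defined.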
Thank you very much @alanakbik