Combining BERT and other types of embeddings
A while ago I was struggling with the process of merging embeddings, and then this amazing library popped up. You are great!
My question: the flair model can produce a representation for any word (it handles the OOV problem), while the BERT model splits an unknown word into several subwords.
For example, the word "hjik" is represented by a single vector in flair, while BERT divides it into several subwords (because it is OOV), so we get one vector per subword. In short, flair gives us one vector, while BERT may give us two or more.
My question is: how do you handle this case?
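To make the mismatch concrete, here is a minimal sketch using Hugging Face's transformers tokenizer (not part of flair itself; the exact split may differ by model):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# An OOV word is split into multiple WordPiece subtokens, so BERT
# produces one vector per piece rather than one per word.
print(tokenizer.tokenize("hjik"))  # e.g. something like ['h', '##ji', '##k']
```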
Hello @AliHaiderAhmad001, thanks for using Flair! The TransformerWordEmbeddings
class has default handling for words that are split into multiple subwords, which you can control with the subtoken_pooling
parameter (your choices are "first", "last", "first_last" and "mean"); see the info here: https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/TRANSFORMER_EMBEDDINGS.md#pooling-operation
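A minimal sketch of how the pooling fits into the combining use case from the title. The subtoken_pooling parameter is confirmed by the docs linked above; the FlairEmbeddings and StackedEmbeddings classes are assumed from flair's standard API and are not mentioned in this thread:

```python
from flair.data import Sentence
from flair.embeddings import (
    FlairEmbeddings,
    StackedEmbeddings,
    TransformerWordEmbeddings,
)

# "mean" averages all subword vectors into one vector per token;
# "first", "last" and "first_last" are the other pooling choices.
bert = TransformerWordEmbeddings("bert-base-uncased", subtoken_pooling="mean")
flair_fw = FlairEmbeddings("news-forward")

# Concatenate both representations so every token ends up with
# exactly one fixed-size vector, even for OOV words like "hjik".
stacked = StackedEmbeddings([bert, flair_fw])

sentence = Sentence("hjik is out of vocabulary")
stacked.embed(sentence)
for token in sentence:
    print(token.text, token.embedding.shape)  # one vector per token
```

Because the pooling collapses subwords before stacking, the flair vector and the pooled BERT vector align token by token, which is what makes the concatenation well defined.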
Thank you very much @alanakbik