Use the mean of token embeddings instead of the [CLS] embedding for the BERT encoder
Is your feature request related to a problem? Please describe.
The mean of BERT's token embeddings is often reported to work better as a sentence representation than the [CLS] embedding. Currently, however, the BERT encoder returns the [CLS] embedding vector.
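To make the difference concrete, here is a minimal sketch of the two pooling strategies in plain Python. It is not the library's actual code: the function names `cls_pooling` and `mean_pooling`, and the assumption that the encoder exposes per-token vectors along with an attention mask (1 for real tokens, 0 for padding), are hypothetical and only for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def cls_pooling(token_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Return the vector at position 0, where BERT places the [CLS] token."""
    return list(token_embeddings[0])


def mean_pooling(
    token_embeddings: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> Vector:
    """Average the token vectors, skipping positions where the mask is 0."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("token_embeddings and attention_mask differ in length")

    # Keep only the vectors of real (non-padding) tokens.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask masks out every token")

    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three positions, two dimensions; the last position is padding.
embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

cls_vector = cls_pooling(embeddings)          # first position only
mean_vector = mean_pooling(embeddings, mask)  # average of the two real tokens
```

Whether the [CLS] and [SEP] positions should themselves be included in the average is a separate design choice; this sketch averages every position the mask marks as real.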
Describe the solution you'd like