Joshua Almonte comments

Why does the embedding generated by ESMC have two more tokens than the sequence length?

Hi 595, I believe the reason is that ESM C was trained with a BERT-like transformer architecture. In BERT-like models, a beginning-of-sequence token `[cls]` is...
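
To make the length arithmetic concrete, here is a minimal toy sketch. It is not the ESM C tokenizer and calls no ESM API. It assumes the two extra positions are one special token prepended to the sequence and one appended to it; the token names `<cls>` and `<eos>` and the helpers `toy_tokenize` and `strip_special` are hypothetical, chosen only for illustration.

```python
# Toy illustration only: this is NOT the real ESM C tokenizer.
# Assumption: one special token is prepended and one is appended,
# which would account for the "two more tokens" in the question.
CLS_TOKEN = "<cls>"  # hypothetical beginning-of-sequence marker
EOS_TOKEN = "<eos>"  # hypothetical end-of-sequence marker


def toy_tokenize(sequence: str) -> list[str]:
    """Split a protein sequence into residues and wrap it in special tokens."""
    return [CLS_TOKEN, *sequence, EOS_TOKEN]


def strip_special(per_token: list) -> list:
    """Drop the first and last positions so the result aligns with residues."""
    return per_token[1:-1]


sequence = "MKTAYIAK"
tokens = toy_tokenize(sequence)

print(len(sequence))               # 8
print(len(tokens))                 # 10
print(len(strip_special(tokens)))  # 8
```

If that assumption holds for your model, slicing the per-token embeddings the same way (dropping the first and last positions along the token axis) should give one vector per residue.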