cls_token

Open shibin2018 opened this issue 4 years ago • 2 comments

All samples in a batch share the same cls_token (in the code, the cls_token is repeated for batch_size), but how do they become different during the loss backward pass? Since the cls_token is used as the classifier input, won't all samples in a batch be classified with the same label?

shibin2018 avatar Dec 03 '20 09:12 shibin2018

The CLS token is passed through the attention layers and aggregates information from the rest of the tokens as it makes its way up.

lucidrains avatar Dec 03 '20 19:12 lucidrains
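A minimal sketch of that answer (hypothetical toy code, not the vit-pytorch implementation): the same learned cls_token is prepended to every sample, but self-attention mixes in each sample's own patch tokens, so the output at the CLS position differs per sample.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, n_patches, dim = 2, 4, 8

cls_token = torch.randn(1, 1, dim)            # one shared token for all samples
patches = torch.randn(batch, n_patches, dim)  # per-sample patch tokens

# prepend the same cls_token to every sample in the batch
x = torch.cat([cls_token.expand(batch, -1, -1), patches], dim=1)

# single-head self-attention with identity projections, for brevity
attn = F.softmax(x @ x.transpose(1, 2) / dim ** 0.5, dim=-1)
out = attn @ x

cls_out = out[:, 0]  # CLS position after attention
print(torch.allclose(cls_out[0], cls_out[1]))  # False: CLS output is sample-dependent
```

Even though the CLS input is identical across the batch, its attention weights depend on each sample's patches, so the classifier sees a different CLS representation per sample.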

I had the same question, and here is my illustration of it. Keep in mind that cls_token is a parameter, not a feature of the input. We can think of it as the starting point for producing the final label, with self-attention and the MLP acting as information-aggregating procedures. By comparing y_hat = f(cls_token, params | input) with the true label y, cls_token and the other params are updated to learn an effective way of aggregating information from the input.

Linhengyang avatar Feb 20 '23 12:02 Linhengyang
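The update path described above can be sketched as follows (a hypothetical tiny model, not vit-pytorch code): because cls_token is an nn.Parameter, the classification loss backpropagates into it just like any other weight.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, n_classes = 8, 3

cls_token = nn.Parameter(torch.randn(1, 1, dim))  # learned, shared across the batch
head = nn.Linear(dim, n_classes)                  # classifier on the CLS slot

patches = torch.randn(2, 4, dim)  # batch of 2 samples
x = torch.cat([cls_token.expand(2, -1, -1), patches], dim=1)

# stand-in for the transformer body: CLS slot plus a mean over patch tokens
feats = x[:, 0] + x[:, 1:].mean(dim=1)
loss = nn.functional.cross_entropy(head(feats), torch.tensor([0, 2]))
loss.backward()

print(cls_token.grad is not None)  # True: the optimizer will update cls_token
```

So cls_token itself stays identical across samples at the input; what training changes is the parameter value (via its gradient), so that attention learns to aggregate discriminative information into the CLS slot.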