vivekh2000
@jacobgil @qiaoyu1002 @rojinakashefi, I think the error "list index out of range" occurs because the code of the vision transformer in the **timm** repo changed. Therefore, the original code of...
That was insightful. However, in your implementation of the `DistillMixin` class, you are passing the `cls` token from the MLP head. The MLP head of the distillation token is implemented using...
@lucidrains I have also noticed that in the paper, the authors mention that at test time they fuse the two heads, i.e., the MLP heads for the `cls` token and the distillation token....
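To make the fusion point concrete, here is a minimal sketch of one way the two heads could be combined at test time. It assumes the fusion means averaging the softmax outputs of the `cls` head and the distillation head, which may not match the paper or any particular repo exactly. The names `fuse_heads` and `predict` are hypothetical, and the code is plain Python (no PyTorch) only to keep it dependency-free.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_heads(cls_logits, dist_logits):
    """Hypothetical helper: average the class probabilities of the two heads.

    Assumption: "fusing" means late fusion of the softmax outputs.
    Summing instead of averaging gives the same argmax.
    """
    if len(cls_logits) != len(dist_logits):
        raise ValueError("both heads must predict the same number of classes")
    cls_probs = softmax(cls_logits)
    dist_probs = softmax(dist_logits)
    return [(c + d) / 2 for c, d in zip(cls_probs, dist_probs)]


def predict(cls_logits, dist_logits):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_heads(cls_logits, dist_logits)
    return max(range(len(fused)), key=fused.__getitem__)
```

With this sketch, `predict(cls_logits, dist_logits)` takes the logits from both heads for one image. If only the `cls` head output is passed on, the distillation head never contributes to the prediction.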