rtfgithub issues


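In the code, the image encoder is frozen during the first training stage, while both the learnable text tokens and the text encoder are trainable. This does not match the paper, which describes only the text tokens as learnable, with the image encoder and the text encoder both frozen.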

Why is the triplet loss applied to img_feature_last? Here, img_feature_last is the output of the second-to-last module of the ViT model.