2 comments by IHikari
In the original CLIP architecture, the final features are obtained by projecting the EOS token (text encoder) and the CLS token (image encoder). If the read-only mask can prevent the EOS or CLS tokens...
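
For readers unfamiliar with the pooling step mentioned above, here is a minimal pure-Python sketch, not the actual CLIP or repository code. The function names (`project`, `pool_text`, `pool_image`) and the toy shapes are hypothetical. It assumes, as in the CLIP reference implementation, that the EOS token has the highest token id in the text sequence and that the CLS token sits at index 0 of the image sequence.

```python
def project(vec, weight):
    # Linear projection: vec has d entries, weight is a d x k matrix (list of rows).
    return [sum(v * row[j] for v, row in zip(vec, weight))
            for j in range(len(weight[0]))]

def pool_text(token_ids, hidden, weight):
    # Assumption: the EOS token id is the largest id in the sequence,
    # so its position is the argmax over token ids (first maximum wins).
    eos_pos = max(range(len(token_ids)), key=lambda i: token_ids[i])
    return project(hidden[eos_pos], weight)

def pool_image(hidden, weight):
    # Assumption: the CLS token is the first token of the image sequence.
    return project(hidden[0], weight)

# Toy example: 4 text tokens, hidden size 2, projection to size 2.
token_ids = [49406, 320, 49407, 0]           # start, word, EOS, padding
text_hidden = [[1.0, 1.0], [2.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
image_hidden = [[5.0, 6.0], [7.0, 8.0]]      # CLS first, then one patch
weight = [[1.0, 0.0], [0.0, 2.0]]

text_feature = pool_text(token_ids, text_hidden, weight)
image_feature = pool_image(image_hidden, weight)
```

Only the hidden state at the EOS or CLS position reaches the projection, which is why what those positions are allowed to attend to matters for the final features.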
Got it! Thank you for your patient reply and for your work.