Xinhao Li

Results 4 issues of Xinhao Li

BIKE has fewer parameters than the original CLIP ViT-L/14 because the BIKE model uses only the vision encoder from CLIP and do...

![image](https://github.com/AILab-CVC/SEED/assets/48858574/054190a8-de85-4e54-b071-81e1e64c2f50)

I obtained similar results when testing with the checkpoint you provided, but I only obtained about 46 using the [BLIP-2 checkpoint](https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained.pth). What might be causing the problem?