About Figure 2 in the paper

Open KN1GHT9 opened this issue 2 years ago • 2 comments

Congratulations on your paper being accepted by CVPR 2023! Regarding Figure 2 in your paper,

  1. can you provide the implementation code?
  2. I tried hard to reproduce this result but failed. Can you elaborate on how the CLIP Delta feature is calculated here?

KN1GHT9 Jun 15 '23 15:06

Hello, have you solved this problem? I ran into it as well. In the plot I generated, the image points are red and the caption points are blue. I used CLIP ViT-L/14 and the MSCOCO dataset, but the result was very different from what I expected. [image]

gWeiXP Mar 30 '24 13:03

I figured it out. I need to concatenate image_embeddings and caption_embeddings first and then pass them to t-SNE together, instead of running t-SNE on each set separately. [image]
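For anyone hitting the same issue, here is a minimal sketch of that fix. It is not the authors' code: `joint_embed` and `tsne_2d` are hypothetical helper names, and it assumes the embeddings are already available as NumPy arrays of shape (N, D) and that scikit-learn is installed. The point is only that both sets go through a single t-SNE fit, so they end up in the same 2-D coordinate system.

```python
import numpy as np


def tsne_2d(features, seed=0):
    """Project features to 2-D with scikit-learn's t-SNE (imported lazily).

    Note: the default perplexity of 30 requires more than 30 samples.
    """
    from sklearn.manifold import TSNE

    return TSNE(n_components=2, init="pca", random_state=seed).fit_transform(features)


def joint_embed(image_embeddings, caption_embeddings, reduce_fn=tsne_2d):
    """Embed both sets with ONE call to the reducer, then split the result.

    Running t-SNE separately on each set yields two unrelated 2-D layouts
    that cannot be compared in a single plot.
    """
    image_embeddings = np.asarray(image_embeddings, dtype=np.float32)
    caption_embeddings = np.asarray(caption_embeddings, dtype=np.float32)
    n_img = len(image_embeddings)

    # Stack images first, captions second, so the split below is unambiguous.
    stacked = np.concatenate([image_embeddings, caption_embeddings], axis=0)
    points = reduce_fn(stacked)

    return points[:n_img], points[n_img:]
```

The two returned arrays can then be drawn on the same axes, for example red for image points and blue for caption points.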

gWeiXP Mar 30 '24 14:03

We have just uploaded the t-SNE implementation code, which you are welcome to use directly. You can also revalidate the results using your own code, but please note that the Delta features need to be normalized after obtaining the differences. In our experiments, we used the ViT-B/32 model, but we believe the ViT-L/14 model should yield similar results.
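For readers writing their own version, the normalization step could look roughly like the sketch below. This is an illustration, not the uploaded code: `l2_normalize` and `delta_features` are hypothetical names, the inputs are assumed to be row-aligned pairs of CLIP embeddings as NumPy arrays, and normalizing the inputs before taking the difference is an assumption based on common CLIP practice. How the pairs are chosen is not specified in this thread.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against zero vectors)."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def delta_features(feats_a, feats_b):
    """Normalized difference between paired embeddings (row i with row i)."""
    # Assumption: embeddings are normalized before differencing.
    a = l2_normalize(feats_a)
    b = l2_normalize(feats_b)

    # The step stressed in the reply above: normalize AFTER taking the difference.
    return l2_normalize(b - a)
```

The same function can be applied to image-image pairs and caption-caption pairs, and the two sets of Delta features can then be stacked and passed to a single t-SNE fit.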

Yueming6568 Jun 22 '25 16:06