DeltaEdit About Figure 2 in the paper

Open KN1GHT9 opened this issue 2 years ago • 2 comments

Congratulations on your paper being accepted to CVPR 2023! Regarding Figure 2 in your paper, can you provide the implementation code? I tried hard to reproduce this result but failed. Can you elaborate on how the CLIP delta features are calculated here?

Jun 15 '23 15:06 KN1GHT9

Hello, have you solved this problem? I ran into it as well. In the plot I generated, the image points are shown in red and the caption points in blue. I used the ViT-L/14 CLIP model and the MSCOCO dataset, but the results were very different from what I expected.

Mar 30 '24 13:03 gWeiXP

I figured it out. I need to concatenate image_embeddings and caption_embeddings first and then pass them to t-SNE together, rather than running t-SNE on each set separately.

Mar 30 '24 14:03 gWeiXP
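
For anyone hitting the same issue, the fix described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn's `TSNE`, and the function name `joint_tsne`, the perplexity, and the seed are hypothetical choices. The point is that t-SNE coordinates from two separate fits are not comparable, so both modalities go through a single fit and are split afterwards.

```python
import numpy as np
from sklearn.manifold import TSNE


def joint_tsne(image_embeddings, caption_embeddings, perplexity=30.0, seed=0):
    """Project both modalities with one t-SNE fit, then split the result.

    Hypothetical helper: the name and default parameters are assumptions.
    """
    image_embeddings = np.asarray(image_embeddings, dtype=np.float32)
    caption_embeddings = np.asarray(caption_embeddings, dtype=np.float32)
    n_images = image_embeddings.shape[0]

    # Stack both sets so a single t-SNE run places them in a shared 2-D space.
    stacked = np.concatenate([image_embeddings, caption_embeddings], axis=0)
    points = TSNE(
        n_components=2,
        perplexity=perplexity,  # must be smaller than the total number of rows
        init="pca",
        random_state=seed,
    ).fit_transform(stacked)

    # The first n_images rows are image points, the rest are caption points.
    return points[:n_images], points[n_images:]
```

The two returned arrays can then be drawn as separate scatter series, for example red for images and blue for captions as in the comment above.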

We have just uploaded the t-SNE implementation code, which you are welcome to use directly. You can also revalidate the results using your own code, but please note that the Delta features need to be normalized after obtaining the differences. In our experiments, we used the ViT-B/32 model, but we believe the ViT-L/14 model should yield similar results.

Jun 22 '25 16:06 Yueming6568
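
For readers revalidating with their own code, the normalization note above might look like the sketch below. It is only an illustration under stated assumptions, not the uploaded implementation: the pairing scheme (each sample paired with another sample via a random cyclic shift), the L2 normalization of the inputs before differencing, and the names `l2_normalize` and `paired_deltas` are all hypothetical. The only step taken directly from the reply is that the differences are normalized after they are computed.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against division by zero)."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def paired_deltas(image_embeddings, caption_embeddings, seed=0):
    """Return normalized image and caption delta features for matched pairs.

    Assumes row k of both inputs belongs to the same image-caption pair and
    that there are at least two rows.
    """
    # Assumption: inputs are L2-normalized before differencing.
    img = l2_normalize(np.asarray(image_embeddings, dtype=np.float64))
    cap = l2_normalize(np.asarray(caption_embeddings, dtype=np.float64))
    n = img.shape[0]

    # Assumption: pair sample k with another sample through a random nonzero
    # cyclic shift, so no sample is ever paired with itself.
    rng = np.random.default_rng(seed)
    partner = np.roll(np.arange(n), int(rng.integers(1, n)))

    # Use the same partner index for both modalities so the deltas correspond,
    # then normalize the differences, as the reply above requires.
    delta_img = l2_normalize(img - img[partner])
    delta_cap = l2_normalize(cap - cap[partner])
    return delta_img, delta_cap
```

The resulting `delta_img` and `delta_cap` arrays would then be concatenated and passed to a single t-SNE fit, in the same way as the raw embeddings.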
