VTC-CLS About Visualization Tools

Thanks for your great work, which is significant for this area. Could you please provide the visualization tool for Figure 3? I am really interested in the visual results of the retained image patches and want to try it on my own. Thanks a lot.

Dec 19 '24 11:12 KaKa-101

lol you mean fig.2? Currently we just picked random samples and tracked the attention maps to get this data. You can get the attention distribution in llava_llama.py and the [CLS] attention distribution of the visual encoder in clip_encoder.py. We will release the complete visualization tool later.
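For anyone who wants to try this before the tool is out, here is a rough sketch of the [CLS] part. It is not the repo's code. It assumes you have already dumped the attention weights of one visual encoder layer as a nested list shaped [heads][tokens][tokens], with [CLS] at token index 0. The function name `cls_attention_grid` and the toy numbers are made up for illustration.

```python
def cls_attention_grid(attn, grid_size):
    """Average the [CLS] row over heads and reshape it to a patch grid.

    attn: nested list shaped [heads][tokens][tokens]; token 0 is assumed to be [CLS].
    grid_size: patches per side (e.g. 24 for a 336px image with 14px patches).
    """
    num_heads = len(attn)
    num_patches = grid_size * grid_size
    if len(attn[0]) != num_patches + 1:
        raise ValueError("expected 1 [CLS] token plus grid_size**2 patch tokens")
    # Row 0 is the [CLS] query; skip column 0 ([CLS] attending to itself).
    scores = [
        sum(head[0][1 + p] for head in attn) / num_heads
        for p in range(num_patches)
    ]
    # Row-major reshape into a grid_size x grid_size map.
    return [scores[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]


# Toy example: 2 heads, a 2x2 patch grid (5 tokens including [CLS]).
toy_attn = [
    [[0.2, 0.1, 0.3, 0.2, 0.2]] + [[0.2] * 5] * 4,
    [[0.0, 0.3, 0.1, 0.4, 0.2]] + [[0.2] * 5] * 4,
]
grid = cls_attention_grid(toy_attn, grid_size=2)
```

The resulting grid can be passed to any heatmap plotting routine and overlaid on the resized input image.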

Dec 22 '24 09:12 ChimpOnCloud

Sorry, I actually meant fig.4.

Dec 23 '24 02:12 KaKa-101

For fig.4, currently we just filtered the patches with the top [CLS] attention scores and manually marked each object with a different color, so that readers of the paper can see the effectiveness of pruning with [CLS] attention. We are considering introducing automatic tools like SAM to mark different objects in the near future.
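The filtering step alone (not the manual object coloring) can be sketched roughly as below. This is an illustration under assumptions, not the code used for the paper: it takes a 2D grid of per-patch [CLS] attention scores and marks the `keep` highest-scoring patches. The name `top_patches_mask` and the toy scores are hypothetical.

```python
def top_patches_mask(scores, keep):
    """Return a 0/1 grid marking the `keep` patches with the highest score.

    scores: 2D list (rows x cols) of per-patch [CLS] attention.
    """
    rows, cols = len(scores), len(scores[0])
    flat = [(scores[r][c], r * cols + c) for r in range(rows) for c in range(cols)]
    # Sort by score descending; ties are broken by the lower patch index.
    flat.sort(key=lambda item: (-item[0], item[1]))
    kept = {idx for _, idx in flat[:keep]}
    return [[1 if r * cols + c in kept else 0 for c in range(cols)] for r in range(rows)]


# Toy 3x3 score grid where the middle column gets most of the attention.
toy_scores = [
    [0.05, 0.30, 0.10],
    [0.02, 0.25, 0.08],
    [0.01, 0.15, 0.04],
]
mask = top_patches_mask(toy_scores, keep=3)
for row in mask:
    print("".join("#" if m else "." for m in row))
```

Patches where the mask is 0 can then be greyed out in the image to show which regions survive pruning.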

Dec 23 '24 04:12 ChimpOnCloud

Thanks for your kind reply. Looking forward to the release of your complete visualization tools~

Dec 25 '24 16:12 KaKa-101