
About Visualization Tools

Open: KaKa-101 opened this issue 1 year ago • 4 comments

Thanks for your great work, which is significant for this area. Could you please provide the visualization tool for Figure 3? I am really interested in the visual results of the retained image patches and would like to try it myself. Thanks a lot.

KaKa-101, Dec 19 '24 11:12

lol, you mean Fig. 2? For now we just picked random samples and tracked their attention maps to get the data. You can get the attention distribution in llava_llama.py and the [CLS] attention distribution of the visual encoder in clip_encoder.py. We will release the complete visualization tool later.

ChimpOnCloud, Dec 22 '24 09:12
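
For readers who want to try this before the tool is released, here is a minimal sketch of turning a captured attention tensor into a per-patch [CLS] attention distribution. It is not the repository's code: the function name `cls_attention_over_patches`, the nested-list layout (heads, then query rows, then key columns), the assumption that token 0 is [CLS], and the choice to average over heads are all illustrative assumptions.

```python
def cls_attention_over_patches(attn):
    """Per-patch attention from the [CLS] query, averaged over heads.

    attn: list of heads; each head is a square matrix (list of rows)
          over [CLS] + patch tokens. Token 0 is ASSUMED to be [CLS].
    Averaging over heads is an assumption, not the authors' stated method.
    """
    num_heads = len(attn)
    num_patches = len(attn[0]) - 1  # drop the [CLS] column itself
    scores = [0.0] * num_patches
    for head in attn:
        cls_row = head[0]  # attention from the [CLS] query to every key
        for j in range(num_patches):
            scores[j] += cls_row[j + 1] / num_heads
    return scores


# Toy input: 2 heads, 1 [CLS] token + 2 patch tokens.
toy_attn = [
    [[0.2, 0.5, 0.3], [0.3, 0.3, 0.4], [0.1, 0.6, 0.3]],
    [[0.4, 0.1, 0.5], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]],
]
patch_scores = cls_attention_over_patches(toy_attn)
```

With real models the attention weights would have to be captured inside the encoder forward pass (for example in clip_encoder.py, as mentioned above) and converted to this layout first.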

Sorry, I actually meant Fig. 4.

KaKa-101, Dec 23 '24 02:12

For Fig. 4, we currently just keep the patches with the top [CLS] attention scores and manually mark each object in a different color so that readers can see the effectiveness of pruning with [CLS] attention. We are considering introducing automatic tools such as SAM to mark the different objects in the near future.

ChimpOnCloud, Dec 23 '24 04:12
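
The filtering step described above (keeping the patches with the top [CLS] attention scores) can be sketched roughly as follows. This is an illustration under assumptions, not the authors' tool: `top_cls_patch_mask`, `keep`, and `grid_w` are hypothetical names, and the patches are assumed to be stored in row-major order on a grid `grid_w` patches wide. The manual per-object coloring is not reproduced.

```python
def top_cls_patch_mask(scores, keep, grid_w):
    """Keep the `keep` patches with the highest [CLS] attention scores.

    scores: flat list of per-patch scores, ASSUMED row-major on the grid.
    grid_w: number of patches per row (hypothetical parameter).
    Returns (sorted kept indices, 2D 0/1 mask laid out like the patch grid).
    """
    # Rank patch indices by score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(order[:keep])
    # Rebuild the grid row by row; 1 marks a retained patch.
    mask = []
    for start in range(0, len(scores), grid_w):
        end = min(start + grid_w, len(scores))
        mask.append([1 if i in kept else 0 for i in range(start, end)])
    return sorted(kept), mask


# Toy 2x2 patch grid, keeping the 2 highest-scoring patches.
kept_idx, keep_mask = top_cls_patch_mask([0.1, 0.4, 0.05, 0.3], keep=2, grid_w=2)
```

The resulting mask can then be upsampled to image resolution and used to grey out the pruned patches when drawing the figure.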

Thanks for your kind reply. Looking forward to the release of your complete visualization tools~

KaKa-101, Dec 25 '24 16:12