About Visualization Tools
Thanks for your great work, which is significant for this area. Could you please provide the visualization tool for Figure 3? I am really interested in the visual results of the retained image patches and want to try it on my own images. Thanks a lot.
Do you mean Fig. 2? Currently we just pick random samples and track their attention maps to get these data. You can get the attention distribution in llava_llama.py and the [CLS] attention distribution of the visual encoder in clip_encoder.py. We will release the complete visualization tool later.
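In case it helps anyone reading along, here is a minimal sketch of turning one layer's attention weights into a [CLS] attention map over the patch grid. It is not the authors' tool. It assumes you have already captured an attention tensor of shape (num_heads, 1 + N, 1 + N) with the [CLS] token at index 0 (how you capture it from clip_encoder.py is up to you), and the function name `cls_attention_map` is hypothetical.

```python
import numpy as np


def cls_attention_map(attn, grid_size):
    """Average the [CLS] -> patch attention over heads and reshape to a grid.

    attn: array of shape (num_heads, 1 + N, 1 + N), N = grid_size ** 2,
          with the [CLS] token assumed to sit at index 0 (an assumption,
          check your encoder's token layout).
    Returns an array of shape (grid_size, grid_size).
    """
    attn = np.asarray(attn, dtype=np.float64)
    num_patches = grid_size * grid_size
    expected = (num_patches + 1, num_patches + 1)
    if attn.ndim != 3 or attn.shape[1:] != expected:
        raise ValueError(
            f"expected shape (heads, {expected[0]}, {expected[1]}), got {attn.shape}"
        )
    # Row 0 is the [CLS] query; columns 1.. are the image patches.
    cls_to_patches = attn[:, 0, 1:]
    # Average over heads to get one score per patch.
    mean_scores = cls_to_patches.mean(axis=0)
    return mean_scores.reshape(grid_size, grid_size)
```

The resulting grid can be passed to any heatmap plotting routine, for example after upsampling it to the input image resolution.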
Sorry, I actually meant Fig. 4.
For Fig. 4, we currently just keep the patches with the top [CLS] attention scores and manually mark each object with a different color, so that readers of the paper can see how effective pruning with [CLS] attention is. We are considering introducing automatic tools such as SAM to mark the different objects in the near future.
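The top-score filtering step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the released tool: it assumes a square patch grid, a per-patch [CLS] attention score vector you have already extracted, and an image whose height and width both equal grid_size * patch_size. The names `topk_patch_mask` and `overlay_mask` are hypothetical, and the manual per-object coloring is not covered.

```python
import numpy as np


def topk_patch_mask(cls_attn, keep, grid_size):
    """Return (sorted kept indices, boolean grid mask) for the top-`keep` patches."""
    scores = np.asarray(cls_attn, dtype=np.float64).reshape(-1)
    if scores.size != grid_size * grid_size:
        raise ValueError("cls_attn must contain grid_size ** 2 scores")
    if not 0 < keep <= scores.size:
        raise ValueError("keep must be in [1, number of patches]")
    # Indices of the `keep` highest-scoring patches.
    kept = np.argsort(scores)[::-1][:keep]
    mask = np.zeros(scores.size, dtype=bool)
    mask[kept] = True
    return np.sort(kept), mask.reshape(grid_size, grid_size)


def overlay_mask(image, mask, patch_size, dim=0.25):
    """Darken pruned patches so the retained ones stand out.

    image: (H, W, 3) array with H == W == mask.shape[0] * patch_size.
    """
    image = np.asarray(image)
    # Expand each grid cell to a patch_size x patch_size block of pixels.
    pixel_mask = np.repeat(np.repeat(mask, patch_size, axis=0), patch_size, axis=1)
    if pixel_mask.shape != image.shape[:2]:
        raise ValueError("image size must equal grid_size * patch_size")
    out = image.astype(np.float32)
    out[~pixel_mask] *= dim  # dim only the pruned regions
    return out.astype(image.dtype)
```

Saving the overlay with any image library then gives a quick view of which patches would be retained for a given `keep` budget.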
Thanks for your kind reply. Looking forward to the release of your complete visualization tools~