AttackVLM

What models are used for img2prompt and LLaVA?

Open kz29 opened this issue 1 year ago • 1 comments

Hello, thank you for the provided code. I read the paper and checked the GitHub repository, but neither gives implementation details for the img2prompt and LLaVA models. Could you please elaborate and share how we can reproduce these models? Thank you in advance.

kz29 avatar Feb 24 '24 09:02 kz29

Bro, did you solve it?

sftsgly avatar Jul 10 '24 16:07 sftsgly

bro, solved it?

yukodada avatar Oct 17 '24 06:10 yukodada

I checked LAVIS and found that img2prompt currently only has the base model type, so I used that model type and wrote some functions myself to solve it.

sftsgly avatar Oct 17 '24 13:10 sftsgly

Thanks for the interest.

  • For llava, please refer to the official installation: https://github.com/haotian-liu/LLaVA?tab=readme-ov-file#install
  • For img2prompt, please refer to the base (standard) type implementation of their code: https://github.com/salesforce/LAVIS/blob/main/lavis/models/img2prompt_models/img2prompt_vqa.py
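
For anyone landing here, a minimal sketch of loading the base img2prompt model through LAVIS might look like the following. It assumes the `salesforce-lavis` package is installed and that the registry name `img2prompt_vqa` with model type `base` matches your LAVIS version. The helper names `img2prompt_kwargs` and `load_img2prompt` are hypothetical and are not part of AttackVLM or LAVIS.

```python
# Sketch only: assumes `salesforce-lavis` is installed and that the
# "img2prompt_vqa" / "base" registry names match your LAVIS version.
IMG2PROMPT_NAME = "img2prompt_vqa"  # assumed model name in the LAVIS registry
IMG2PROMPT_TYPE = "base"            # the only type mentioned in this thread


def img2prompt_kwargs(device="cpu"):
    """Build the keyword arguments passed to the LAVIS loader (hypothetical helper)."""
    return {
        "name": IMG2PROMPT_NAME,
        "model_type": IMG2PROMPT_TYPE,
        "is_eval": True,   # load in evaluation mode
        "device": device,
    }


def load_img2prompt(device="cpu"):
    """Return (model, vis_processors, txt_processors) for img2prompt (hypothetical helper)."""
    # Imported lazily so this file can be imported without LAVIS installed.
    from lavis.models import load_model_and_preprocess

    return load_model_and_preprocess(**img2prompt_kwargs(device))
```

Calling `load_img2prompt("cuda")` would fetch the weights on first use. The captioning and question-generation steps still have to be wired up by hand, following the linked `img2prompt_vqa.py`.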

yunqing-me avatar Nov 24 '24 15:11 yunqing-me

thanks

sftsgly avatar Nov 24 '24 15:11 sftsgly