Finetuning vision encoder part

Open PhanTask opened this issue 2 years ago • 4 comments

feature

Hi, I wonder whether the current code makes it possible to fine-tune both the vision encoder and the projector. Thanks.

PhanTask avatar Aug 02 '23 20:08 PhanTask

I found in the loader file that we have self.vision_tower.requires_grad_(False). Should I just comment this out?

PhanTask avatar Aug 02 '23 20:08 PhanTask

Have you tried fine-tuning the vision model?

gyupro avatar Oct 18 '23 06:10 gyupro

I am also somewhat stuck on how to properly fine-tune the CLIP vision encoder, or even swap it out for something else. Are you still working on this task? Could you please share any updates?

Lord-of-Bugs avatar Feb 21 '24 10:02 Lord-of-Bugs
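
For readers following the thread, here is a minimal sketch of what unfreezing a module frozen with requires_grad_(False) looks like in plain PyTorch. It is not LLaVA's actual code: ToyVisionLanguageModel, the layer sizes, and both learning rates are hypothetical stand-ins chosen for illustration. In the real repository the encoder's forward pass may also run under torch.no_grad(), in which case removing the freeze line alone would not be enough, so that is worth checking before training.

```python
import torch
from torch import nn


class ToyVisionLanguageModel(nn.Module):
    """Hypothetical stand-in for a vision tower plus projector (not LLaVA's classes)."""

    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(8, 8)  # placeholder for the CLIP encoder
        self.projector = nn.Linear(8, 4)     # placeholder for the projector
        # Mirrors the freeze line quoted in the thread.
        self.vision_tower.requires_grad_(False)

    def forward(self, x):
        return self.projector(self.vision_tower(x))


model = ToyVisionLanguageModel()

# Unfreeze the encoder explicitly instead of commenting out the freeze line.
model.vision_tower.requires_grad_(True)

# Separate parameter groups allow a smaller learning rate for the pretrained
# encoder; both values here are assumptions, not recommended settings.
optimizer = torch.optim.AdamW(
    [
        {"params": model.vision_tower.parameters(), "lr": 2e-6},
        {"params": model.projector.parameters(), "lr": 1e-4},
    ]
)

# One dummy training step to confirm that gradients reach the encoder.
loss = model(torch.randn(2, 8)).sum()
loss.backward()
optimizer.step()
```

After the backward pass, model.vision_tower.weight.grad being non-None is a quick way to confirm the encoder is actually receiving gradients.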