LLaVA-NeXT
Any reason for also training vision encoder?
It seems that the training recipe for LLaVA changed with LLaVA-NeXT. Previously, the convention during instruction tuning was to fine-tune only the connector (projector) and the LLM while keeping the vision encoder frozen; now the vision encoder is trained as well. Is there a reason for this change? Does it improve performance, or could it actually hurt?
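
For concreteness, here is a minimal sketch of the two regimes I mean, assuming the Hugging Face `transformers` port of LLaVA-NeXT (the checkpoint name and the parameter-name filtering are just illustrative, not the official training code):

```python
import torch
from transformers import LlavaNextForConditionalGeneration

# Illustrative checkpoint; any LLaVA-NeXT checkpoint would do.
model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-mistral-7b-hf", torch_dtype=torch.bfloat16
)

# Older LLaVA-style instruction tuning: freeze the vision encoder,
# train only the projector (connector) and the LLM.
for name, param in model.named_parameters():
    param.requires_grad = "vision_tower" not in name

# LLaVA-NeXT-style instruction tuning: the vision encoder is trainable too
# (as I understand it, typically with a smaller learning rate than the LLM).
for param in model.parameters():
    param.requires_grad = True
```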