LLaVA-NeXT

Any reason for also training vision encoder?

Open NicoZenith opened this issue 1 year ago • 0 comments

It seems that the training recipe for LLaVA has changed since LLaVA-NeXT. Previously, the convention was to fine-tune only the connector and the LLM during instruction tuning; now the vision encoder is trained as well. Is there a reason for this change? Does it improve performance, or could it hurt?
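To make the two regimes concrete, here is a minimal sketch of what I mean. It only computes which parameters would be updated in each setup. The prefixes `vision_encoder.`, `connector.`, and `llm.`, the helper `trainable_flags`, and the example parameter names are all hypothetical and are not taken from the LLaVA-NeXT code base.

```python
# Hypothetical parameter-name prefixes (assumptions for illustration only).
VISION, CONNECTOR, LLM = "vision_encoder.", "connector.", "llm."


def trainable_flags(param_names, train_vision_encoder):
    """Map each parameter name to True (updated) or False (frozen)."""
    tuned = [CONNECTOR, LLM]  # both regimes tune the connector and the LLM
    if train_vision_encoder:
        tuned.append(VISION)  # the newer regime also tunes the vision encoder
    prefixes = tuple(tuned)
    return {name: name.startswith(prefixes) for name in param_names}


# Made-up parameter names standing in for a real model's parameters.
names = [
    "vision_encoder.blocks.0.weight",
    "connector.proj.weight",
    "llm.layers.0.weight",
]

earlier = trainable_flags(names, train_vision_encoder=False)
newer = trainable_flags(names, train_vision_encoder=True)
```

In a PyTorch model, flags like these would typically be applied by setting `param.requires_grad` while iterating over `model.named_parameters()`. The only difference between the two setups is whether the vision encoder's parameters receive gradient updates.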

NicoZenith, Sep 15 '24 19:09