InternVL
--freeze_backbone False?
Why does the file internvl_chat_v1_2_hermes2_yi34b_448_finetune.sh include --freeze_backbone False? Isn't the visual encoder supposed to be frozen during the pre-training phase?
Hello, this is the fine-tuning script. When we fine-tune, we unfreeze the entire model and train it.
May I ask if you did any ablation on this? Does unfreezing the vision module bring any notable benefits?
Yes, in my experiments, unfreezing the vision encoder was significantly better than freezing it, so in all recent experiments I have kept the vision encoder unfrozen during the fine-tuning phase.
For these hyperparameters, which modules are trained and which modules are frozen at each stage, you can find out on our blog: https://internvl.github.io/blog/
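To make the distinction concrete, here is a minimal sketch of how the flag in question might appear in a launch script. Only `--freeze_backbone` comes from this thread; the entry-point path, `torchrun` invocation, and process count are placeholders, not copied from the repository.

```shell
# Full fine-tune: ViT + MLP + LLM are all trained.
# (path/to/train_script.py is a placeholder for the actual training entry point)
torchrun --nproc_per_node=8 path/to/train_script.py \
  --freeze_backbone False

# Frozen backbone: the ViT vision encoder stays fixed; only the remaining
# modules (e.g. the MLP projector and the LLM) are updated.
torchrun --nproc_per_node=8 path/to/train_script.py \
  --freeze_backbone True
```

Which setting is appropriate depends on the stage: per the discussion above, the backbone is frozen during pre-training but may be unfrozen for fine-tuning.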
Hi @czczup, thank you for the detailed information. However, I noticed that in the latest InternVL2.0 code, such as in this script: internvl2_4b_phi3_3_8b_dynamic_res_2nd_finetune_full.sh, freeze_backbone is set to True.
In my experiments, fine-tuning works well when freeze_backbone is set to True. However, when I set freeze_backbone to False, I encounter errors related to libcudnn_cnn_train.so.8, and I don't believe this is a CUDA or cuDNN installation issue. Could you clarify whether you use different training scripts or settings when fine-tuning the entire model (ViT + MLP + LLM)? Any insights on how to resolve these errors would be greatly appreciated.