Fine-tune InternVL2 using SFTTrainer from Hugging Face
Is fine-tuning InternVL supported by Hugging Face's SFTTrainer?
I get a NotImplementedError when constructing the SFTTrainer as follows (full traceback below):
import torch
from transformers import AutoModel, AutoTokenizer
from trl import SFTTrainer

# training_arguments, train_dataset, collate_fn, peft_config, and args
# are defined elsewhere in sft.py.
model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL2-8B",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "OpenGVLab/InternVL2-8B",
    trust_remote_code=True,
)
trainer = SFTTrainer(
    model=model,
    args=training_arguments,
    train_dataset=train_dataset,
    dataset_text_field="text",  # dummy field
    data_collator=collate_fn,
    max_seq_length=args.max_seq_length,
    tokenizer=tokenizer,
    peft_config=peft_config,
    packing=args.packing,
    dataset_kwargs={"skip_prepare_dataset": True},
)
The issue is probably related to model.language_model.get_input_embeddings() from this issue, but I'm not sure how to fix this here.
Traceback (most recent call last):
  File "/home/lytang/MMCheck/sft.py", line 306, in <module>
    trainer = SFTTrainer(
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 268, in __init__
    model = get_peft_model(model, peft_config)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/peft/mapping.py", line 179, in get_peft_model
    return PeftModel(model, peft_config, adapter_name=adapter_name, autocast_adapter_dtype=autocast_adapter_dtype)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/peft/peft_model.py", line 164, in __init__
    model = self._prepare_model_for_gradient_checkpointing(model)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/peft/peft_model.py", line 613, in _prepare_model_for_gradient_checkpointing
    model.enable_input_require_grads()
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1761, in enable_input_require_grads
    self._require_grads_hook = self.get_input_embeddings().register_forward_hook(make_inputs_require_grads)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1780, in get_input_embeddings
    raise NotImplementedError
NotImplementedError
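The traceback shows enable_input_require_grads() calling get_input_embeddings() on the top-level model, which raises NotImplementedError. One possible workaround, sketched below and not tested against InternVL2 itself, is to make the top-level method forward to model.language_model.get_input_embeddings(). The helper name delegate_input_embeddings and the two toy classes are hypothetical stand-ins used only to show the pattern without downloading the model; other incompatibilities with SFTTrainer may remain.

```python
import types


def delegate_input_embeddings(model):
    """Bind a get_input_embeddings() on `model` that forwards to model.language_model.

    Assumes `model.language_model` exists and implements get_input_embeddings().
    """
    def get_input_embeddings(self):
        return self.language_model.get_input_embeddings()

    # The instance attribute shadows the class-level method that raises.
    model.get_input_embeddings = types.MethodType(get_input_embeddings, model)
    return model


# Hypothetical stand-ins that mimic only the relevant structure.
class ToyLanguageModel:
    def __init__(self):
        self.embed_tokens = object()  # placeholder for an embedding layer

    def get_input_embeddings(self):
        return self.embed_tokens


class ToyWrapper:
    def __init__(self):
        self.language_model = ToyLanguageModel()

    def get_input_embeddings(self):
        raise NotImplementedError


toy = delegate_input_embeddings(ToyWrapper())
patched_ok = toy.get_input_embeddings() is toy.language_model.embed_tokens
```

With the real model, the equivalent call would be delegate_input_embeddings(model), placed before SFTTrainer(...) is constructed.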
Hello, we are using the Trainer from the transformers library for fine-tuning in our code. Could you please explain why the SFTTrainer is being used instead?
https://github.com/OpenGVLab/InternVL/blob/main/internvl_chat/internvl/train/internvl_chat_finetune.py#L816
I see. I was using SFTTrainer since I previously trained LLMs with it, so I'm wondering how to minimally change the code to enable SFTTrainer for InternVL.
Another thing I found is that InternVL does not implement apply_chat_template, which makes it hard to integrate into a code base where the other models do define apply_chat_template. Are you planning to add the apply_chat_template function to InternVL? Thanks!
Thank you for your suggestions! We will try to implement apply_chat_template as soon as possible.