
Fine-tune InternVL2 using SFTTrainer from Hugging Face

Open Liyan06 opened this issue 1 year ago • 3 comments

Is fine-tuning InternVL supported by the Hugging Face SFTTrainer?

I got the following error when using the SFTTrainer:


import torch
from transformers import AutoModel, AutoTokenizer
from trl import SFTTrainer

# training_arguments, train_dataset, collate_fn, peft_config and args
# are defined elsewhere in the script and are not shown here.
model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL2-8B",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(
    "OpenGVLab/InternVL2-8B",
    trust_remote_code=True,
)

trainer = SFTTrainer(
    model=model,
    args=training_arguments,
    train_dataset=train_dataset,
    dataset_text_field="text",  # dummy field
    data_collator=collate_fn,
    max_seq_length=args.max_seq_length,
    tokenizer=tokenizer,
    peft_config=peft_config,
    packing=args.packing,
    dataset_kwargs={"skip_prepare_dataset": True},
)

The error is probably related to model.language_model.get_input_embeddings(), as discussed in this issue, but I'm not sure how to fix it here.

Traceback (most recent call last):
  File "/home/lytang/MMCheck/sft.py", line 306, in <module>
    trainer = SFTTrainer(
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 268, in __init__
    model = get_peft_model(model, peft_config)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/peft/mapping.py", line 179, in get_peft_model
    return PeftModel(model, peft_config, adapter_name=adapter_name, autocast_adapter_dtype=autocast_adapter_dtype)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/peft/peft_model.py", line 164, in __init__
    model = self._prepare_model_for_gradient_checkpointing(model)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/peft/peft_model.py", line 613, in _prepare_model_for_gradient_checkpointing
    model.enable_input_require_grads()
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1761, in enable_input_require_grads
    self._require_grads_hook = self.get_input_embeddings().register_forward_hook(make_inputs_require_grads)
  File "/data/users/lytang/miniconda3/envs/MM/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1780, in get_input_embeddings
    raise NotImplementedError
NotImplementedError
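
One possible workaround, sketched here under assumptions rather than confirmed anywhere in this thread: the traceback shows enable_input_require_grads() calling get_input_embeddings() on the top-level model, which raises NotImplementedError. If the wrapper exposes the LLM as model.language_model (as the note above suggests), the call could be forwarded to it. The helper name delegate_input_embeddings and the two Toy classes are hypothetical stand-ins so the snippet runs without downloading the model; whether this alone is enough for SFTTrainer to train InternVL end to end is untested.

```python
class ToyLanguageModel:
    """Hypothetical stand-in for the wrapped LLM."""

    def __init__(self):
        # Placeholder for the real embedding layer (e.g. an nn.Embedding).
        self.embed_tokens = object()

    def get_input_embeddings(self):
        return self.embed_tokens


class ToyWrapper:
    """Hypothetical stand-in for a wrapper exposing the LLM as `language_model`."""

    def __init__(self):
        self.language_model = ToyLanguageModel()

    def get_input_embeddings(self):
        # Mirrors the behaviour seen at the bottom of the traceback.
        raise NotImplementedError


def delegate_input_embeddings(model):
    """Make model.get_input_embeddings() forward to model.language_model."""

    def get_input_embeddings(self):
        return self.language_model.get_input_embeddings()

    # Patch the class so library code calling self.get_input_embeddings()
    # resolves to the forwarding version.
    type(model).get_input_embeddings = get_input_embeddings
    return model


toy = delegate_input_embeddings(ToyWrapper())
assert toy.get_input_embeddings() is toy.language_model.embed_tokens
```

With the real model, the assumed usage would be to call delegate_input_embeddings(model) after from_pretrained(...) and before constructing SFTTrainer, so that the patched method is in place when PEFT prepares the model for gradient checkpointing.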

Liyan06, Sep 04 '24 21:09

Hello, we are using the Trainer from the transformers library for fine-tuning in our code. Could you please explain why the SFTTrainer is being used instead?

https://github.com/OpenGVLab/InternVL/blob/main/internvl_chat/internvl/train/internvl_chat_finetune.py#L816

czczup, Sep 06 '24 07:09

I see. I was using SFTTrainer since I previously trained LLMs with it, so I'm wondering how to minimally change the code to enable SFTTrainer for InternVL.

Another thing I found is that InternVL has no apply_chat_template implementation, which makes it hard to integrate into a code base where the other models all define apply_chat_template. Are you planning to add apply_chat_template to InternVL? Thanks!

Liyan06, Sep 06 '24 16:09

Thank you for your suggestions! We will try to implement apply_chat_template as soon as possible.

Weiyun1025, Sep 10 '24 16:09