[Feature] V2PE

Open gejq21 opened this issue 8 months ago • 0 comments

Motivation

Support V2PE in pre-training, fine-tuning, and inference for InternVL.

Modification

V2PE utils

Added the file internvl_chat/internvl/v2pe_utils.py.
It includes the get_rope_pos_id function, which calculates position ids required for V2PE,
and a V2PE module that can replace the conventional RoPE mechanism.
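
For readers unfamiliar with V2PE, the position-id computation can be sketched as below. This is a minimal illustration and not the code in v2pe_utils.py: the function name sketch_v2pe_position_ids, the boolean is_visual mask, and the visual_stride parameter are all hypothetical. It assumes the usual reading of V2PE, where each text token advances the position by 1 and each visual token advances it by a smaller fractional stride.

```python
from typing import List


def sketch_v2pe_position_ids(is_visual: List[bool], visual_stride: float = 0.25) -> List[float]:
    """Hypothetical sketch: text tokens advance the position by 1.0,
    visual tokens by `visual_stride` (assumed to be <= 1.0)."""
    position_ids: List[float] = []
    current = 0.0
    for index, visual in enumerate(is_visual):
        # The first token always sits at position 0; every later token
        # advances by an amount that depends on its own modality.
        if index > 0:
            current += visual_stride if visual else 1.0
        position_ids.append(current)
    return position_ids


# Two text tokens, four visual tokens, one text token.
mask = [False, False, True, True, True, True, False]
print(sketch_v2pe_position_ids(mask))
# -> [0.0, 1.0, 1.25, 1.5, 1.75, 2.0, 3.0]
```

With a stride of 0.25, the four visual tokens together consume only one unit of positional range, which is why the resulting position ids are floats rather than integers.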

Pre-training and Fine-tuning

Modified the following files to support V2PE in pre-training and fine-tuning:

  • internvl_chat/internvl/train/internvl_chat_pretrain.py
  • internvl_chat/internvl/train/internvl_chat_finetune.py

Specifically:

  • Extended ModelArguments and LazySupervisedDataset to support V2PE.
  • Updated concat_pad_data_collator in internvl_chat/internvl/patch/pad_data_collator.py
    to support passing position ids as lists (to preserve float precision when using V2PE).
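
To illustrate why position ids are kept as lists, here is a small sketch of padding per-sample float position ids without converting them to another dtype. It is not the actual concat_pad_data_collator: the helper pad_position_ids, the "position_ids" feature key, and the pad value are assumptions made for this example only.

```python
from typing import Dict, List


def pad_position_ids(features: List[Dict[str, List[float]]], pad_value: float = 0.0) -> List[List[float]]:
    """Hypothetical helper: right-pad each sample's float position ids to
    the batch maximum, keeping them as plain Python floats."""
    max_len = max(len(f["position_ids"]) for f in features)
    return [
        list(f["position_ids"]) + [pad_value] * (max_len - len(f["position_ids"]))
        for f in features
    ]


batch = [
    {"position_ids": [0.0, 1.0, 1.25, 1.5]},
    {"position_ids": [0.0, 1.0]},
]
print(pad_position_ids(batch))

# Casting fractional positions to integers would collapse distinct tokens
# onto the same position, which is the kind of loss the lists avoid.
print([int(p) for p in [1.0, 1.25, 1.5]])
```

Keeping the values as lists defers the choice of tensor dtype to the model side, so the fractional strides are not rounded or truncated during collation.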

Model

InternVLChat

  • Modified internvl_chat/internvl/model/internvl_chat/configuration_internvl_chat.py
    to support passing V2PE-related arguments via config.
  • Updated InternVLChatModel.forward() and InternVLChatModel.chat()
    in internvl_chat/internvl/model/internvl_chat/modeling_internvl_chat.py
    to support V2PE usage during both training and inference.

InternLM2

  • Modified internvl_chat/internvl/model/internlm2/configuration_internlm2.py
    to support V2PE-related config arguments.
  • Updated InternLM2Attention._init_rope(), InternLM2Attention.forward(),
    and InternLM2FlashAttention2.forward() in
    internvl_chat/internvl/model/internlm2/modeling_internlm2.py
    to support replacing RoPE with the V2PE module.
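
As a rough illustration of why rotary embeddings can accept fractional position ids at all, the sketch below applies a standard rotate-half RoPE rotation at float positions. It is not the V2PE module from this change: rope_angles and apply_rope are hypothetical names, and the base of 10000 is simply the common RoPE default.

```python
import math
from typing import List


def rope_angles(position: float, head_dim: int, base: float = 10000.0) -> List[float]:
    """Rotation angle for each of the head_dim // 2 frequency pairs."""
    return [position / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]


def apply_rope(vec: List[float], position: float, base: float = 10000.0) -> List[float]:
    """Rotate `vec` (even length) by the angles of a possibly fractional position."""
    half = len(vec) // 2
    out = [0.0] * len(vec)
    for i, theta in enumerate(rope_angles(position, len(vec), base)):
        x1, x2 = vec[i], vec[i + half]
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x1 * c - x2 * s
        out[i + half] = x1 * s + x2 * c
    return out


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]
# Both pairs of positions are 0.75 apart, so the attention scores match.
print(dot(apply_rope(q, 1.25), apply_rope(k, 0.5)))
print(dot(apply_rope(q, 2.75), apply_rope(k, 2.0)))
```

Because the rotation angle is linear in the position, the query-key score still depends only on the position difference, whether or not the positions are integers.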

gejq21, Apr 20 '25 04:04