Cannot load Hugging Face InternVL3.5 with flash_attn
Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
- [x] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
When I tried to load internvl3.5 using transformers:
import math
import torch
from transformers import AutoTokenizer, AutoModel

path = "OpenGVLab/InternVL3_5-8B"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    use_flash_attn=True,
    trust_remote_code=True,
    device_map="auto").eval()
I got an error caused by the use_flash_attn=True argument: TypeError: InternVLForConditionalGeneration.__init__() got an unexpected keyword argument 'use_flash_attn'
Reproduction
import math
import torch
from transformers import AutoTokenizer, AutoModel

path = "OpenGVLab/InternVL3_5-8B"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    use_flash_attn=True,
    trust_remote_code=True,
    device_map="auto").eval()
Environment
I am not using lmdeploy. I am using transformers==4.56.2 and flash_attn==2.8.3.
Error traceback
This may be because you are using the HF-format weights, which are loaded through modeling_internvl.py from the transformers library itself. That implementation is maintained by the community and does not yet support the use_flash_attn parameter. If you need use_flash_attn, please use the custom (remote-code) version of the weights instead.
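For reference, a minimal sketch of a possible workaround with the HF-format weights: transformers exposes a generic attn_implementation argument on from_pretrained that can request its own flash-attention backend, instead of the model-specific use_flash_attn flag. Whether "flash_attention_2" is actually supported for this particular checkpoint is an assumption not confirmed in this issue, so treat this as a sketch rather than a verified fix.

import torch
from transformers import AutoModel

path = "OpenGVLab/InternVL3_5-8B"

# Request transformers' native flash-attention backend (requires flash-attn
# to be installed); this replaces the custom use_flash_attn kwarg.
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    attn_implementation="flash_attention_2",  # assumption: supported for this checkpoint
    device_map="auto",
).eval()

Note that trust_remote_code is omitted here on the assumption that the HF-format weights are served by the built-in InternVLForConditionalGeneration class rather than custom remote code.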