InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance

Results 461 InternVL issues

When I use the script from Hugging Face (https://huggingface.co/OpenGVLab/InternVL2-26B) to test video data, it gives me the error: `TypeError: chat() got an unexpected keyword argument 'num_patches_list'`

Hi, I'm confused. I did some visual question answering with the InternVL2-26B model and it performs very badly. The only models that pass that question are Gemini 1.5 pro/flash,...

When I tried to execute `tokenizer = AutoTokenizer.from_pretrained("/root/app/CustomLLM/InternVL-Chat-V1-5/", trust_remote_code=True)`, it said:

> File "/usr/local/python38/lib/python3.8/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
> return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
> RuntimeError: Internal: could not parse ModelProto from...

Hello, I tried using the multi-image conversation as outlined at https://github.com/OpenGVLab/InternVL/blob/764fdc9f3ee102bc6c2def02c2d0ca1e94336d06/README.md?plain=1#L627-L634. With the two-image example, I am able to reproduce the results seen at https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5#model-usage. However, when I try...

1: I ran the command `lmdeploy serve api_server /root/autodl-tmp/InternVL-Chat-V1-5 --server-port 8080`:

(internvl-deploy) root@autodl-container-b2b911ba00-2d4424e7:~# lmdeploy serve api_server /root/autodl-tmp/InternVL-Chat-V1-5 --server-port 8080
FlashAttention is not installed.
Special tokens have been added in the vocabulary, make sure...

Please add `config.attn_config['attn_impl'] = 'triton'` for Triton Flash Attention inference.

```
import torch
from PIL import Image
from transformers import AutoModel, AutoConfig, CLIPImageProcessor

# Define the model name
model_name =...
```

![Screenshot 2024-05-29 211054](https://github.com/OpenGVLab/InternVL/assets/150890698/9298f324-bbae-4f60-b188-90744656bc1b)

Hi everyone, This is a **Common Issue Summary** where I will compile the frequently encountered issues. If you notice any omissions, please feel free to help add to the list....

Add support for the gemma2 language model in the multimodal model fusion process. Fine-tuning has completed successfully on 8×H100; I hope this is a useful contribution.

### Motivation

Does the model not support Tools Calling?

### Related resources

_No response_

### Additional context

_No response_