
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Results: 461 InternVL issues

I tried to load the internvl-chat-v1.5-int8 quantized model with LocalAI's AutoGPTQ backend, using the sample inference code from the InternVL README. Loading the model fails with: ``` could not load model (no success): Unexpected err=TypeError(\"internvl_chat isn't supported yet.\") ``` Looking at the model files on HF, `internvl_chat` appears to be defined in `config.json`, like this: ``` "model_type": "internvl_chat", ``` My local pip dependencies are: - transformers: 4.40.1 - torch: 2.1.2 -...
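The `internvl_chat` model type is custom code shipped inside the checkpoint, so backends that only recognize the model types registered in transformers (such as AutoGPTQ inside LocalAI) will refuse it. Below is a minimal sketch of loading the checkpoint directly with transformers instead; the repo id is an assumption, so substitute your local path or the actual HF repo of the int8 checkpoint.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative repo id; replace with your local path or the actual HF repo.
path = "OpenGVLab/InternVL-Chat-V1-5-Int8"

# trust_remote_code=True lets transformers resolve the custom `internvl_chat`
# model class that ships with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval().cuda()
```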

I only have 4 x 4090 cards, but I cannot run internvl-v1-5: the program loads the whole model onto the same card, resulting in OOM. How could I run...
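A common workaround, sketched below, is to let Accelerate shard the weights across all visible GPUs via `device_map` instead of placing everything on cuda:0; the repo id and the per-card memory cap (chosen to leave headroom for activations on 24 GB cards) are illustrative assumptions.

```python
import torch
from transformers import AutoModel

path = "OpenGVLab/InternVL-Chat-V1-5"  # illustrative repo id

# device_map="auto" shards the checkpoint across all visible GPUs;
# max_memory caps each card so activations still fit on 24 GB devices.
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map="auto",
    max_memory={i: "20GiB" for i in range(4)},
).eval()
```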

Great work with this. The OCR on Latin text is really good. Can you propose how to run this faster? I am already running it on an A100 79GB using transformers,...
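One possible route to higher throughput than plain transformers is serving the model with an inference engine. The sketch below assumes LMDeploy's vision-language pipeline supports this checkpoint; the repo id, prompt, and image URL are placeholders, so check the engine's documentation before relying on it.

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Illustrative repo id; pick the checkpoint you actually serve.
pipe = pipeline("OpenGVLab/InternVL-Chat-V1-5")

image = load_image("https://example.com/receipt.jpg")  # placeholder URL
response = pipe(("Transcribe the text in this image.", image))
print(response.text)
```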

Hi, thanks for sharing this great work. Could you release the InternVL1.5 pretrain & finetune code/scripts?

I have already learned the hardware requirements from the issues. ![image](https://github.com/OpenGVLab/InternVL/assets/31176427/59ea78ef-0e2b-4b83-87da-788b7382f186) ![image](https://github.com/OpenGVLab/InternVL/assets/31176427/8f009c79-3d0f-4204-9b07-c1d7e15ffdc9) I would also like to know how long finetuning the full LLM and LoRA finetuning each take.

I have two questions: 1. In https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B/blob/main/pytorch_model.bin.index.json, the weight `model.vision_tower.vision_tower.embeddings.position_embedding` has shape `1x577x32000`, but the corresponding weight in `InternViT-6B-224px` has shape `1x257x32000`. I see that `resize_pos_embeddings` was removed in this commit: https://github.com/OpenGVLab/InternVL/commit/c82d6ce30f512b33c58615088233a263112ae727. Following the latest training code, this size should not become 577; has the model on HF simply not been updated? (Or was the HF model trained with tune_vit_pos_embedding? That flag now seems to be False everywhere.) 2. Following LLaVA's loading logic (load_pretrained_model), since LlavaMetaModel loads the vision_tower with delay_load=True, the vision_tower is loaded last and the vision_tower-related weights inside the LLM checkpoint are ignored. If tune_vit_pos_embedding were enabled, this logic would be problematic, wouldn't it?
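For context on the shapes above: 257 tokens corresponds to a 16x16 patch grid plus the class token at 224 px, and 577 to a 24x24 grid plus the class token. Below is a generic sketch of how such a positional embedding is usually interpolated; it is not the repository's removed `resize_pos_embeddings` implementation.

```python
import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed: torch.Tensor, new_grid: int) -> torch.Tensor:
    """Resize a ViT positional embedding of shape (1, 1 + g*g, dim) to a new grid.

    Generic sketch only; the removed resize_pos_embeddings in InternVL may differ.
    """
    cls_token, patch_pos = pos_embed[:, :1], pos_embed[:, 1:]
    dim = pos_embed.shape[-1]
    old_grid = int(patch_pos.shape[1] ** 0.5)

    # Reshape to (1, dim, g, g), bicubic-resize the grid, then flatten back.
    patch_pos = patch_pos.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_pos = F.interpolate(patch_pos, size=(new_grid, new_grid),
                              mode="bicubic", align_corners=False)
    patch_pos = patch_pos.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_token, patch_pos], dim=1)

# e.g. a 1x257xC embedding (16x16 grid + cls) -> 1x577xC (24x24 grid + cls)
```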

When I try to continue finetuning InternVL Chat, an error occurs: ``` torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 280.00 MiB ``` I already set the batch size to 1 by...
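A few generic memory-saving knobs for continued finetuning are sketched below using transformers' `TrainingArguments`; the InternVL training scripts may expose different flags, and the output directory and DeepSpeed config path are placeholders.

```python
from transformers import TrainingArguments

# Generic memory-saving settings; names follow transformers' Trainer and may
# not map 1:1 onto the InternVL finetuning scripts.
args = TrainingArguments(
    output_dir="./internvl_chat_ft",      # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,       # recover the effective batch size
    gradient_checkpointing=True,          # trade compute for activation memory
    bf16=True,
    deepspeed="zero_stage3_config.json",  # optional: shard optimizer/params across GPUs
)
```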

Hi! Thanks for the great work. I am working on multimodal Vision-QA tasks using the internvl_chat_llava model. When evaluating the internvl_chat_llava model, an error occurs: ``` Traceback (most recent call...