InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Hi team, thanks for the amazing work! I got the following error when I tried to finetune InternVL-Chat-1.5: [rank0]: Original Traceback (most recent call last): [rank0]: File "/opt/conda/envs/internvl_train2/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 83,...
During inference, when I concatenate 1-2 images as input, the conversation works normally on 8 V100-32G GPUs (device_map="auto"). When I concatenate 3 or more images (for example, 10 images) as input, I get this error: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.14 GiB. GPU 0 has a total capacity of 31.75 GiB of which 8.85 GiB is free. Process 18099 has 22.89...
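I followed the example code for InternViT-6B-448px-V1.5 and found that, after image preprocessing, an image whose aspect ratio is not 1 was still center-cropped. How can dynamic resolution be supported?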
I am very impressed with the great work you have done. I have a Mac M2 Ultra, so I can't run the model locally because it requires CUDA,...
Hello, I tried to fine-tune this model on my custom dataset and got this warning. Also, during training the loss at some steps is 0. I have 2 questions: - what...
Why are my local inference results inconsistent with those of the provided demo? What are the exact parameter settings used by the demo?
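I would like to use InternVL 1.5 for video question answering. What approach can I take? I have tried extracting frames from the video and concatenating the extracted images as input, but concatenating too many images greatly increases the input length, so GPU memory usage explodes during inference. Given this, is there a good way to use InternVL 1.5 for question answering on video input?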
Is it possible to fine-tune based on the Mini-InternVL-Chat-2B-V1.5 model?