InternVL issues

Question: was UHD-style image tiling used during model pretraining?

6

As the title says. If pretraining already splits each image into that many tiles, wouldn't the training cost be hard to cover?
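A rough way to reason about the cost question above is to count visual tokens per image as a function of the tile budget. The sketch below is only an illustration and not the project's actual preprocessing: the tile size (448), the tokens per tile (256), the extra thumbnail tile, and the helper names `choose_grid` and `visual_tokens` are all assumptions made for the example.

```python
# All constants below are assumptions for illustration, not values confirmed in this thread.
TILE_SIZE = 448          # assumed tile side length in pixels (not used in the arithmetic)
TOKENS_PER_TILE = 256    # assumed number of visual tokens produced per tile


def choose_grid(width, height, max_tiles):
    """Pick the (cols, rows) grid whose aspect ratio is closest to the image's.

    Hypothetical helper: ties are broken in favour of fewer tiles, and the
    absolute image resolution is ignored to keep the sketch short.
    """
    image_ratio = width / height
    candidates = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if cols * rows <= max_tiles
    ]
    return min(
        candidates,
        key=lambda g: (abs(image_ratio - g[0] / g[1]), g[0] * g[1]),
    )


def visual_tokens(width, height, max_tiles, add_thumbnail=True):
    """Estimate visual tokens for one image under the assumptions above."""
    cols, rows = choose_grid(width, height, max_tiles)
    tiles = cols * rows
    # Assumed: one extra downscaled overview tile when the image is split.
    if add_thumbnail and tiles > 1:
        tiles += 1
    return tiles * TOKENS_PER_TILE


if __name__ == "__main__":
    # Compare the token count of a wide image under different tile budgets.
    for budget in (1, 6, 12):
        print(budget, choose_grid(1792, 448, budget), visual_tokens(1792, 448, budget))
```

Under these assumptions the per-image sequence length grows linearly with the number of tiles, which is why the tile budget used in pretraining matters for cost.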

Image transformation for InternVL-1.5

1

I found that the load_image function in the example on the Hugging Face repo (transformers-based) includes an image transformation step, but the gradio_web_server does not do any image processing...

ruifengma

Can swft fine-tuning be done in InternVL-Chat-V1.5-Int8 version?

1

wangdong1992

--freeze_backbone False?

2

Why does the file internvl_chat_v1_2_hermes2_yi34b_448_finetune.sh include --freeze_backbone False? Isn't the visual encoder supposed to be frozen during the pre-training phase?

fyting

I did everything needed to deploy a demo on Gradio, but the server cannot load the model

4

chengfengke

ImportError: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory

2

Please help with this, thank you. ImportError: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory. Python 3.8.13 (default, Oct 21 2022, 23:50:54) [GCC 11.2.0] :: Anaconda, Inc. on...

2390968687

3× 4090 GPU OOM

3

Why do 3× 4090 GPUs still run out of memory (24 × 3 > 52 GB)? 0 NVIDIA GeForce RTX 4090 Off | 00000000:31:00.0 Off | Off | | 66% 24C P8 22W / 450W | 42MiB...

orderer0001
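The arithmetic in the question above (24 × 3 > 52 GB) only accounts for the weights. A minimal sketch of the remaining headroom per GPU follows. The parameter count (26e9) is an assumption inferred from the 52 GB figure at 2 bytes per parameter, the helper names are hypothetical, and the sketch assumes a perfectly even split across GPUs, which real sharding may not achieve.

```python
# Bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Memory taken by the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def headroom_per_gpu_gb(num_params, dtype, gpu_gb, num_gpus):
    """Memory left per GPU after the weights, assuming a perfectly even split.

    Whatever is left must still hold the CUDA context, activations, the
    KV cache, and any imbalance in how layers are placed on the devices.
    """
    total_gb = gpu_gb * num_gpus
    return (total_gb - weight_memory_gb(num_params, dtype)) / num_gpus


if __name__ == "__main__":
    params = 26e9  # assumed, inferred from "52 GB" at 2 bytes per parameter
    for dtype in ("bf16", "int8"):
        print(dtype, weight_memory_gb(params, dtype),
              headroom_per_gpu_gb(params, dtype, gpu_gb=24, num_gpus=3))
```

Under these assumptions the bf16 weights leave well under 7 GB per card, so an uneven layer placement or a long multi-tile input could plausibly exhaust one GPU even though the total capacity exceeds 52 GB.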

Bug in multi-image conversation (Only Supports Single-Image Conversation)

Thanks for your great work! I followed your tutorial at https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5-Int8 and found that the model only supports single-image conversation. I use the Int8 model. For example, I...

BeiningWu

Strange loss curve when `num_train_epoch`>1

1

Thank you for releasing this wonderful work and for continuing to update the training and fine-tuning scripts! Recently I tried to fine-tune InternVL-V1.5 on a custom dataset, and I...

ChorlingLau

Inaccurate recognition of multi-line fields such as the address on ID cards

1

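If the last line contains only two characters or very few characters, it gets treated as a field name, so the address in the returned JSON is missing part of its content. Is this a problem with the prompt or with the model? I don't know how to optimize it; any guidance would be appreciated. ![image](https://github.com/OpenGVLab/InternVL/assets/74588507/e697b598-bd04-4931-8499-5f289408a9ca)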

w7team
