Zhe Chen

316 comments by Zhe Chen

Is there any error message when starting the model worker?

Hello, thank you for your attention. You can now deploy the InternVL2 model following this document: [https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html](https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html)

The automatic allocation produced by transformers' `device_map='auto'` may not be well balanced. In that case, you can assign modules to GPUs manually to maximize memory utilization, for example: ```python device_map...
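As a rough sketch of such a manual allocation (the module names and layer counts below are assumptions based on the usual InternVL2 layout; check them against your checkpoint's `config` before use):

```python
# Hypothetical sketch: spread an InternVL2 model over several GPUs by hand.
# Keep the vision encoder, projector, and embeddings on GPU 0, and split the
# LLM decoder layers evenly across the available GPUs.

def split_model(num_llm_layers, num_gpus=2):
    """Build a device_map dict assigning each module to a GPU index."""
    device_map = {
        "vision_model": 0,                      # vision encoder on GPU 0
        "mlp1": 0,                              # vision-to-LLM projector
        "language_model.model.embed_tokens": 0, # input embeddings
        "language_model.model.norm": num_gpus - 1,
        "language_model.lm_head": num_gpus - 1, # output head on the last GPU
    }
    # Spread decoder layers evenly: ceil(num_llm_layers / num_gpus) per GPU.
    layers_per_gpu = (num_llm_layers + num_gpus - 1) // num_gpus
    for i in range(num_llm_layers):
        device_map[f"language_model.model.layers.{i}"] = i // layers_per_gpu
    return device_map

# Usage: pass the dict instead of 'auto', e.g.
# model = AutoModel.from_pretrained(path, device_map=split_model(32), ...)
```

You can also skew the split (e.g. fewer layers on GPU 0, which already holds the vision encoder and the KV cache for the image tokens) if one card runs out of memory first.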

As far as I know, this phenomenon is very common when training large models. To some extent, it reflects the model overfitting the training set.

> Would two 4090s be enough? I have a 3090, thinking on getting another one.

Hello, thank you for your attention. You can now deploy the InternVL2 model following this...

Because there is very little ID-card data in the training set, the model does not handle ID cards well yet.

The maximum context window during training is 4096; at inference time it can be extended to 10k, which we have tested without problems. On the demo, you can control the output length by adjusting Max output tokens:
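Programmatically, the same knob corresponds to the generation config passed to the model's `chat()` interface (names as in the public README; treat them as assumptions for your exact version):

```python
# Hedged sketch: cap the response length when calling InternVL's chat API.
# max_new_tokens limits only the generated output, not the input context.
generation_config = dict(max_new_tokens=1024, do_sample=False)

# response = model.chat(tokenizer, pixel_values, question, generation_config)
```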

I don't think pre-training from scratch is necessary: a model trained at 4k can be extended directly to 8k-10k without major problems. If you want to go to even longer lengths, you may need an additional fine-tuning pass on long data. You can also try our recently released [Mini-InternVL-Chat-2B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-2B-V1-5) and [Mini-InternVL-Chat-4B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-4B-V1-5); both were SFT'd at 8k length.

The 4B model's problem comes from the Phi3 language model itself: Phi3's vocabulary is too small, so its Chinese support is very poor. At this point it looks unfixable, and we will avoid using Phi3 to train models in the future.