Zhe Chen
Is there any error message when starting the model worker?
Hello, thank you for your attention. You can now deploy the InternVL2 model following this document: [https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html](https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html)
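For reference, a minimal sketch of the kind of usage the linked guide describes with lmdeploy's pipeline API; the checkpoint name, `session_len`, and image URL here are illustrative placeholders:

```python
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

# Build an inference pipeline for an InternVL2 checkpoint (placeholder name).
pipe = pipeline('OpenGVLab/InternVL2-8B',
                backend_config=TurbomindEngineConfig(session_len=8192))

# Load an image and run a single image-text query.
image = load_image('https://example.com/sample.jpg')
response = pipe(('describe this image', image))
print(response.text)
```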
Hi, has this problem been solved?
The automatic allocation produced by transformers' `device_map='auto'` is not always sensible; in that case you can try assigning modules to GPUs manually to make full use of the memory on each card.
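For example, here is a minimal sketch of a manual two-GPU split. The layer count and module names are assumptions that depend on the checkpoint and its LLM backbone; check the model's `config.json` and adjust accordingly:

```python
import math
import torch
from transformers import AutoModel

path = 'OpenGVLab/InternVL2-8B'   # placeholder checkpoint
num_layers, num_gpus = 32, 2      # assumed layer count; check config.json

# Spread the LLM layers evenly across the GPUs.
device_map = {}
per_gpu = math.ceil(num_layers / num_gpus)
for i in range(num_layers):
    device_map[f'language_model.model.layers.{i}'] = i // per_gpu

# Keep the vision encoder and the embedding/norm/output modules on GPU 0
# so their tensors stay on one device (names assume an InternLM2 backbone).
for name in ('vision_model', 'mlp1',
             'language_model.model.tok_embeddings',
             'language_model.model.norm',
             'language_model.output'):
    device_map[name] = 0

model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    device_map=device_map,
    trust_remote_code=True,
).eval()
```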
As far as I know, this phenomenon is very common when training large models; to some extent it reflects the model overfitting to the training set.
> Would two 4090s be enough? I have a 3090, thinking on getting another one.

Hello, thank you for your attention. You can now deploy the InternVL2 model following this document: [https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html](https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html)
Because there is very little ID-card data in the training set, the model does not handle these cases well enough yet.
The maximum training window is 4096; at inference time it can be extended to 10k, and we have tested that this works. In the demo, you can control this by adjusting Max output tokens.
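When calling the model directly rather than through the demo, the same cap can be set via the generation config passed to `model.chat`; this mirrors the model card's usage, with `pixel_values` and `question` assumed to be prepared as shown there:

```python
# "Max output tokens" in the demo corresponds to max_new_tokens here.
generation_config = dict(max_new_tokens=1024, do_sample=False)
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(response)
```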
I don't think pretraining from scratch is necessary; a model trained at 4k can be extended directly to 8k-10k without major problems. If you want to extend to even longer lengths, you may need to fine-tune again on long data. You can also try our recently released [Mini-InternVL-Chat-2B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-2B-V1-5) and [Mini-InternVL-Chat-4B-V1-5](https://huggingface.co/OpenGVLab/Mini-InternVL-Chat-4B-V1-5), both of which were SFT'd at 8k length.
This problem with the 4B model is an issue of the Phi3 language model itself: Phi3's vocabulary is too small, so its Chinese support is very poor. At present it looks unfixable, and we will avoid using Phi3 to train models in the future.