Zhe Chen

316 comments by Zhe Chen

> May I ask when this part of the code will be open-sourced?

Hello, thank you for your interest. We currently have no plans to open-source the code for training the ViT from scratch.

Hello, the pre-training weights from the first stage are essentially the MLP projector, and we will release them shortly. Additionally, the data format for our pre-training is consistent with the...

Hello, we are planning to release some pre-trained OCR data, but the dataset is quite large, consisting of tens of millions of entries, so it will take some time to...

This bug looks like it can be fixed by adjusting the device of the weights.
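A minimal sketch of the kind of fix meant here, assuming a PyTorch device-mismatch error between an input tensor and a weight; the function and variable names below are illustrative, not from the actual codebase:

```python
import torch

# Sketch: align the weight's device with the input's device before use,
# which avoids the "expected all tensors on the same device" RuntimeError.
def linear_forward(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    if weight.device != x.device:
        weight = weight.to(x.device)  # move the weight to match the input
    return x @ weight.T

x = torch.randn(2, 4)
w = torch.randn(3, 4)
y = linear_forward(x, w)  # shape (2, 3)
```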

Thanks for the question. The ViT used in InternVL2 is still the ViT from InternVL1.5, namely: https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5 We did not do any additional incremental pre-training of the ViT for the InternVL2 release.

Thank you for your suggestion. We will implement the `apply_chat_template` function in the next few days.
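For context, a minimal sketch of what `apply_chat_template` does: it renders a list of `{"role", "content"}` messages into a single prompt string. The tag format below is a generic illustration, not InternVL's actual template:

```python
# Toy re-implementation of the chat-template idea (illustrative tags only).
def apply_chat_template(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = apply_chat_template([{"role": "user", "content": "Describe the image."}])
```

Once the real method is implemented, the tokenizer's own template replaces this hand-rolled formatting.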

That is quite strange; I have not run into this problem. Are apex and flash-attn installed in your environment? If apex is installed, I suggest uninstalling it; if flash-attn is not installed, try installing it, since the ViT part of running a VLM in lmdeploy should still go through the PyTorch backend.

We plan to complete the ollama integration within October. Thank you for your patience.

Yes, when using V100 GPUs, you can manually disable flash attention by setting `use_flash_attn=False`.
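A minimal loading sketch, assuming the Hugging Face `transformers` AutoModel entry point and InternVL's custom `use_flash_attn` loading flag; the model id is illustrative:

```python
import torch
from transformers import AutoModel

# On V100 GPUs flash-attn is unsupported, so fall back to the plain
# PyTorch attention path via the model's custom loading flag.
model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL-Chat-V1-5",  # illustrative model id
    torch_dtype=torch.float16,       # V100 has no bfloat16 support
    trust_remote_code=True,          # InternVL ships custom modeling code
    use_flash_attn=False,            # disable flash attention explicitly
)
```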