Zhe Chen
Hi, since there hasn't been any recent activity on this issue, I'll be closing it for now. If it's still an active concern, don't hesitate to reopen it. Thanks for...
> > > Bro, how much GPU memory does the deployment of your 40B model need?
> >
> > Bro, you can take a look here: https://github.com/Czi24/Awesome-MLLM-LLM-Colab/blob/master/MLLM/InternVL-colab/InternVL.md
>
> A question about this: is the GPU memory listed there the memory right after the model is loaded, or the memory once inference reaches max_tokens? A rough calculation suggests it is the memory right after loading?

Hello, the GPU memory figures there were measured after loading the model and running image captioning a few times; inference did not reach max_tokens. If inference did reach max_tokens, some additional GPU memory would be used.
Hello, are you referring to using a vision model and a language model to build an MLLM?
Hi, we added an argument named `use_time_mlp` to `BaseTransformerLayer`, see [here](https://github.com/JiYuanFeng/DDP/blob/main/segmentation/mmseg/models/utils/transformer.py#L183). When it is set to `True`, the `BaseTransformerLayer` module initializes a time MLP and calculates scale and shift...
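For readers unfamiliar with this pattern: a time MLP typically projects a timestep embedding to a per-channel scale and shift that modulate the layer's features (FiLM-style conditioning). Below is a minimal NumPy sketch of that idea; the weight names and sizes are illustrative assumptions, not the actual parameters in the linked `BaseTransformerLayer` code.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden width (illustrative)

# Hypothetical weights of a two-layer time MLP; the second layer
# outputs 2*d values, split into a scale and a shift vector.
W1 = rng.standard_normal((d, d)) * 0.1
b1 = np.zeros(d)
W2 = rng.standard_normal((d, 2 * d)) * 0.1
b2 = np.zeros(2 * d)

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def time_modulate(x, t_emb):
    """Project a timestep embedding to (scale, shift) and modulate x."""
    h = silu(t_emb @ W1 + b1)
    scale, shift = np.split(h @ W2 + b2, 2, axis=-1)
    # FiLM-style modulation; with zero-initialized W2/b2 this is identity.
    return x * (1.0 + scale) + shift

x = rng.standard_normal((4, d))      # token features (batch of 4 tokens)
t_emb = rng.standard_normal((1, d))  # timestep embedding, broadcast over tokens
y = time_modulate(x, t_emb)
print(y.shape)  # (4, 8)
```

A common design choice (used in many diffusion codebases) is to zero-initialize the final projection so the modulation starts as an identity map and training begins from the unconditioned behavior.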
@JiYuanFeng please check this issue.
Thank you for your suggestion!
Hello! I recommend using [vlmevalkit](https://github.com/open-compass/VLMEvalKit) for testing video datasets, as it has already integrated InternVL series models and supports Video-MME and MMBench-Video.
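As a rough sketch of how such an evaluation is launched (the exact dataset and model names below are assumptions; check the VLMEvalKit README for the identifiers it actually registers):

```shell
# Clone and install VLMEvalKit, then run a video benchmark.
# Model/dataset names here are illustrative; list the supported ones
# in the repository's documentation before running.
git clone https://github.com/open-compass/VLMEvalKit.git
cd VLMEvalKit
pip install -e .

python run.py --data Video-MME --model InternVL2-8B
```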
I think direct fine-tuning should be enough. This is similar to learning a particular style of answering.
Hi, due to significant quantization errors with BNB 4-bit quantization on InternViT-6B, the model may produce nonsensical outputs and fail to understand images. Therefore, please avoid using BNB 4-bit quantization....
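If memory is the concern, a common alternative is to keep InternViT-6B in 16-bit precision instead of quantizing it to 4 bits. A minimal loading sketch with Hugging Face Transformers is shown below; the model path is a placeholder, and whether 8-bit quantization is acceptable for a given checkpoint should be verified empirically.

```python
# Sketch: load the model in bfloat16 rather than BNB 4-bit.
# "OpenGVLab/InternVL-Chat-..." is a placeholder path, not a verified ID.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL-Chat-...",   # replace with the actual checkpoint
    torch_dtype=torch.bfloat16,      # 16-bit weights: half the fp32 memory
    trust_remote_code=True,
    low_cpu_mem_usage=True,
).eval()
```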