mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Hey, I have a quick question about modeling_mplug_owl.py. What does the following line do? [https://github.com/X-PLUG/mPLUG-Owl/blob/d9c99101eb55b5a8055356e3e1ed6a9fbbacbbf6/mPLUG-Owl/mplug_owl/modeling_mplug_owl.py#L1408](https://github.com/X-PLUG/mPLUG-Owl/blob/d9c99101eb55b5a8055356e3e1ed6a9fbbacbbf6/mPLUG-Owl/mplug_owl/modeling_mplug_owl.py#L1408) The line seems to be meaningless, or there is a missing variable, like the attention...
We followed the quick start provided in mPLUG-Owl2. There is an error: RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
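A common cause is degenerate logits during sampling, e.g. from running a half-precision checkpoint on hardware that does not support it. A minimal sketch of two usual workarounds, assuming the `model` and `input_ids` set up in the Quick Start:

```python
# Hedged sketch, not repo code: two common workarounds for
# "probability tensor contains either `inf`, `nan` or element < 0".
# `model` and `input_ids` are assumed to come from the Quick Start.
import torch

# 1) Run the model in a dtype the device actually supports; fp16 on CPU
#    (or an old GPU) often yields inf/nan logits.
model = model.to(device="cuda", dtype=torch.float16)  # use torch.float32 on CPU

# 2) Skip the sampling step entirely: greedy decoding never draws from
#    the (possibly degenerate) probability tensor.
output_ids = model.generate(input_ids.to(model.device),
                            do_sample=False,
                            max_new_tokens=512)
```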
@LukeForeverYoung @MAGAer13 First of all, thanks for your great work. I have a question regarding the Feed Forward Network (FFN) of the Abstractor and the forward method of MplugOwlVisualAbstractorAttention. From...
I successfully ran the Quick Start code, but the program kept printing notices like **'The attention mask and pad token id were not set. As a consequence, you may observe...
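That warning is benign, but it can be silenced by passing the attention mask and a pad token id to `generate()` explicitly. A minimal sketch, assuming a Hugging Face-style `model`/`tokenizer` pair as loaded in the Quick Start:

```python
# Hedged sketch: passing an explicit attention mask and pad token id
# removes the "attention mask and pad token id were not set" warning.
# `model` and `tokenizer` are assumed to come from the Quick Start.
inputs = tokenizer("Describe the image.", return_tensors="pt").to(model.device)

output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,   # explicit mask
    pad_token_id=tokenizer.eos_token_id,    # LLaMA tokenizers define no pad token
    max_new_tokens=512,
)
```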
Which checkpoint was the ViT-H-16 used in Owl2 initialized from? Could you share the initial weights of this ViT?
The mPLUG-Owl2 repository itself already provides excellent fine-tuning scripts: [https://github.com/X-PLUG/mPLUG-Owl/tree/main/mPLUG-Owl2/scripts](https://github.com/X-PLUG/mPLUG-Owl/tree/main/mPLUG-Owl2/scripts). The ms-swift multimodal large model fine-tuning framework integrates inference and fine-tuning for mPLUG-Owl2 and mPLUG-Owl2.1, and provides a best-practice guide: [https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/mplug-owl2%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md](https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/mplug-owl2%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md) Anyone interested is welcome to try it 😊
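For anyone who prefers a programmatic entry point over the shell scripts, ms-swift also exposes `sft_main`/`SftArguments`. A hedged sketch follows; the `model_type` and `dataset` strings here are assumptions to verify against the linked best-practice doc:

```python
# Hedged sketch of LoRA fine-tuning via ms-swift's Python API.
# The model_type and dataset values below are assumptions; consult the
# best-practice doc linked above for the names your ms-swift version supports.
from swift.llm import SftArguments, sft_main

sft_main(SftArguments(
    model_type="mplug-owl2-chat",  # assumed ms-swift model type string
    dataset=["coco-en-mini"],      # assumed built-in dataset name; replace with yours
    sft_type="lora",               # parameter-efficient fine-tuning
))
```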
ModuleNotFoundError: No module named 'transformers_modules.mPLUG-Owl2' RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback): [Errno 13] Permission denied: '/root/anaconda3' The installed environment throws all kinds of errors and won't run.
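Both errors suggest that transformers is trying to write its `trust_remote_code` module cache somewhere the current user cannot touch. A hedged sketch of a common fix, redirecting the Hugging Face caches to a writable location before importing transformers:

```python
# Hedged sketch: point the Hugging Face caches at a user-writable directory
# *before* importing transformers, so dynamically downloaded model code
# (the transformers_modules package) is not written under /root/anaconda3.
import os

os.environ["HF_HOME"] = os.path.expanduser("~/.cache/huggingface")
os.environ["HF_MODULES_CACHE"] = os.path.expanduser("~/.cache/huggingface/modules")

from transformers import AutoModelForCausalLM  # import only after setting the env vars
```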
Hi, I found that [Multi-image correlation modelling inference? #1](https://github.com/X-PLUG/mPLUG-Owl/issues/1#issue-1688038504) implements multi-image input with `mPLUG-Owl`, and I want to know how to do it with `mPLUG-Owl2`. Thanks!
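For reference, the mPLUG-Owl (v1) recipe from issue #1 interleaves one `<image>` placeholder per input image; whether the same pattern carries over to mPLUG-Owl2 is exactly what this question asks. A minimal sketch, assuming the `model`, `tokenizer`, and `processor` from the v1 Quick Start:

```python
# Hedged sketch of the multi-image pattern from issue #1 (mPLUG-Owl v1):
# one <image> placeholder per image, in the same order as the image list.
# `model`, `tokenizer`, and `processor` are assumed from the v1 Quick Start.
import torch
from PIL import Image

prompts = [
    "The following is a conversation between a curious human and AI assistant.\n"
    "Human: <image>\n"
    "Human: <image>\n"
    "Human: What are the differences between these two images?\n"
    "AI:"
]
images = [Image.open("image1.jpg"), Image.open("image2.jpg")]  # placeholder paths

inputs = processor(text=prompts, images=images, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```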
In the quick_start code, I asked the model to describe my input image and requested that it answer me in Chinese, but why is the output entirely in English?
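One hedged workaround: mPLUG-Owl2 sits on a LLaMA-2 backbone that tends to default to English, so stating the language constraint explicitly (and redundantly) in the prompt usually helps. The `<|image|>` placeholder below follows the mPLUG-Owl2 prompt format; treat the exact wording as an assumption:

```python
# Hedged sketch: make the language requirement explicit in the prompt.
# "<|image|>" is mPLUG-Owl2's image placeholder; the bilingual instruction
# below is an illustrative workaround, not an official recipe.
prompt = (
    "<|image|>"
    "请用中文详细描述这张图片。"        # "Describe this image in detail, in Chinese."
    "You must answer in Chinese only."
)
```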