倪樊 comments

Results 5 comments of


                                            倪樊

"GeForce RTX 3090 doesnt response, terminating it."

Dose not work @RadionovM

2Nodes * 8 A100 80G sft full Qwen2VL OOM

> @VincentVanNF 你好，你的问题解决了吗，我现在也碰到了这个问题。 72B lora 的话 4-8卡能跑。我的微调训练集是7w左右，后面尝试用小量的数据大概50条左右可以跑，但是全部的话就会oom，所以不知道72b 全参微调7w数据量到底需要需要多少资源

2Nodes * 8 A100 80G sft full Qwen2VL OOM

> > > @VincentVanNF 你好，你的问题解决了吗，我现在也碰到了这个问题。 > > > > > > 72B lora 的话 4-8卡能跑。我的微调训练集是7w左右，后面尝试用小量的数据大概50条左右可以跑，但是全部的话就会oom，所以不知道72b 全参微调7w数据量到底需要需要多少资源 > > We used 4 * 4 H100 96GB to do full parameter finetune...

如何混合图文数据和纯文本数据训练？

> > 可以参考 https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/train_lora/qwen2vl_lora_sft.yaml 其中 identity 数据集是纯文本 > > 我试了一下，会卡着，不能训。。。load完数据后就卡在那了有更新吗，我也有这个需求，混合数据中纯文本部分尝试将'images'键的值设为空，和直接去掉都不行。

How to train the mm_proj and the LLM part with lora of Qwen2-VL

4* 80 G A100 works for me when training Qwen2-VL-72B with lora. I set cutoff_len=4096 in my exp. And the number of my dataset is about 6.6w. As the issue...