sixgod-666

Results 9 comments of sixgod-666

File "/workspace/envs/vary/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 285, in set_module_tensor_to_device
raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([1024, 1024]) in "weight" (which has shape torch.Size([2048, 1024])), this look incorrect.

What should I do when this error appears while loading the model?
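One way to narrow down an error like this is to list every parameter whose shape differs between the instantiated model and the checkpoint. The following is a minimal sketch, not the project's own tooling. It assumes the shapes have already been collected into plain dictionaries (with PyTorch, something like `{k: tuple(v.shape) for k, v in state_dict.items()}`). The parameter name `example.proj.weight` is hypothetical; only the two shapes come from the error message.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    model_shapes: Dict[str, Shape],
    ckpt_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, model_shape, checkpoint_shape) for every parameter
    present in both mappings whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in sorted(ckpt_shapes.items()):
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, model_shape, ckpt_shape))
    return mismatches


# Hypothetical parameter name; the shapes are the ones in the error message.
model_shapes = {"example.proj.weight": (2048, 1024)}
ckpt_shapes = {"example.proj.weight": (1024, 1024)}

for name, model_shape, ckpt_shape in find_shape_mismatches(model_shapes, ckpt_shapes):
    print(f"{name}: model expects {model_shape}, checkpoint has {ckpt_shape}")
```

A mismatch of this kind usually means the model code or config does not correspond to the checkpoint being loaded, so the list of affected parameters can hint at which component differs.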

Traceback (most recent call last):
  File "/workspace/Vary-toy-main/Vary-master/vary/demo/run_qwen_vary.py", line 126, in <module>
    eval_model(args)
  File "/workspace/Vary-toy-main/Vary-master/vary/demo/run_qwen_vary.py", line 43, in eval_model
    model = varyQwenForCausalLM.from_pretrained(model_name, low_cpu_mem_usage=True, trust_remote_code=True)
  File "/workspace/envs/vary/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3091, in from_pretrained
    ) =...

Oh, I see. I hadn't recompiled. Thanks a lot!

I'm also looking for a way to reproduce the results shown in the paper.

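Hello, how should I adjust the settings to fit within about 24 GB? I have a V100 with 32 GB, and during second-stage training I keep running out of GPU memory no matter how I adjust the training parameters. Are there any other approaches I could try? Thanks.

> You don't need an A100. I have trained on an L40. I don't have a 3090, so I can't test on one, but I think 24 GB should be enough to train if you adjust the settings.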

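> You need to reduce the max length a bit?

I tried that, and even reducing it to 64 doesn't work. I also set every batch_size to 1 and froze both vision_tower modules. Are there other ways to reduce GPU memory usage? I'd appreciate any ideas. Thanks.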
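A rough back-of-the-envelope estimate may explain why shrinking the max length and batch size does not help: those settings mainly reduce activation memory, while the weights, gradients, and optimizer states have a fixed cost. The sketch below uses the common rule of thumb of about 16 bytes per trainable parameter for mixed-precision Adam (fp16 weights and gradients plus fp32 master weights, momentum, and variance). The parameter count of 1.8 billion is an assumption for illustration; the thread does not state the model size.

```python
# Rule-of-thumb bytes per parameter for mixed-precision Adam training:
# 2 (fp16 weights) + 2 (fp16 grads) + 12 (fp32 master weights, momentum, variance).
BYTES_PER_TRAINABLE_PARAM = 2 + 2 + 12
# Frozen parameters only need their fp16 weights.
BYTES_PER_FROZEN_PARAM = 2


def model_state_gib(trainable_params: float, frozen_params: float = 0.0) -> float:
    """Estimate GPU memory (GiB) for weights, gradients, and optimizer states.

    Activations, buffers, and framework overhead are NOT included.
    """
    total_bytes = (
        trainable_params * BYTES_PER_TRAINABLE_PARAM
        + frozen_params * BYTES_PER_FROZEN_PARAM
    )
    return total_bytes / 2**30


# Assumed parameter count, for illustration only.
ASSUMED_TRAINABLE_PARAMS = 1.8e9
print(f"model states: {model_state_gib(ASSUMED_TRAINABLE_PARAMS):.1f} GiB")
```

Under that assumption the model states alone come to roughly 27 GiB before any activations are counted, which would leave very little headroom on a 32 GB card. Freezing more parameters, or moving optimizer states off the GPU, attacks this fixed cost directly.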

Yes, it's DeepSpeed on a single GPU. Here are my parameters, please take a look: ![IMG_20240319_170117](https://github.com/Ucas-HaoranWei/Vary-toy/assets/54803343/3f0c82a1-8537-4554-a4d5-6237bbecd231)

So at least two GPUs are required, is that right? Is there any solution for a single GPU? Thanks a lot.
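One single-GPU option that is commonly tried in this situation is DeepSpeed ZeRO stage 2 with the optimizer states offloaded to CPU memory. The sketch below only builds such a config as a Python dictionary and prints it as JSON. The function name and default values are illustrative assumptions, not settings taken from the Vary-toy repository, and whether training then fits in memory depends on the model and on the available host RAM.

```python
import json


def build_zero_offload_config(micro_batch_size: int = 1, grad_accum_steps: int = 1) -> dict:
    """Build a DeepSpeed config using ZeRO stage 2 with CPU optimizer offload.

    Values are illustrative; tune them for the actual training script.
    """
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        # fp16 rather than bf16, since the V100 has no bf16 support.
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 2,
            # Keep optimizer states in host memory instead of on the GPU.
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
        },
    }


config = build_zero_offload_config()
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and passed to the training launcher as its DeepSpeed config. If the training script uses the Hugging Face Trainer, the batch size and accumulation values must agree with the Trainer arguments. Offloading trades GPU memory for slower steps, since optimizer updates then run on the CPU.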