Results 6 comments of DLight

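Hi, fully unfreezing the ViT directly may hurt the ViT's own capabilities, so we used layer-wise LR decay: the shallow ViT layers change only slightly, while the deep layers adapt to the new task. This part of the code is being cleaned up and will be open-sourced soon.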
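Until that code is released, here is a rough sketch of the general idea behind layer-wise LR decay, not the authors' implementation. The parameter naming (`patch_embed`, `blocks.N.`, `head`), the decay factor, and the helper names are all assumptions made for illustration.

```python
import re


def layer_id_from_name(name, num_layers):
    """Map a parameter name to a depth index (hypothetical ViT naming).

    0 = embeddings, 1..num_layers = transformer blocks,
    num_layers + 1 = everything else (e.g. the head).
    """
    if name.startswith(("patch_embed", "pos_embed", "cls_token")):
        return 0
    match = re.match(r"blocks\.(\d+)\.", name)
    if match:
        return int(match.group(1)) + 1
    return num_layers + 1


def layerwise_lr(names, base_lr, decay, num_layers):
    """Return {param_name: lr}; each step toward the input multiplies lr by `decay`."""
    top = num_layers + 1
    return {
        name: base_lr * decay ** (top - layer_id_from_name(name, num_layers))
        for name in names
    }


# Example with assumed names and an assumed decay of 0.9 over 12 blocks.
names = [
    "patch_embed.proj.weight",
    "blocks.0.attn.qkv.weight",
    "blocks.11.mlp.fc2.weight",
    "head.weight",
]
lrs = layerwise_lr(names, base_lr=1e-4, decay=0.9, num_layers=12)
for name in names:
    print(f"{name}: {lrs[name]:.2e}")
```

In a PyTorch setup, these values could be passed as the "lr" entry of per-parameter groups to an optimizer such as torch.optim.AdamW, so shallow layers take smaller steps than deep ones.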

"The size of tensor a (1377) must match the size of tensor b (1376) at non-singleton dimension 3": this error is commonly caused by a transformers version mismatch. Try transformers==4.33.2.
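A quick way to confirm which version is installed before reinstalling might look like the following. The `check_pin` helper is hypothetical; it only uses the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package, expected):
    """Return (installed_version_or_None, matches_expected) for a package."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None, False
    return installed, installed == expected


installed, ok = check_pin("transformers", "4.33.2")
if not ok:
    # Suggest the pinned version named in the comment above.
    print(f"transformers is {installed}; try: pip install transformers==4.33.2")
```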

Try this one: https://github.com/InternLM/InternLM-XComposer?tab=readme-ov-file#inference-on-multiple-gpus

Try changing the transformers version to 4.33.1.