mPLUG-Owl cur_input_embeds = torch.cat([cur_input_embeds_1, cur_image_features[0:0], cur_input_embeds_2], dim=0): here cur_image_features[0:0] is a vector with no dimensions (an empty slice), so the image features are not actually added

Open hangzeli05 opened this issue 1 year ago • 1 comment

Code error in mPLUG-Owl2

Dec 11 '23 03:12 hangzeli05

No, this is for compatibility with DeepSpeed ZeRO-3 when training on text-only samples. With multi-modal input, this case is not encountered.

Dec 18 '23 14:12 vateye
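
To make the behaviour discussed above concrete, here is a minimal sketch of what the `[0:0]` slice does in `torch.cat`. The tensor shapes and the `hidden` size are made up for illustration and are not taken from the mPLUG-Owl2 code. The slice has shape `(0, hidden)`, so the concatenated result contains only the text embeddings, while the expression still references `cur_image_features`. A plausible reading of the reply, not confirmed in the thread, is that this reference keeps the vision branch involved in the computation on text-only batches, which ZeRO-3 needs so that all ranks handle the same parameters.

```python
import torch

hidden = 8  # hypothetical embedding width, chosen only for this sketch

# Hypothetical stand-ins for the tensors named in the issue.
cur_input_embeds_1 = torch.randn(3, hidden, requires_grad=True)  # text before the image slot
cur_input_embeds_2 = torch.randn(2, hidden, requires_grad=True)  # text after the image slot
cur_image_features = torch.randn(5, hidden, requires_grad=True)  # output of the vision branch

# [0:0] selects zero rows: the result has shape (0, hidden), not "no dimensions".
empty_slice = cur_image_features[0:0]

# Concatenating a zero-row tensor contributes no rows to the result.
cur_input_embeds = torch.cat(
    [cur_input_embeds_1, empty_slice, cur_input_embeds_2], dim=0
)

print(empty_slice.shape)       # zero rows, `hidden` columns
print(cur_input_embeds.shape)  # rows of part 1 plus rows of part 2
```

So the original observation is correct that no image rows end up in `cur_input_embeds` on this code path; according to the reply, that is intended, because the path only runs for text-only samples.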