Marcovaldon
I tried adding an extra padding step before the concatenation, but then there is another error: > Traceback (most recent call last): File "finetune.py", line 312, in train() File "finetune.py", line...
@myownskyW7 @yuhangzang I trained with only one sample and batch>1, and I still get the same error as above.
> Can you provide more details on how you add the padding?

I implement the padding right after https://huggingface.co/internlm/internlm-xcomposer2-vl-7b/blob/main/modeling_internlm_xcomposer2.py#L266, as below:

`longest_token_num = max([wrap_embeds_list[j].shape[1] for j in range(len(img_list))]) for...`
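Since the snippet above is cut off, here is a minimal sketch of what such a pre-concat padding could look like, assuming the four per-sample lists (`wrap_embeds_list`, `wrap_atts_list`, `wrap_target_list`, `wrap_im_mask_list`) built in `interleav_wrap`; the zero fills for the attention/image masks and the `-100` fill for the targets are my assumptions, not the repo's code:

```python
import torch

def pad_and_stack(wrap_embeds_list, wrap_atts_list, wrap_target_list, wrap_im_mask_list):
    # Right-pad every per-sample tensor (shape [1, seq_len, ...]) to the longest
    # sequence length in the batch, then concatenate along the batch dimension.
    longest_token_num = max(e.shape[1] for e in wrap_embeds_list)
    embeds, atts, targets, im_masks = [], [], [], []
    for emb, att, tgt, im in zip(wrap_embeds_list, wrap_atts_list,
                                 wrap_target_list, wrap_im_mask_list):
        pad_len = longest_token_num - emb.shape[1]
        pad_emb = emb.new_zeros(1, pad_len, emb.shape[2])   # zeros for embeddings
        pad_att = att.new_zeros(1, pad_len)                 # 0 so attention ignores padding
        pad_tgt = tgt.new_full((1, pad_len), -100)          # -100 so the loss skips padding (assumption)
        pad_im = im.new_zeros(1, pad_len)                   # 0 = text position (assumption)
        embeds.append(torch.cat([emb, pad_emb], dim=1))
        atts.append(torch.cat([att, pad_att], dim=1))
        targets.append(torch.cat([tgt, pad_tgt], dim=1))
        im_masks.append(torch.cat([im, pad_im], dim=1))
    return (torch.cat(embeds, dim=0), torch.cat(atts, dim=0),
            torch.cat(targets, dim=0), torch.cat(im_masks, dim=0))
```

The returned tensors would then replace the plain `torch.cat` calls around #L266; the key point is that padded positions are masked out of both the attention mask and the loss.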
> You may avoid in-place variable replacement and define other variables instead.

But I deleted my own changes and trained with the original code, using a single sample repeated 100k times as the dataset, and with batch>1 the same error message still appears.
> I mean you may avoid the in-place replacement such as `wrap_embeds_list[i] = torch.cat([wrap_embeds_list[i], pad1], dim=1)`, and define new variables, e.g., `wrap_embeds_list_new[i] = torch.cat([wrap_embeds_list[i], pad1], dim=1)`

I tried that and still get the same error message. May I ask whether the authors used batch>1 when finetuning? I also ran an experiment with the completely unmodified code: the training data is the same case repeated 100k times, so assembling the batch at https://huggingface.co/internlm/internlm-xcomposer2-vl-7b/blob/main/modeling_internlm_xcomposer2.py#L266 goes through (sequences built from the same case naturally have the same length), but the following problem still shows up during backpropagation: >...
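For reference, a self-contained toy version of the non-in-place pattern suggested above (the dummy shapes and zero padding are chosen just for illustration):

```python
import torch

# Dummy per-sample embeddings of different lengths, standing in for wrap_embeds_list.
wrap_embeds_list = [torch.randn(1, 5, 8), torch.randn(1, 7, 8)]
longest_token_num = max(e.shape[1] for e in wrap_embeds_list)

# Suggested pattern: append padded tensors to a new list instead of
# overwriting wrap_embeds_list[i] in place.
wrap_embeds_list_new = []
for emb in wrap_embeds_list:
    pad1 = emb.new_zeros(1, longest_token_num - emb.shape[1], emb.shape[2])
    wrap_embeds_list_new.append(torch.cat([emb, pad1], dim=1))

wrap_embeds = torch.cat(wrap_embeds_list_new, dim=0)  # shape [2, 7, 8]
```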