MiniCPM-V
[BUG] Code error in resampler.py in the model files
Is there an existing issue / discussion for this?
- [x] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- [x] I have searched FAQ
Current Behavior
```python
def batch_attn_forward(self, q, k, v, pos_embed_temporal, temporal_ids, key_padding_mask):
    bs = k.shape[0]  # bs = k.shape[1]
    if pos_embed_temporal:
        k += torch.stack(pos_embed_temporal, dim=0)
        bs = len(temporal_ids)
        merge_k = []
        merge_v = []
        merge_key_padding_mask = []
        start = 0
        for tp in temporal_ids:
            end = start + len(tp)
            # L * (end-start) * D -> (end-start) * L * D -> 1 * L*(end-start) * D
            merge_k.append(k[:, start: end, :].permute(1, 0, 2).reshape(-1, self.embed_dim))
            merge_v.append(v[:, start: end, :].permute(1, 0, 2).reshape(-1, self.embed_dim))
            merge_key_padding_mask.append(key_padding_mask[start: end, :].reshape(-1, 1))
            start = end
        k = torch.nn.utils.rnn.pad_sequence(merge_k, batch_first=True, padding_value=0.0).permute(1, 0, 2)  # L*(end-start)
        v = torch.nn.utils.rnn.pad_sequence(merge_v, batch_first=True, padding_value=0.0).permute(1, 0, 2)  # L*(end-start)
        key_padding_mask = torch.nn.utils.rnn.pad_sequence(merge_key_padding_mask, batch_first=True, padding_value=True).squeeze(-1)

    out = self.attn(
        self._repeat(q, bs),  # Q * B * D
        k,  # L * B * D + L * B * D
        v,
        key_padding_mask=key_padding_mask)[0]
    return out
```
Expected Behavior
In the first line of this method, `bs` should be `k.shape[1]`. If `temporal_ids` is not passed in by the caller, `bs` is set incorrectly to the image-token sequence length instead of the batch size.
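For illustration, a minimal standalone sketch (with made-up tensor shapes, not taken from the repository) of why the first line picks up the sequence length rather than the batch size when `temporal_ids` is not provided:

```python
import torch

# Hypothetical shapes for illustration only. k is laid out sequence-first
# as L x B x D, matching the permute/reshape calls in batch_attn_forward.
L, B, D = 1024, 2, 3584
k = torch.zeros(L, B, D)

bs_current = k.shape[0]   # current code: 1024, the image-token length
bs_expected = k.shape[1]  # proposed fix: 2, the actual batch size

print(bs_current, bs_expected)  # 1024 2
```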
Steps To Reproduce
No response
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
Anything else?
No response
The code in the Hugging Face repository has already fixed this issue.
Hi, you can update to the latest code from the Hugging Face repository.