[BUG] Incorrect patch_attn_mask computation in get_vllm_embedding
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- [X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
- [X] 我已经搜索过FAQ | I have searched FAQ
当前行为 | Current Behavior
```python
def get_vllm_embedding(self, data):
    if 'vision_hidden_states' not in data:
        dtype = self.vpm.embeddings.position_embedding.weight.dtype
        device = self.vpm.embeddings.position_embedding.weight.device
        tgt_sizes = data['tgt_sizes']
        pixel_values_list = data['pixel_values']
        best_grid = data["best_grid"]
        vision_hidden_states = []
        all_pixel_values = []
        img_cnt = []
        for pixel_values in pixel_values_list:
            img_cnt.append(len(pixel_values))
            all_pixel_values.extend([i.flatten(end_dim=1).permute(1, 0) for i in pixel_values])

        # exist image
        if all_pixel_values:
            tgt_sizes = torch.vstack(tgt_sizes).type(torch.int32)

            if self.config.batch_vision_input:
                max_patches = torch.max(tgt_sizes[:, 0] * tgt_sizes[:, 1])

                all_pixel_values = torch.nn.utils.rnn.pad_sequence(all_pixel_values, batch_first=True,
                                                                   padding_value=0.0)
                B, L, _ = all_pixel_values.shape
                all_pixel_values = all_pixel_values.permute(0, 2, 1).reshape(B, 3, -1, L)

                patch_attn_mask = torch.zeros((B, 1, max_patches), dtype=torch.bool, device=device)
                for i in range(B):
                    patch_attn_mask[i, :tgt_sizes[i][0] * tgt_sizes[i][1]] = True
```
The patch_attn_mask computation is wrong: the indexing is off by one dimension, so patch_attn_mask ends up all True. For example, when i = 4 there are 17 padding patches, so the last 17 entries should be False, yet the resulting patch_attn_mask is entirely True.
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/blob/main/modeling_minicpmv.py#L97
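The behavior is easy to reproduce in isolation. Here is a minimal, self-contained sketch; the sizes below are made up for illustration (5 images, the last with 47 valid patches and 17 padding patches) and are not taken from the model:

```python
import torch

# Hypothetical sizes: 47 + 17 = 64 = max_patches for the last image (i = 4).
B, max_patches = 5, 64
tgt_sizes = torch.tensor([[8, 8], [8, 8], [8, 8], [8, 8], [47, 1]], dtype=torch.int32)

patch_attn_mask = torch.zeros((B, 1, max_patches), dtype=torch.bool)
for i in range(B):
    # `[i, :n]` applies the slice to dim 1, which has size 1. PyTorch clamps the
    # slice to that dim, so the assignment fills all max_patches positions
    # instead of only the first n.
    patch_attn_mask[i, :tgt_sizes[i][0] * tgt_sizes[i][1]] = True

print(patch_attn_mask[4, 0].sum().item())  # prints 64, but should be 47
```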
期望行为 | Expected Behavior
```python
patch_attn_mask[i, :tgt_sizes[i][0] * tgt_sizes[i][1]] = True
```

should be changed to:

```python
patch_attn_mask[i, 0, :tgt_sizes[i][0] * tgt_sizes[i][1]] = True
```
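With the explicit `0` index, the slice applies to the patch dimension and the padding positions stay False. Reusing the same made-up sizes as the sketch above:

```python
import torch

B, max_patches = 5, 64
tgt_sizes = torch.tensor([[8, 8], [8, 8], [8, 8], [8, 8], [47, 1]], dtype=torch.int32)

patch_attn_mask = torch.zeros((B, 1, max_patches), dtype=torch.bool)
for i in range(B):
    # Index the singleton dim explicitly so the slice hits the patch dimension.
    patch_attn_mask[i, 0, :tgt_sizes[i][0] * tgt_sizes[i][1]] = True

print(patch_attn_mask[4, 0].sum().item())  # prints 47; the 17 padding slots stay False
```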
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
- OS: Ubuntu 20.04
- Python: 3.10.14
- Transformers: 4.40.0
- PyTorch: 2.1.2
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 12.1
备注 | Anything else?
No response
Thanks for the report; we are assessing the impact.
Hi, this is indeed a mistake, thanks for the report. To keep training and inference consistent, we will not modify the code on Hugging Face directly; we will fix this systematically in a subsequent model release.

> Hi, this is indeed a mistake, thanks for the report. To keep training and inference consistent, we will not modify the code on Hugging Face directly; we will fix this systematically in a subsequent model release.

@YuzaChongyi Can you fully assess the impact? We are already fine-tuning the model and applying it to production. Or, when will the next model be released?
There is no problem as long as the behavior of patch_attn_mask is consistent between training and inference. We also tried modifying it directly, and it barely changes the inference results. This version will not be updated, so that the evaluation results remain reproducible.
The release date of the next model is not yet certain; we are working on it.