
Error in beam-search multinomial sampling

Open Richar-Du opened this issue 2 years ago • 1 comments

According to the Transformers library from Hugging Face, beam-search multinomial sampling can be enabled by setting num_beams > 1 and do_sample=True. However, this combination is not supported in LAVIS. If I set num_beams=4, num_return_sequences=4, and do_sample=True simultaneously, I get the following error:

File "MM/LAVIS/lavis/models/med.py", line 1405, in generate_from_encoder
    outputs = self.generate(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/transformers/generation_utils.py", line 1404, in generate
    return self.beam_sample(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/transformers/generation_utils.py", line 2520, in beam_sample
    outputs = self(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "MM/LAVIS/lavis/models/med.py", line 1211, in forward
    outputs = self.bert(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "MM/LAVIS/lavis/models/med.py", line 974, in forward
    encoder_outputs = self.encoder(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "MM/LAVIS/lavis/models/med.py", line 592, in forward
    layer_outputs = layer_module(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "MM/LAVIS/lavis/models/med.py", line 475, in forward
    cross_attention_outputs = self.crossattention(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "MM/LAVIS/lavis/models/med.py", line 346, in forward
    self_outputs = self.self(
  File "miniconda3/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "MM/LAVIS/lavis/models/med.py", line 219, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (36) must match the size of tensor b (4) at non-singleton dimension 0

During generation, the sizes are normal when generating the first token: both query_layer and key_layer are torch.Size([64, 12, 5, 64]). However, when generating the second token, the size of key_layer becomes torch.Size([4, 12, 577, 64]). So I think something may be wrong with how the image states are handled during captioning. By the way, 5 is my prompt length and 12 is the number of attention heads.
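To illustrate what the shape mismatch means, here is a minimal standalone sketch (not LAVIS code; all tensor names and dimensions are illustrative). Beam sampling keeps batch_size * num_beams hypotheses alive, so the decoder's query states carry that expanded leading dimension, while cached cross-attention key states built from the image keep the original batch size. The usual remedy in encoder-decoder generation is to tile the encoder states once per beam before decoding starts:

```python
import torch

batch_size, num_beams, heads, img_len, head_dim = 2, 4, 12, 577, 64

# Decoder query at one generation step: one hypothesis per (batch, beam) pair.
query = torch.randn(batch_size * num_beams, heads, 1, head_dim)

# Cross-attention key states from the image encoder, NOT expanded per beam.
key = torch.randn(batch_size, heads, img_len, head_dim)

# torch.matmul(query, key.transpose(-1, -2))  # would raise the size-mismatch
# RuntimeError seen in the traceback, since 8 != 2 at dimension 0.

# Tiling the encoder states per beam makes the leading dimensions agree.
key_expanded = key.repeat_interleave(num_beams, dim=0)
scores = torch.matmul(query, key_expanded.transpose(-1, -2))
print(scores.shape)  # torch.Size([8, 12, 1, 577])
```

Whether LAVIS's generate path performs this expansion for the beam-sample branch is exactly what the traceback calls into question.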

Could you figure out where the error is? Thanks in advance :)

Richar-Du avatar Jan 05 '23 10:01 Richar-Du

Hi, @Richar-Du,

I think something fishy might be going on. I will look into this. Thanks for raising it.

dxli94 avatar Jan 26 '23 13:01 dxli94

I ran into the same problem because my version of transformers was incorrect. You may want to check whether your transformers version is lower than 4.27.
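A quick way to perform the suggested check (a hypothetical helper; the "4.27.0" threshold comes from the comment above, and the simple numeric comparison assumes plain x.y.z version strings):

```python
from importlib.metadata import version, PackageNotFoundError

def needs_upgrade(installed: str, minimum: str = "4.27.0") -> bool:
    """Return True if `installed` is older than `minimum` (naive x.y.z compare)."""
    parse = lambda v: tuple(int(p) for p in v.split(".")[:3])
    return parse(installed) < parse(minimum)

try:
    installed = version("transformers")
    if needs_upgrade(installed):
        print(f"transformers {installed} is older than 4.27 -- consider upgrading")
    else:
        print(f"transformers {installed} is recent enough")
except PackageNotFoundError:
    print("transformers is not installed")
```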

nullhty avatar Mar 20 '23 11:03 nullhty