mm-cot requirements specification

Hi, while trying to run inference for rationale generation, I ran into this first issue:

self.mha_layer = torch.nn.MultiheadAttention(embed_dim=config.hidden_size, kdim=config.hidden_size, vdim=config.hidden_size, num_heads=1, batch_first=True) 
TypeError: __init__() got an unexpected keyword argument 'batch_first'
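
For context, a TypeError like this usually means the installed torch predates the `batch_first` argument of `torch.nn.MultiheadAttention`. The sketch below assumes that the argument first appeared in torch 1.9 (my recollection of the release history, not something confirmed in this thread); `parse_version` and `supports_batch_first` are hypothetical helper names used only for illustration.

```python
import re

# Assumption: nn.MultiheadAttention accepts batch_first from torch 1.9 onwards.
MIN_BATCH_FIRST = (1, 9)


def parse_version(version: str) -> tuple:
    """Return (major, minor) from strings such as '1.8.1+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_batch_first(version: str) -> bool:
    """True if this torch version is expected to accept batch_first."""
    return parse_version(version) >= MIN_BATCH_FIRST


# Example version strings, chosen for illustration only.
for v in ("1.8.1+cu111", "1.9.0", "1.13.1"):
    print(v, supports_batch_first(v))
```

In a real environment the string to check would come from `torch.__version__`.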

I then commented out the offending parameter and ran into this second issue:

File "/home/l1094547/.conda/envs/vmmcot/lib/python3.8/site-packages/torch/nn/functional.py", line 4079, in multi_head_attention_forward
    k = k.contiguous().view(-1, bsz * num_heads, head_dim).transpose(0, 1)      
RuntimeError: shape '[-1, 512, 768]' is invalid for input of size 307200

I believe the real problem is that my torch version is not the required one. Could you add it to the requirements? The usual conda YAML file would be perfect, but simply knowing your torch version might do the trick.
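
In case it helps anyone comparing environments, here is a minimal sketch for collecting the relevant versions using only the standard library (`importlib.metadata` is available from Python 3.8). The package list is my guess at what matters for this repository, and `installed_version` and `environment_report` are hypothetical names.

```python
import platform
from importlib import metadata  # standard library since Python 3.8


def installed_version(package: str) -> str:
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("torch", "torchvision", "transformers")) -> dict:
    """Map 'python' and each package name to its version string."""
    # The default package list is an assumption, not taken from the repository.
    report = {"python": platform.python_version()}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}=={version}")
```

Pasting this output into an issue gives maintainers something concrete to compare against their own setup.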

Thanks a lot for your work

Mar 22 '23 15:03 romain-rsr

(Also, could you indicate your Python version while you are at it? Many thanks.)

Mar 22 '23 15:03 romain-rsr

Hi,

I guess the required version is actually the one indicated in the ScienceQA git requirements:

[Image: frameworks compare mm-cot]

As we can see here, I made my torch and CUDA versions match these requirements exactly, and yet I still get some shape errors:

File "x/lib/python3.8/site-packages/torch/nn/functional.py", line 5122, in multi_head_attention_forward
    k = k.contiguous().view(k.shape[0], bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: shape '[4, 512, 768]' is invalid for input of size 307200
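
Both tracebacks report the same input size (307200) and the same trailing dimensions (512 and 768), so the arithmetic can be checked without torch. The sketch below mimics only the element-count check behind `Tensor.view`; `can_view` is a hypothetical helper. The reading in the last lines is an assumption: the key tensor would be batch-first with shape (4, 100, 768), and with `batch_first` not in effect the module would be treating 512 as the batch size.

```python
from math import prod


def can_view(numel: int, shape) -> bool:
    """Mimic the size check behind Tensor.view (at most one -1 in shape)."""
    known = prod(d for d in shape if d != -1)
    if -1 in shape:
        # The -1 dimension is inferred, so numel must divide evenly.
        return known > 0 and numel % known == 0
    return numel == known


numel = 307200                    # input size from both tracebacks
bsz_heads, head_dim = 512, 768    # trailing dims of the rejected shapes

print(can_view(numel, (-1, bsz_heads, head_dim)))  # first traceback  -> False
print(can_view(numel, (4, bsz_heads, head_dim)))   # second traceback -> False

# Assumed reading: key is (batch=4, length=100, hidden=768), stored batch-first.
print(numel == 4 * 100 * 768)                      # -> True
# With a batch size of 4 and one head, the same view would be valid.
print(can_view(numel, (-1, 4, head_dim)))          # -> True
```

If that reading is right, the shape error is a side effect of running without `batch_first=True`, not a separate bug.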

I will therefore open a new issue focusing on those shape errors.

Mar 27 '23 10:03 romain-rsr