requirements specification
Hi, while trying to run inference for rationale generation, I hit this first error:
self.mha_layer = torch.nn.MultiheadAttention(embed_dim=config.hidden_size, kdim=config.hidden_size, vdim=config.hidden_size, num_heads=1, batch_first=True)
TypeError: __init__() got an unexpected keyword argument 'batch_first'
I then commented out the batch_first argument and ran into this second error:
File "/home/l1094547/.conda/envs/vmmcot/lib/python3.8/site-packages/torch/nn/functional.py", line 4079, in multi_head_attention_forward
k = k.contiguous().view(-1, bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: shape '[-1, 512, 768]' is invalid for input of size 307200
I believe the real problem is that my torch version is not the required one. Could you add it to the requirements? The usual conda YAML file would be perfect, but simply knowing your torch version might do the trick.
Thanks a lot for your work
(Also, could you indicate your Python version while you are at it? Many thanks.)
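In case it helps anyone hitting the same TypeError: as far as I know, the batch_first argument of torch.nn.MultiheadAttention only exists in newer torch releases (1.9.0 and later, if I recall the release notes correctly). Below is a minimal sketch for checking what the installed version accepts. The helper name accepts_kwarg is hypothetical, not something from this repository.

```python
import inspect


def accepts_kwarg(cls, name):
    """Return True if cls.__init__ explicitly declares a parameter called `name`.

    Hypothetical helper for illustration. It will not detect arguments that
    are only accepted through **kwargs.
    """
    return name in inspect.signature(cls.__init__).parameters


try:
    import torch
except ImportError:  # keep the sketch importable on machines without torch
    torch = None

if torch is not None:
    print("torch version:", torch.__version__)
    print(
        "MultiheadAttention accepts batch_first:",
        accepts_kwarg(torch.nn.MultiheadAttention, "batch_first"),
    )
```

If the check prints False, simply removing batch_first=True is probably not enough on its own, since the layer then expects (sequence, batch, embedding) inputs while the calling code presumably still passes batch-first tensors.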
Hi,
I guess the required version is actually the one indicated in the ScienceQA git requirements:
As we can see here, I made my torch and CUDA versions match these requirements exactly, and yet I still get some shape errors:
File "x/lib/python3.8/site-packages/torch/nn/functional.py", line 5122, in multi_head_attention_forward
k = k.contiguous().view(k.shape[0], bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: shape '[4, 512, 768]' is invalid for input of size 307200
I will then create a new issue focusing on those shape errors.
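For what it's worth, the numbers in the traceback can be checked by hand. The view fails because the key tensor holds 307200 elements while the target shape [4, 512, 768] needs many more. The sketch below only redoes that arithmetic. The (4, 100, 768) key shape is my assumption (it is one factorization of 307200 that fits a batch of 4 with hidden size 768), not something the traceback states.

```python
from math import prod  # available from Python 3.8


def can_view(numel, shape):
    """Return True if `numel` elements can fill `shape` exactly (no -1 dims)."""
    return prod(shape) == numel


key_numel = 307200            # "input of size 307200" from the traceback
target = (4, 512, 768)        # shape requested by the failing .view() call

print(prod(target))                        # 1572864
print(can_view(key_numel, target))         # False
print(key_numel // 768)                    # 400, i.e. 4 * 100 rows of width 768
print(can_view(key_numel, (4, 100, 768)))  # True (assumed batch-first key shape)
```

In multi_head_attention_forward, bsz is read from the second dimension of the query, so bsz = 512 suggests the layer is treating a batch-first query of shape (4, 512, 768) as (sequence, batch, embedding). That would be consistent with the layer running without batch_first=True while still receiving batch-first inputs, though I have not confirmed this against the model code.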