pytorch-question-answering

RuntimeError: mat1 dim 1 must match mat2 dim 0

Open sathsaraRasantha opened this issue 4 years ago • 5 comments

First of all, you have done fantastic work here, and I would like to thank you for that. I am trying to use your implementation on a different QA dataset (a version of SQuAD 1.0 translated into Sinhala). The only changes I made were using a different dataset (in the same format), running on Google Colab, and using FastText word embeddings instead of GloVe. I am getting an error when I call the train function. I didn't make any changes to "class BiDAF" or to the train function. This is the error I am getting.

Starting training ........ Starting batch: 0

RuntimeError Traceback (most recent call last)
in ()
----> 1 train(model, train_dataset)

9 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1690         ret = torch.addmm(bias, input, weight.t())
   1691     else:
-> 1692         output = input.matmul(weight.t())
   1693         if bias is not None:
   1694             output += bias

RuntimeError: mat1 dim 1 must match mat2 dim 0

Can you take a look at this whenever you have some free time? It would be great if you could help me with this.

sathsaraRasantha avatar Dec 09 '20 10:12 sathsaraRasantha

Hi @sathsaraRasantha. I have not been able to look at your issue due to other commitments. I'll try to take a look this weekend. It would be better if you could share your notebook on Colab (or in any other way) so that I can debug it easily. Also, I would suggest that you look at the shapes of your tensors at each step, since that is where something is going wrong for you.
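
One way to inspect shapes without editing the model is to attach forward pre-hooks to every `nn.Linear` layer and record what each one receives. The sketch below is only an illustration: the toy `nn.Sequential` model stands in for the real BiDAF class, `log_linear_shapes` is a hypothetical helper, and the 300-versus-100 feature sizes are made-up numbers showing what a mismatch between embedding width and layer width could look like. The thread does not confirm that this was the actual cause.

```python
import torch
import torch.nn as nn


def log_linear_shapes(model):
    """Attach pre-forward hooks that record the input shape seen by each nn.Linear.

    Hypothetical helper for debugging; returns the shared record list and the
    hook handles so the hooks can be removed afterwards.
    """
    records = []

    def make_hook(name):
        def hook(module, inputs):
            # inputs is a tuple of positional arguments passed to forward()
            records.append((name, tuple(inputs[0].shape), module.in_features))
        return hook

    handles = [
        module.register_forward_pre_hook(make_hook(name))
        for name, module in model.named_modules()
        if isinstance(module, nn.Linear)
    ]
    return records, handles


# Toy stand-in for the real model (assumption): first layer expects 100 features.
toy = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 2))
records, handles = log_linear_shapes(toy)

error_message = None
try:
    toy(torch.randn(4, 300))  # 300-dim input fed to a layer built for 100-dim input
except RuntimeError as err:
    error_message = str(err)

# Keep only the layers whose incoming feature size differs from in_features.
mismatches = [(name, shape, expected)
              for name, shape, expected in records
              if shape[-1] != expected]

for handle in handles:
    handle.remove()  # detach the hooks once debugging is done

print(mismatches)
```

Because the hook runs before the layer's own `forward`, the offending shape is recorded even though the call then fails, so `mismatches` names the first layer whose input width disagrees with its `in_features`.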

kushalj001 avatar Dec 15 '20 13:12 kushalj001

Hi @kushalj001. Thank you so much for replying. I solved that issue by looking at the shapes of the tensors, as you suggested. But now there is a different error. I'll share the link to the Colab notebook. I am sure you will be able to debug it easily. I also made some changes to the data preprocessing steps, and I have a feeling that the error is related to those.

This is the error I am getting:

Starting training ........ Starting batch: 0

IndexError Traceback (most recent call last)
in ()
----> 1 train(model, train_dataset)

2 frames
in make_char_vector(self, max_sent_len, max_word_len, sentence)
     23         for i, word in enumerate(nlp(sentence, disable=['parser','tagger','ner'])):
     24             for j, ch in enumerate(word.text):
---> 25                 char_vec[i][j] = char2idx.get(ch, 0)
     26
     27         return char_vec

IndexError: index 191 is out of bounds for dimension 0 with size 191
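
The message says the loop reached token position 191 in a grid that has only 191 rows, so the sentence produced more tokens than `max_sent_len` allowed for. A possible reason (an assumption, the thread does not say) is that `max_sent_len` was computed with a different tokenization than the `nlp(...)` call inside the loop. A defensive variant of the function might look like the sketch below. It uses plain Python lists instead of a tensor, takes already-split tokens, and uses a padding index of 1; all three are illustrative choices, not taken from the notebook.

```python
def make_char_vector(tokens, char2idx, max_sent_len, max_word_len, pad_idx=1):
    """Build a max_sent_len x max_word_len grid of character indices.

    Tokens beyond max_sent_len and characters beyond max_word_len are dropped
    instead of raising IndexError. pad_idx=1 is an assumed padding value.
    """
    char_vec = [[pad_idx] * max_word_len for _ in range(max_sent_len)]
    for i, word in enumerate(tokens[:max_sent_len]):
        for j, ch in enumerate(word[:max_word_len]):
            # Unknown characters map to 0, as in the original traceback.
            char_vec[i][j] = char2idx.get(ch, 0)
    return char_vec


char2idx = {"a": 2, "b": 3, "c": 4}
# Whitespace splitting stands in for the real tokenizer (assumption).
tokens = "ab cab abcabc b".split()  # 4 tokens, one more than the grid has rows
grid = make_char_vector(tokens, char2idx, max_sent_len=3, max_word_len=4)
print(grid)
```

Truncating silently discards text, so it may be preferable to compute `max_sent_len` and `max_word_len` with the same tokenizer that builds the grid and keep the slicing only as a safety net.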


This is the link to the colab notebook: https://colab.research.google.com/drive/1zBn-jU_y-NbBOXR_eAi6j1lPxO-EhMSa#scrollTo=pLwqIfRtqu8k

I really appreciate your reply to my issue and I hope you can help me with this too. Thanks in advance!!

sathsaraRasantha avatar Dec 15 '20 16:12 sathsaraRasantha

Hey, I found and fixed that error as well. Now I am getting another one. Could you please send me an email whenever you are free? My email: [email protected]

This is the error I am getting. And it seems this is the last one.

Starting training ........ Starting batch: 0

RuntimeError Traceback (most recent call last)
in ()
----> 1 train(model, train_dataset)

in train(model, train_dataset)
     16         context, question, char_ctx, char_ques, label, ctx_text, ans, ids = batch
     17
---> 18         context, question, char_ctx, char_ques, label = context.to(device), question.to(device), char_ctx.to(device), char_ques.to(device), label.to(device)
     19
     20

RuntimeError: CUDA error: device-side assert triggered

sathsaraRasantha avatar Dec 24 '20 02:12 sathsaraRasantha

Hi @sathsaraRasantha, did you find a solution for your last question? I have run into the same error.

Marwa-1995 avatar Feb 02 '21 14:02 Marwa-1995

Hi @sathsaraRasantha @Marwa-1995. The error is most likely related to a wrong dimension or axis being accessed in a tensor. To get a more readable error message, turn off your GPU and run the code on the CPU. That will give the exact line where the code is breaking.
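
As a rough sketch of that advice: CUDA reports errors asynchronously, so the traceback can point at an unrelated line such as the `.to(device)` call, while the same operation on the CPU fails immediately at the real location. A common cause of a device-side assert is an index that is too large for an embedding table, but that is an assumption here, since the thread never states the cause. The toy embedding, the batch values, and the `find_out_of_range` helper below are all hypothetical.

```python
import torch
import torch.nn as nn


def find_out_of_range(indices, num_embeddings):
    """Return the index values that a table with num_embeddings rows cannot look up."""
    bad = (indices < 0) | (indices >= num_embeddings)
    return indices[bad].tolist()


device = torch.device("cpu")  # force the CPU while debugging
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4).to(device)

# Valid indices are 0..9, so the value 10 is out of range (illustrative data).
batch = torch.tensor([[1, 2, 9], [3, 10, 0]])

bad_values = find_out_of_range(batch, embedding.num_embeddings)

cpu_error = None
try:
    embedding(batch.to(device))
except (IndexError, RuntimeError) as err:
    # On the CPU the lookup fails right here, with a readable message.
    cpu_error = str(err)

print(bad_values, cpu_error)
```

If the code has to stay on the GPU, setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before starting Python makes kernel launches synchronous, which usually moves the reported error closer to the line that caused it.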

kushalj001 avatar Feb 02 '21 16:02 kushalj001