Hi @HimariO, @nikhilbchilwant,
First of all, thanks a lot for this great work and for making the code open source.
I am trying to reproduce your results on the provided Hateful Memes dataset, but I am stuck on the following error. Could you please help me out with this?
Traceback (most recent call last):
  File "train_meme_itm.py", line 802, in <module>
    run_main_train(args)
  File "train_meme_itm.py", line 705, in run_main_train
    main(args)
  File "train_meme_itm.py", line 470, in main
    loss = model(batch, compute_loss=True)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/_initialize.py", line 177, in new_fwd
    **applier(kwargs, input_caster))
  File "/src/model_villa/vqa.py", line 43, in forward
    output_all_encoded_layers=False)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/model_villa/model.py", line 385, in forward
    output_all_encoded_layers=output_all_encoded_layers)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/model_villa/model.py", line 295, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/model_villa/layer.py", line 167, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/model_villa/layer.py", line 125, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/src/model_villa/layer.py", line 85, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: cublas runtime error : unknown error at /tmp/pip-req-build-l1dtn3mo/aten/src/THC/THCBlas.cu:390
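
In case it helps with triage, below is a minimal sketch of a script for collecting the environment details that usually matter for a cuBLAS error like this one (Python, PyTorch, and CUDA versions, plus the GPU name). The helper name `collect_env` is hypothetical and not part of this repo; it relies only on standard `torch` attributes and skips them if `torch` cannot be imported.

```python
import platform


def collect_env():
    """Return a dict of version details relevant to a cuBLAS failure.

    `collect_env` is a hypothetical helper, not something from the repo.
    """
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; report that and stop.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version the installed torch build was compiled against (None for CPU builds).
    info["cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print("{}: {}".format(key, value))
```

Running it inside the same container as `train_meme_itm.py` should show whether the installed PyTorch build and the GPU are a plausible match.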