Hi, thanks for your response. I finally managed to train your rrc version of the model and submitted it to the server. The accuracy results are quite similar to yours on...
Hi, does anybody know how to evaluate the model's memory usage? @Durant35 As you mentioned, just one tensor [1, 12, 402, 354, 128] will need more than 200GB memory,...
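For a rough sanity check on a figure like this, one option is to multiply the shape out and scale by the element size. The sketch below assumes a dense tensor stored as float32 (4 bytes per element); the dtype is not stated in the thread, and `tensor_bytes` is a hypothetical helper name. It only counts the storage of this one tensor, not gradients, intermediate activations, or framework overhead.

```python
from math import prod  # available in Python 3.8+

def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed to store one dense tensor of the given shape.

    bytes_per_element=4 assumes float32; use 2 for float16, 8 for float64.
    """
    return prod(shape) * bytes_per_element

shape = (1, 12, 402, 354, 128)        # shape quoted in the thread
n_elements = prod(shape)              # total number of elements
size_gib = tensor_bytes(shape) / 1024**3

print(f"{n_elements:,} elements, about {size_gib:.2f} GiB as float32")
```

Under those assumptions the shape works out to about 218.6 million elements, so roughly 0.81 GiB for the tensor itself. To measure what a run actually uses rather than estimate it, and assuming the model runs on PyTorch with CUDA (not confirmed in the thread), `torch.cuda.max_memory_allocated()` reports the peak allocated bytes.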
Hi, sorry for my late response. It is the same as sparse GAT.
Please check the paper "Neural Machine Translation by Jointly Learning to Align and Translate".