Vijay Prakash Dwivedi
Hi @sperfu, it is part of the softmax term. Please refer to [this issue](https://github.com/graphdeeplearning/graphtransformer/issues/1) for the pointers to the explanation.
Hi @AmeenAli, the models we train are not on any large datasets (rather on medium-scale benchmark datasets), and we do not apply the checkpoints to any transfer learning, we...
Hi @GregorKobsik, when we use the sparse Graph Transformer (using the original sparse adjacency matrix), the memory consumption is _O(E)_, where _E_ is the number of edges. With the fully-connected graph, the memory consumption becomes _O(N^2)_, where _N_ is the number of nodes.
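A minimal sketch of the scaling difference (illustrative names only, not the repository's code): the sparse variant stores one attention score per edge, while the fully-connected variant stores one per node pair.

```python
import numpy as np

def sparse_attention_scores(edges):
    # One score per directed edge: memory is O(E).
    return np.zeros(len(edges))

def dense_attention_scores(n_nodes):
    # One score per node pair: memory is O(N^2).
    return np.zeros((n_nodes, n_nodes))

# Toy graph: N = 1000 nodes, each with 10 outgoing edges -> E = 10,000.
N = 1000
edges = [(i, (i + k) % N) for i in range(N) for k in range(1, 11)]

print(sparse_attention_scores(edges).size)  # 10000     -> O(E)
print(dense_attention_scores(N).size)       # 1000000   -> O(N^2)
```

For sparse real-world graphs, E grows roughly linearly with N, so the gap between the two variants widens quickly as graphs get larger.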
Hi @nashid, we do not have plans to apply the graph transformer architecture to machine translation.
Hi @jermainewang, Thank you for your message. Sure, we will share the datasets in DGL as well. I will follow up with you on this and update you when ready.
Thanks a lot @SauravMaheshkar @clefourrier! @clefourrier, sure. My username is [vijaypradwi](https://huggingface.co/vijaypradwi). [I will check the steps linked in the above comments for the HF datasets, as I haven't used them before :')]
Thanks @Mang30 for pointing out this issue, which was due to a pos_enc placeholder used to conduct prototypical experiments. It has been fixed by https://github.com/snap-research/LargeGT/pull/4.