Scaling to larger datasets

Open sunisfighting opened this issue 2 years ago • 2 comments

Thanks for your awesome work! I am trying to apply GRACE to larger datasets, but according to your code, training is conducted in a full-batch way, which hinders scalability. Your paper mentions that EIGHT GPUs are used; could you please kindly share how you implemented this? As far as I know, PyG only supports multi-graph distributed computation. I would also greatly appreciate any other suggestions! Looking forward to your reply!!

sunisfighting avatar Jun 22 '22 00:06 sunisfighting

Thanks for your interest in our work! We do use 8 GPUs in parallel, but in the sense of one GPU per dataset rather than distributing a single dataset across multiple GPUs. For multi-GPU support, I think it is fairly easy to adopt existing libraries 😄
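
For example, a minimal sketch of mini-batch training with PyG's `NeighborLoader` could look like the following. This is just an illustration of adopting an existing library, not our actual code; `augment`, `encoder`, and `contrastive_loss` are hypothetical placeholders for GRACE's augmentation, GCN encoder, and InfoNCE loss.

```python
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader

data = Planetoid(root='data', name='Cora')[0]

# Sample a subgraph around each seed node instead of using the full graph.
loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],  # fan-out per GNN layer
    batch_size=256,
    shuffle=True,
)

for batch in loader:
    # Placeholders for GRACE's components (not defined here):
    # view1, view2 = augment(batch), augment(batch)          # edge dropping / feature masking
    # z1 = encoder(view1.x, view1.edge_index)                # two-layer GCN
    # z2 = encoder(view2.x, view2.edge_index)
    # loss = contrastive_loss(z1, z2)                        # InfoNCE objective
    ...
```
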

SXKDZ avatar Jun 22 '22 03:06 SXKDZ

Thanks for your kind reply. Here is one more question. In GRACE/model, you first double the hidden dimension (e.g., 256) in the middle GCNConv layer and then project back to the original dimension (e.g., 128) in the last layer. I also noticed that in your PyGCL library, the hidden dimensions are kept unchanged (i.e., 128 for all hidden layers). When running GRACE, keeping the dimension at 128 may cause a 1-2 percentage point drop in accuracy. Which setting is more standard and fairer for comparison with competitors?
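
To make the comparison concrete, here is roughly what I mean by the two width schemes, as a rough sketch of two-layer GCNConv encoders (not copied from either repository; layer count and activation are my assumptions):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class WideHiddenEncoder(torch.nn.Module):
    """GRACE-repo style: hidden layer is twice the output dimension (256 -> 128)."""
    def __init__(self, in_dim, out_dim=128):
        super().__init__()
        self.conv1 = GCNConv(in_dim, 2 * out_dim)
        self.conv2 = GCNConv(2 * out_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

class ConstantHiddenEncoder(torch.nn.Module):
    """PyGCL-style: all layers keep the output dimension (128 -> 128)."""
    def __init__(self, in_dim, out_dim=128):
        super().__init__()
        self.conv1 = GCNConv(in_dim, out_dim)
        self.conv2 = GCNConv(out_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)
```
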

sunisfighting avatar Jun 24 '22 08:06 sunisfighting

That's a good question. In principle, every hidden dimension is a tunable hyperparameter, so doubling the size of the hidden vectors is acceptable in my opinion. For a fair comparison with other models, you should make sure the encoder part is the same. As for the performance drop you mentioned, it may be attributed to the relatively small size of the dataset.

SXKDZ avatar Nov 10 '22 04:11 SXKDZ