
SIGN transformation on multiple GPUs for large graphs

Open ayasar70 opened this issue 2 years ago • 2 comments

🚀 The feature, motivation and pitch

Hello, I was working on the SIGN model. In a GPU-based setting, it seems there can be two bottlenecks during preprocessing: SparseTensor creation and SpMM (https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/transforms/sign.py#L50). This is because those operations run on the CPU, and large graphs of course cannot fit into a single GPU's memory.
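
For context, the preprocessing in question looks roughly like the following (a paraphrased sketch of `transforms/sign.py`, not the exact library code; normalization details may differ between versions):

```python
import torch
from torch_sparse import SparseTensor

def sign_precompute(edge_index, x, K):
    # Build the symmetrically normalized sparse adjacency on the CPU --
    # this SparseTensor construction is the first bottleneck.
    num_nodes = x.size(0)
    row, col = edge_index
    adj_t = SparseTensor(row=col, col=row,
                         sparse_sizes=(num_nodes, num_nodes))

    deg = adj_t.sum(dim=1).to(torch.float)
    deg_inv_sqrt = deg.pow(-0.5)
    deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0.0
    adj_t = deg_inv_sqrt.view(-1, 1) * adj_t * deg_inv_sqrt.view(1, -1)

    # The second bottleneck: K repeated SpMMs over the full feature matrix.
    xs = [x]
    for _ in range(K):
        xs.append(adj_t @ xs[-1])
    return xs[1:]  # x1, ..., xK as consumed by the SIGN model
```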

Do you think that moving this computation to multiple GPUs would be helpful? If so, I can work on a C++ extension that takes the row and column indices and the feature matrix and outputs the SpMM results for all K layers.

Best

Alternatives

No response

Additional context

No response

ayasar70 avatar Jul 05 '22 21:07 ayasar70

Yes, these two operations are the bottleneck. I think one may be able to scale this onto GPUs by operating on slices, e.g., processing a subset of feature columns at a time. This should scale SIGN to both single-GPU and multi-GPU setups without the need for a multi-GPU SpMM. What do you think?
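
Something along these lines, perhaps (a rough sketch, assuming the sparse adjacency itself fits on one GPU; `chunk_size` and `device` are illustrative):

```python
import torch

def spmm_by_column_chunks(adj_t, x, chunk_size=128, device='cuda'):
    # adj_t: torch_sparse SparseTensor; x: full feature matrix kept on the CPU.
    # Only a thin column slice of x has to sit on the GPU next to adj_t.
    adj_t = adj_t.to(device)
    out = []
    for start in range(0, x.size(1), chunk_size):
        # non_blocking only helps if x lives in pinned memory.
        x_slice = x[:, start:start + chunk_size].to(device, non_blocking=True)
        out.append((adj_t @ x_slice).cpu())
    return torch.cat(out, dim=1)
```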

rusty1s avatar Jul 06 '22 20:07 rusty1s

Agreed. Column-wise slicing can be expensive, though. Also, using the same approach, a multi-GPU SpMM can be implemented. In either case, CPU-GPU bandwidth and slicing are going to be the bottleneck. I am investigating this problem with both Python/PyTorch-based and C++-based solutions, and I will update you if I observe a good speedup.
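
For the multi-GPU variant, a straightforward sketch would hand different column slices to different devices (again assuming the adjacency fits on each GPU; the device names are illustrative):

```python
import torch

def spmm_multi_gpu(adj_t, x, devices=('cuda:0', 'cuda:1')):
    # Replicate the sparse adjacency on every GPU and give each device an
    # equal share of the feature columns; results are gathered on the CPU.
    col_chunks = torch.chunk(x, len(devices), dim=1)
    out = []
    for chunk, device in zip(col_chunks, devices):
        adj_d = adj_t.to(device)
        out.append((adj_d @ chunk.to(device, non_blocking=True)).cpu())
    return torch.cat(out, dim=1)
```

As written the loop issues transfers and kernels device by device; in practice the per-device work would be overlapped (e.g., with CUDA streams or separate processes), and host-to-device bandwidth would still be the limiting factor, as noted above.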

ayasar70 avatar Jul 11 '22 23:07 ayasar70