TC-GNN_ATC23

CUDA Graph optimization

Open plant310 opened this issue 1 year ago • 2 comments

Hi, I used Nsight Systems to view the timeline after enabling CUDA Graph and found that the SpMM kernels from the forward and backward passes were clustered together, which seems to break the logic of the program. Is there a solution for this? [image]

plant310 · Sep 27 '23 09:09

Do these two SpMM functions correspond to the two-layer forward of the GCN model?

YukeWang96 · Sep 27 '23 17:09

The dependency between the combination and aggregation operations seems to be broken. I also compared the test accuracy with and without the CUDA Graph optimization, and enabling it makes the test accuracy drop to a very low level.

plant310 · Sep 28 '23 06:09
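
For readers following the thread, here is a minimal sketch of why the ordering between combination and aggregation matters in a GCN layer. It is plain Python with no GPU and is not TC-GNN code; the toy graph, the weights, and the function names (`gcn_layer_ordered`, `gcn_layer_broken`, `max_abs_diff`) are all hypothetical. It does not diagnose the issue above. It only shows what the output would look like if aggregation read its input buffer before combination had written it, and how a simple output comparison would expose that.

```python
# Illustrative sketch only: plain Python, no GPU, not the TC-GNN implementation.

def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical 3-node graph (adjacency with self-loops), features, and weights.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
W = [[1.0], [-1.0]]

def gcn_layer_ordered(A, X, W):
    xw = matmul(X, W)        # combination: X @ W
    return matmul(A, xw)     # aggregation (SpMM): A @ (X @ W)

def gcn_layer_broken(A, X, W, stale):
    # Assumed failure mode: aggregation runs first and reads a stale buffer.
    out = matmul(A, stale)
    _ = matmul(X, W)         # combination finishes too late to be used
    return out

def max_abs_diff(p, q):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for rp, rq in zip(p, q) for x, y in zip(rp, rq))

reference = gcn_layer_ordered(A, X, W)
stale_buffer = [[0.0], [0.0], [0.0]]  # hypothetical uninitialized output buffer
suspect = gcn_layer_broken(A, X, W, stale_buffer)
print(max_abs_diff(reference, suspect))
```

The same kind of check, comparing the output of a normal run against the output of a graph replay on identical inputs, is one way to confirm whether the accuracy drop comes from reordered kernels rather than from training noise.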