TC-GNN_ATC23
embedding_dim/BLK_H
Hi, I read the SpMM kernel and found that the number of tiles to process along the feature dimension is computed as embedding_dim/BLK_H. This may not work for the last GCN layer, where the embedding dimension is often not a multiple of BLK_H. Could the results be incorrect in that case?
Thanks for your interest in our work. For non-divisible cases, we usually zero-pad the dimension up to the next multiple of the block size. For instance, if the original dimension is 14 and BLK_H is 8, zero-padding extends it from 14 to (14 + 8 - 1)//8 * 8 = 16. Please keep track of which indices hold real (non-padded) values when you read your outputs.
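For anyone trying the raw-data route, here is a minimal sketch of the round-up and zero-padding described above. It is plain Python on lists of rows, not TC-GNN code; `round_up`, `pad_features`, and `unpad` are hypothetical helper names, and BLK_H = 8 is taken from the example in this thread.

```python
BLK_H = 8  # tile size from the example above; an assumption, check the kernel config


def round_up(dim, blk):
    """Smallest multiple of blk that is >= dim."""
    return (dim + blk - 1) // blk * blk


def pad_features(rows, blk):
    """Zero-pad every feature row to a multiple of blk.

    Returns the padded rows and the original width, so the caller
    can later tell real columns from padding.
    """
    if not rows:
        return [], 0
    orig_dim = len(rows[0])
    padded_dim = round_up(orig_dim, blk)
    padded = [list(r) + [0.0] * (padded_dim - orig_dim) for r in rows]
    return padded, orig_dim


def unpad(rows, orig_dim):
    """Drop the padded columns from an output matrix."""
    return [r[:orig_dim] for r in rows]


features = [[1.0] * 14 for _ in range(3)]  # 3 nodes, 14-dim embeddings
padded, orig_dim = pad_features(features, BLK_H)
print(len(padded[0]))                  # width after padding
print(len(unpad(padded, orig_dim)[0]))  # width after slicing back
```

Because the padded columns are zero, they contribute nothing to the aggregated result; slicing the output back to `orig_dim` columns recovers the values for the real features.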
Is this padding operation done in shared memory or is it done on raw data?
I suggest you try both. Padding in shared memory is more efficient at runtime but requires more kernel re-implementation.
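To make the shared-memory option more concrete, here is a rough emulation in plain Python of what the in-kernel variant would have to do: count tiles with a ceiling division and guard every load, so that out-of-range columns are filled with zeros instead of being read from global memory. This is only a sketch of the idea, not the actual TC-GNN kernel; `num_tiles` and `load_tile` are hypothetical names.

```python
def num_tiles(dim, blk):
    """Ceiling division: includes the last, partially filled tile."""
    return (dim + blk - 1) // blk


def load_tile(row, tile_idx, blk):
    """Emulate a guarded load of one tile into shared memory.

    Columns past the end of the row are filled with 0.0, which is
    the padding done on the fly rather than on the raw data.
    """
    start = tile_idx * blk
    return [row[c] if c < len(row) else 0.0 for c in range(start, start + blk)]


row = [float(i) for i in range(14)]  # one 14-dim embedding
for t in range(num_tiles(len(row), 8)):
    print(t, load_tile(row, t, 8))
```

In a real kernel the write-back would need the same bounds check, so that results for the padded columns are never stored past the end of the output row.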