gcn_clustering
pred will contain nan if k_at_hop is large
Thank you very much for your inspiring work!
As suggested in the paper, "In the testing phase, it is not necessary to keep the same configuration with the training phase.", setting k_at_hop=[20,5] in test.py
is reasonable for fast testing. But pred and loss seem to become nan if k_at_hop=[200,10]. May I ask whether this phenomenon is reproduced on your side, and why the nan occurs?
I find that it is related to here. On the test dataset, there may exist some nodes not connected to any other node, so A/D contains entries of the form 1/0. May I ask your advice on this phenomenon? What is your consideration for avoiding self-loops in the graph convolution?
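A minimal sketch of how this can happen, assuming the layer row-normalizes the adjacency matrix as D⁻¹A without self-loops (the exact normalization in the repo may differ): an isolated node has degree 0, so its row becomes 0/0 = nan, which then propagates through the GCN to pred and loss.

```python
import numpy as np

# Toy adjacency without self-loops; node 2 is isolated (degree 0).
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])

D = A.sum(axis=1, keepdims=True)  # row degrees: [[1.], [1.], [0.]]

with np.errstate(invalid="ignore"):
    A_norm = A / D                # row-normalize: isolated row -> 0/0 -> nan

print(np.isnan(A_norm[2]).all())  # the isolated node's row is all nan
```

With a larger k_at_hop, the sampled subgraphs are more likely to pull in such isolated nodes, which would explain why the nan only shows up for large hop sizes.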
Sorry for my late reply. We follow GraphSAGE in the design of the GCN. The reason we abandon the self-loop is that we want to separate the information of the node from that of its neighbors.
In your case, I think you can simply ignore such nodes (those not connected to any others). It means the local context cannot be found, so there is no need to predict their linkage with other nodes.
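Following this advice, one hedged way to implement it (a sketch, not the repo's code): mask out zero-degree nodes before normalizing, so isolated rows stay zero instead of becoming nan, and skip linkage prediction for them.

```python
import numpy as np

# Toy adjacency without self-loops; node 2 is isolated (degree 0).
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])

D = A.sum(axis=1, keepdims=True)

# Divide only where the degree is positive; isolated rows stay all-zero.
A_norm = np.divide(A, D, out=np.zeros_like(A), where=D > 0)

# Nodes to skip when predicting linkages (no local context available).
isolated = D.squeeze(-1) == 0
print(isolated)  # [False False  True]
```

Skipping the isolated nodes should not hurt clustering, since with no neighbors there is no linkage to predict anyway.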
Thank you very much for your advice! It is a good idea to ignore the nan linkages for large k_at_hop, and indeed k_at_hop=[20,5] is enough for high-performance clustering.
Hello, I'd like to ask: when k_at_hop=[20,5], the one-hop nodes should be the top 20 neighbors in the knn_graph, right? So when k_at_hop=[50,5], the one-hop nodes should be the top 50 in the knn_graph? But I found that when edge_labels is used to indicate whether a one-hop node shares the same id as the center node, the 1-values in edge_labels for k_at_hop=[50,5] appear further back than those for k_at_hop=[20,5]. Why is that?
@luzai @Zhongdao