graphwave
When there are millions of nodes, computing the full eigendecomposition of a large matrix raises a "MemoryError", so this method can't be used on large graphs.
You can use the Chebyshev option instead of the full eigendecomposition, but it's still not going to be great. I've done a fairly major refactor of this code that scales better, but there are still memory and runtime limitations:
https://github.com/bkj/graphwave
The paper does say that the complexity of this algorithm scales linearly with the number of edges in the graph, but I'm not sure that's actually right.
~ Ben
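For reference, here is a minimal sketch of the Chebyshev idea: approximate the heat-kernel column exp(-s L) δ_a with a low-degree Chebyshev polynomial so that only sparse matrix-vector products are needed, never a dense eigendecomposition. This is not GraphWave's actual implementation; the function name `chebyshev_heat_column` and the parameters `s` and `order` are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

def chebyshev_heat_column(adj, node, s=1.0, order=30):
    """Approximate exp(-s * L) @ delta_node via Chebyshev recurrences."""
    L = sp.csr_matrix(laplacian(adj))
    lmax = eigsh(L, k=1, return_eigenvectors=False)[0]  # largest eigenvalue of L
    a = lmax / 2.0  # maps the spectrum [0, lmax] onto [-1, 1]
    # Interpolate g(x) = exp(-s * a * (x + 1)) at Chebyshev nodes on [-1, 1];
    # g at the rescaled eigenvalues equals exp(-s * lambda).
    xs = np.cos(np.pi * (np.arange(order + 1) + 0.5) / (order + 1))
    coeffs = np.polynomial.chebyshev.chebfit(xs, np.exp(-s * a * (xs + 1)), order)
    n = L.shape[0]
    x0 = np.zeros(n)
    x0[node] = 1.0                                   # indicator vector delta_node
    Lt = (L - a * sp.identity(n, format="csr")) / a  # shifted/scaled Laplacian
    t_prev, t_cur = x0, Lt @ x0                      # T_0 and T_1 applied to x0
    out = coeffs[0] * t_prev + coeffs[1] * t_cur
    for k in range(2, order + 1):
        t_prev, t_cur = t_cur, 2.0 * (Lt @ t_cur) - t_prev  # T_k recurrence
        out += coeffs[k] * t_cur
    return out  # one heat-kernel column, computed with O(order * |E|) work
```

Each iteration costs one sparse matvec, i.e. O(|E|), which is where the paper's linear-in-edges claim comes from; whether the rest of the pipeline preserves that bound is a separate question, as Ben notes above.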
Hi,
Thank you for your interest in GraphWave! Indeed, the code would need to be adapted to handle very large graphs (in particular, by using better-suited libraries such as snap.py instead of networkx). Note that the Chebyshev approximation gives you an approximation of a function of the eigendecomposition, and is just a polynomial in the adjacency matrix, hence linear in the number of edges. As such, it is suitable for handling large, sparse graphs. However, I did not write the code with gigantic graphs in mind, so there might be other bottlenecks elsewhere (in the libraries used and in the code itself). I plan to work on this in the next few months, and will keep you updated on the progress!

Best regards,
Claire
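To make the complexity claim above concrete: evaluating any degree-K polynomial of a sparse matrix against a vector needs only K sparse matrix-vector products, each O(|E|), and never materializes a dense n-by-n matrix. The random graph and coefficients below are made up purely for illustration.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n, m = 100_000, 500_000                  # 100k nodes, 500k random edges
rows = rng.integers(0, n, m)
cols = rng.integers(0, n, m)
A = sp.csr_matrix((np.ones(m), (rows, cols)), shape=(n, n))

coeffs = [0.5, 0.3, 0.15, 0.05]          # p(A) = sum_k coeffs[k] * A^k
x = np.zeros(n)
x[0] = 1.0                               # indicator vector for node 0
out, power = coeffs[0] * x, x
for c in coeffs[1:]:
    power = A @ power                    # one O(|E|) matvec per degree
    out += c * power
# `out` is p(A) @ x; the dense n-by-n matrix p(A) is never formed.
```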
I think line 83 in 'graphwave/graphwave/graphwave.py' causes the sparse matrix heat[i] to be densified (specifically the 'heat[i].A'):
temp = thres(heat[i].A) # cleans up the small coefficients
It may be worth finding a way to apply the threshold without densifying the matrix, to reduce memory use.
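One way to do that, as a sketch: threshold a CSR matrix by editing its `.data` array in place and dropping the resulting explicit zeros. The helper name `sparse_threshold` and the cutoff value are hypothetical, not GraphWave's actual `thres`.

```python
import numpy as np
import scipy.sparse as sp

def sparse_threshold(mat, cutoff=1e-5):
    """Zero out entries with |value| < cutoff while staying sparse."""
    mat = sp.csr_matrix(mat, copy=True)
    mat.data[np.abs(mat.data) < cutoff] = 0.0  # touch only stored values
    mat.eliminate_zeros()                      # drop the explicit zeros
    return mat

# Line 83 could then avoid `.A` entirely:
# temp = sparse_threshold(heat[i])
```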