Price-matching iGraph's Community Detection Algorithm
Would like to see more support for Community Detection
- [ ] Optimal
- [x] Edge-Betweenness
- [x] Leading Eigenvector
- [ ] Fast-Greedy
- [x] Multi-Level (also https://github.com/igraph/igraph/issues/890)
- [ ] Walktrap
- [x] Label Propagation
- [ ] InfoMap
- [ ] Spinglass
We have some measures that are not on that list. And some that are. and some I'm not familiar with. Are you able to make a pull request with any that you feel most important?
@dschult if reading the iGraph source code can provide context, https://igraph.org/c/doc/igraph-Community.html has some documentation for https://github.com/igraph/igraph Unfortunately there is no pure Python implementation, would like to get people into trying build an alternative though for NetworkX. Some articles regarding such algorithms:
- https://www.r-bloggers.com/summary-of-community-detection-algorithms-in-igraph-0-6/
- https://stackoverflow.com/questions/9471906
- https://yoyoinwanderland.github.io/2017/08/08/Community-Detection-in-Python/
- http://francescopochetti.com/community-detection-social-networks/
Here is a "bible" on many different algorithms and its classification https://arxiv.org/pdf/0906.0612.pdf and its summary https://arxiv.org/pdf/0908.1062.pdf (referenced from https://ffff65535.com/en/q/58d739)
Hi, I want to try this topic. @dschult which algorithm do you think I should start with?
The "bible" paper listed above reports that in their comparisons an algorithm called InfoMap did best. How about starting with that one?
@dschult @guyroznb I'd be very happy to implement InfoMap, I'm quite familiar with it. Is there any ongoing work on this?
There is no ongoing work on InfoMap that I know of. But @guyroznb should update us if they are still working on it.
You guys might want to check this project https://cdlib.readthedocs.io/en/latest/reference/cd_algorithms/node_clustering.html It is written in Python and integrates iGraph with NetworkX. But some of the algorithms are not mainlined into NetworkX. For Infomap:
- https://github.com/uwescience/pyinfomap
- https://github.com/mapequation/infomap-notebooks
- https://github.com/conda-forge/python-infomap-feedstock
Took a look at pyinfomap @BrandonKMLee . The implementation there is along the same lines as what I have been thinking/drafting. I'm thinking a few differences for networkx:
-
We shouldn't rely on page rank of nodes as pyinfomap does for the undirected case. As I understand it, the Map equation for undirected graphs does not require 'teleportation' (tau = 0.15) to derive relative/exit weights (I think because the graph-as-a-Markov chain is irreducible) meaning that in the undirected case we could be more efficient potentially.
-
The 'weight' edge attribute shouldn't be hardcoded, the user should be able to run info map on any edge attribute.
-
The return type should be a list of nodesets by module and optionally the induced subgraphs.