cugraph
cugraph copied to clipboard
[FIX] Triangle counting test on asymmetric directed graph produces different results than NetX
When the email-EU-core.csv dataset is used in the python triangle counting test (python/cugraph/tests/test_triangle_count.py), cugraph and NetX produce a different number of counted triangles and edge vals.
A debug print of the counts returned by both implementations for the above dataset shows the difference:
cu count: 367140, nx count: 316383
I'm seeing some issues with symmetric graphs too. At least tcount_b2b() seems to have issues. From my limited testing, tcount_wrp() appears to be producing the correct output for this test. Test case: com-orkut, which should have 627,584,181 triangles 1 (or times 3 in case of cugraph). I can get this number if use tcount_wrp, not tcount_b2b, which is what is selected based on the avg. degree. Another test that fails is that graph permutations should return the same output, which is also currently broken for tcount_b2b.
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
NOTE: triangle counting is being refactored to use a new graph primitive, which could result in this issue getting resolved. cc @seunghwak @BradReesWork
The new triangle counting works only on undirected graphs, and it will throw an exception instead of returning an inaccurate result.
https://github.com/rapidsai/cugraph/blob/branch-22.06/cpp/src/community/triangle_count_impl.cuh#L142
And I am not sure NetworkX really supports triangle counting on directed graphs.
See
@not_implemented_for("directed")
https://github.com/networkx/networkx/blob/58b63cb57cd1747c23611ee0b46991a5be2db751/networkx/algorithms/cluster.py#L19
And
>>> import networkx as nx
>>> G = nx.DiGraph()
>>> G.add_edge(1,2)
>>> G.add_edge(2,3)
>>> G.add_edge(3,1)
>>> nx.triangles(G)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/seunghwak/Program/anaconda/envs/cuda115-cugraph/lib/python3.8/site-packages/networkx/utils/decorators.py", line 816, in func
return argmap._lazy_compile(__wrapper)(*args, **kwargs)
File "<class 'networkx.utils.decorators.argmap'> compilation 4", line 3, in argmap_triangles_1
File "/home/seunghwak/Program/anaconda/envs/cuda115-cugraph/lib/python3.8/site-packages/networkx/utils/decorators.py", line 86, in _not_implemented_for
raise nx.NetworkXNotImplemented(errmsg)
networkx.exception.NetworkXNotImplemented: not implemented for directed type
To support triangle counting on directed graphs, we should define what triangles mean in directed graphs first.
Should we consider only a->b, b->c, c->a or is a->b, a->c, b->c also a triangle?
As @seunghwak, nx.triangles doesn't support directed graphs, but nx.clustering does support directed (and weighted graphs), so triangle counting on directed graphs is needed. For the clustering coefficient on directed graphs, all triangles orientations are included:
- 8 possible triangle orientations: https://www.semanticscholar.org/paper/Clustering-in-complex-directed-networks.-Fagiolo/ded16476171d3ac42d8ff071213450356c96cdde/figure/0
- paper: https://arxiv.org/abs/physics/0612169v3
It's possible to express this algebraically with masked matrix multiplies followed by reduce.
It's unclear whether total triangles for directed triangles is needed for transitivity. This isn't tested in networkx.
Re-evaluate after #2671
The triangle counting algorithm was update in a cuGraph release and now produces the same answer as NetworkX