cugraph icon indicating copy to clipboard operation
cugraph copied to clipboard

[FIX] Triangle counting test on asymmetric directed graph produces different results than NetX

Open rlratzel opened this issue 5 years ago • 9 comments
trafficstars

When the email-EU-core.csv dataset is used in the python triangle counting test (python/cugraph/tests/test_triangle_count.py), cugraph and NetX produce a different number of counted triangles and edge vals.

A debug print of the counts returned by both implementations for the above dataset shows the difference:

cu count: 367140, nx count: 316383

rlratzel avatar Aug 07 '20 19:08 rlratzel

I'm seeing some issues with symmetric graphs too. At least tcount_b2b() seems to have issues. From my limited testing, tcount_wrp() appears to be producing the correct output for this test. Test case: com-orkut, which should have 627,584,181 triangles 1 (or times 3 in case of cugraph). I can get this number if use tcount_wrp, not tcount_b2b, which is what is selected based on the avg. degree. Another test that fails is that graph permutations should return the same output, which is also currently broken for tcount_b2b.

pgera avatar Aug 14 '20 09:08 pgera

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot] avatar Feb 25 '22 20:02 github-actions[bot]

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions[bot] avatar Apr 09 '22 20:04 github-actions[bot]

NOTE: triangle counting is being refactored to use a new graph primitive, which could result in this issue getting resolved. cc @seunghwak @BradReesWork

rlratzel avatar May 24 '22 21:05 rlratzel

The new triangle counting works only on undirected graphs, and it will throw an exception instead of returning an inaccurate result.

https://github.com/rapidsai/cugraph/blob/branch-22.06/cpp/src/community/triangle_count_impl.cuh#L142

seunghwak avatar May 24 '22 21:05 seunghwak

And I am not sure NetworkX really supports triangle counting on directed graphs.

See

@not_implemented_for("directed")

https://github.com/networkx/networkx/blob/58b63cb57cd1747c23611ee0b46991a5be2db751/networkx/algorithms/cluster.py#L19

And

>>> import networkx as nx
>>> G = nx.DiGraph()
>>> G.add_edge(1,2)
>>> G.add_edge(2,3)
>>> G.add_edge(3,1)
>>> nx.triangles(G)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/seunghwak/Program/anaconda/envs/cuda115-cugraph/lib/python3.8/site-packages/networkx/utils/decorators.py", line 816, in func
    return argmap._lazy_compile(__wrapper)(*args, **kwargs)
  File "<class 'networkx.utils.decorators.argmap'> compilation 4", line 3, in argmap_triangles_1
  File "/home/seunghwak/Program/anaconda/envs/cuda115-cugraph/lib/python3.8/site-packages/networkx/utils/decorators.py", line 86, in _not_implemented_for
    raise nx.NetworkXNotImplemented(errmsg)
networkx.exception.NetworkXNotImplemented: not implemented for directed type

seunghwak avatar May 24 '22 21:05 seunghwak

To support triangle counting on directed graphs, we should define what triangles mean in directed graphs first.

Should we consider only a->b, b->c, c->a or is a->b, a->c, b->c also a triangle?

seunghwak avatar May 24 '22 21:05 seunghwak

As @seunghwak, nx.triangles doesn't support directed graphs, but nx.clustering does support directed (and weighted graphs), so triangle counting on directed graphs is needed. For the clustering coefficient on directed graphs, all triangles orientations are included:

  • 8 possible triangle orientations: https://www.semanticscholar.org/paper/Clustering-in-complex-directed-networks.-Fagiolo/ded16476171d3ac42d8ff071213450356c96cdde/figure/0
  • paper: https://arxiv.org/abs/physics/0612169v3

It's possible to express this algebraically with masked matrix multiplies followed by reduce.

It's unclear whether total triangles for directed triangles is needed for transitivity. This isn't tested in networkx.

eriknw avatar Jun 01 '22 22:06 eriknw

Re-evaluate after #2671

BradReesWork avatar Sep 09 '22 14:09 BradReesWork

The triangle counting algorithm was update in a cuGraph release and now produces the same answer as NetworkX

BradReesWork avatar Feb 02 '23 15:02 BradReesWork