Updates to make `nx_cugraph.Graph` a drop-in replacement for `nx.Graph`, adds attrs for auto-dispatch for generators
TODO:
- Unit tests
- Improve graph update methods (`add_node()`, et al.)
- Update remaining graph classes
This updates the nx-cugraph `Graph` and `DiGraph` classes to inherit from `nx.Graph`, and adds the appropriate `cached_properties` to lazily convert and cache to a NetworkX Graph and expose the appropriate dictionaries accordingly. These changes allow a `nx_cugraph.Graph` instance to be drop-in compatible with networkx functions that are not yet supported by nx_cugraph.
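The lazy convert-and-cache idea described above can be illustrated with a small, self-contained sketch. This is not the nx-cugraph implementation: `DictGraph`, `ArrayGraph`, `degree`, and the `conversions` counter are hypothetical stand-ins, and only `functools.cached_property` is a real library feature.

```python
from functools import cached_property


class DictGraph:
    """Hypothetical stand-in for a dict-based graph such as nx.Graph."""

    def __init__(self):
        self._adj = {}

    def add_edge(self, u, v):
        self._adj.setdefault(u, {})[v] = {}
        self._adj.setdefault(v, {})[u] = {}


class ArrayGraph(DictGraph):
    """Hypothetical backend graph that stores edges as flat sequences."""

    def __init__(self, src, dst):
        # DictGraph.__init__ is skipped on purpose: the dict view is lazy.
        self.src = list(src)
        self.dst = list(dst)
        self.conversions = 0  # how many times the costly conversion ran

    @cached_property
    def _adj(self):
        # Costly conversion, run only on first access and then cached
        # in the instance __dict__ by cached_property.
        self.conversions += 1
        adj = {}
        for u, v in zip(self.src, self.dst):
            adj.setdefault(u, {})[v] = {}
            adj.setdefault(v, {})[u] = {}
        return adj

    def add_edge(self, u, v):
        self.src.append(u)
        self.dst.append(v)
        # Drop the cached dict view so it is rebuilt on the next access.
        self.__dict__.pop("_adj", None)


def degree(graph, node):
    """Generic function written only against the dict-based interface."""
    return len(graph._adj[node])


g = ArrayGraph([0, 0, 1], [1, 2, 2])
first = degree(g, 0)   # triggers the conversion
second = degree(g, 1)  # served from the cached dict view
g.add_edge(0, 3)       # mutation invalidates the cache
third = degree(g, 0)   # triggers one more conversion
```

In this toy version a mutation simply discards the cache, so the next dict-based call pays for the conversion again.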
Combined with the changes to NetworkX in this PR, which auto-dispatch generators if they return compatible backend types and allow compatible backend types to fall back to networkx, users can maximize end-to-end acceleration for their workflows without code changes.
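As a rough illustration of dispatch with fallback, the sketch below routes a call to a backend implementation when one is registered and otherwise runs the reference implementation on the same object. It is a toy model, not NetworkX's dispatch machinery: `BACKENDS`, `CALL_LOG`, `dispatchable`, `FastGraph`, and `backend_name` are all hypothetical names.

```python
import functools

# Hypothetical registry: backend name -> {function name: implementation}.
BACKENDS = {"fast": {}}
CALL_LOG = []  # records (function name, backend that handled the call)


def dispatchable(func):
    """Route a call to a backend implementation, else fall back."""

    @functools.wraps(func)
    def wrapper(graph, *args, **kwargs):
        backend = getattr(graph, "backend_name", "reference")
        impl = BACKENDS.get(backend, {}).get(func.__name__)
        if impl is None:
            # No backend implementation: run the reference code on the
            # backend object, which works if it is drop-in compatible.
            CALL_LOG.append((func.__name__, "reference"))
            return func(graph, *args, **kwargs)
        CALL_LOG.append((func.__name__, backend))
        return impl(graph, *args, **kwargs)

    return wrapper


class FastGraph:
    """Hypothetical backend graph type."""

    backend_name = "fast"

    def __init__(self, edges):
        self.edges = list(edges)


@dispatchable
def number_of_edges(graph):
    return len(graph.edges)


@dispatchable
def nodes(graph):
    return sorted({n for edge in graph.edges for n in edge})


# The backend implements only one of the two functions.
BACKENDS["fast"]["number_of_edges"] = lambda graph: len(graph.edges)

g = FastGraph([(0, 1), (1, 2)])
n_edges = number_of_edges(g)  # handled by the "fast" backend
node_list = nodes(g)          # falls back to the reference implementation
```

The fallback path only works because the backend object exposes the attributes the reference code expects, which is the role the drop-in compatibility plays here.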
import networkx as nx
import pandas as pd

# Timer is a timing context-manager helper that is not shown in this description.

edgelist_csv = "/datasets/cugraph/csv/directed/cit-Patents.csv"
edgelist_df = pd.read_csv(edgelist_csv, sep=" ", names=["src", "dst"], dtype="int32")

with Timer("from_pandas_edgelist"):
    G = nx.from_pandas_edgelist(
        edgelist_df, source="src", target="dst", create_using=nx.DiGraph)
    print(type(G))

with Timer("number of nodes and edges"):
    print(f"{G.number_of_nodes()=}, {G.number_of_edges()=}")

with Timer("pagerank"):
    pr = nx.pagerank(G)

with Timer("coloring"):
    c1 = nx.coloring.greedy_color(G)

with Timer("coloring (again)"):
    c2 = nx.coloring.greedy_color(G)

# Adding an edge to a previously unseen node also adds that node.
with Timer("adding a node"):
    G.add_edge(0, (3.14159, "string_in_tuple"))
print(type(G))
print(f"{G.number_of_nodes()=}, {G.number_of_edges()=}")

with Timer("re-running pagerank"):
    pr2 = nx.pagerank(G)
print(f"new vs. orig nodes: {pr2.keys() - pr.keys()}")

with Timer("pad_graph (this mutates the input graph)"):
    cc = nx.coloring.equitable_coloring.pad_graph(G, 11)
print(type(G))
print(f"{G.number_of_nodes()=}, {G.number_of_edges()=}")

with Timer("re-running pagerank"):
    pr3 = nx.pagerank(G)
print(f"new vs. orig nodes: {pr3.keys() - pr.keys()}")

Timer.print_total()
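The script relies on a `Timer` helper whose definition is not included here. A minimal sketch that would produce output in the same shape as the logs in this description (a `name...` line, a `Done in:` line, and a final `Total time:` line) could look like the following. The implementation is an assumption, not the helper actually used for these runs.

```python
import time
from datetime import timedelta


class Timer:
    """Context manager that prints and accumulates elapsed wall-clock time.

    Assumed implementation; the helper used in the demo script is not shown.
    """

    total = timedelta(0)  # running total shared by all Timer instances

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        print(f"{self.name}...")
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        elapsed = timedelta(seconds=time.perf_counter() - self._start)
        Timer.total += elapsed
        print(f"Done in: {elapsed}")
        return False  # never suppress exceptions raised inside the block

    @classmethod
    def print_total(cls):
        print(f"Total time: {cls.total}")


with Timer("sleeping briefly"):
    time.sleep(0.01)
Timer.print_total()
```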
No backends used:
(nx) root@8546eec3d49d:~# python zcc_demo.py
from_pandas_edgelist...
Done in: 0:00:50.219987
<class 'networkx.classes.digraph.DiGraph'>
number of nodes and edges...
G.number_of_nodes()=3774768, G.number_of_edges()=16518948
Done in: 0:00:01.851362
pagerank...
Done in: 0:01:10.388206
coloring...
Done in: 0:00:13.802888
coloring (again)...
Done in: 0:00:13.793485
adding a node...
Done in: 0:00:00.000018
<class 'networkx.classes.digraph.DiGraph'>
G.number_of_nodes()=3774769, G.number_of_edges()=16518949
re-running pagerank...
Done in: 0:01:03.532062
new vs. orig nodes: {(3.14159, 'string_in_tuple')}
pad_graph (this mutates the input graph)...
Done in: 0:00:00.000764
<class 'networkx.classes.digraph.DiGraph'>
G.number_of_nodes()=3774771, G.number_of_edges()=16518950
re-running pagerank...
Done in: 0:01:16.790938
new vs. orig nodes: {(3.14159, 'string_in_tuple'), 3774769, 3774770}
Total time: 0:04:50.379710
nx-cugraph backend used. nx-cugraph does not yet support `coloring.greedy_color()` or `nx.coloring.equitable_coloring.pad_graph()`. Note that the first call to coloring includes the conversion to a networkx Graph, but the second uses the cached conversion:
(nx) root@8546eec3d49d:~# NETWORKX_BACKEND_PRIORITY=cugraph python zcc_demo.py
from_pandas_edgelist...
Done in: 0:00:00.664462
<class 'nx_cugraph.classes.digraph.DiGraph'>
number of nodes and edges...
G.number_of_nodes()=3774768, G.number_of_edges()=16518948
Done in: 0:00:00.000008
pagerank...
Done in: 0:00:03.741143
coloring...
Done in: 0:01:11.706015
coloring (again)...
Done in: 0:00:11.752219
adding a node...
Done in: 0:00:13.415563
<class 'nx_cugraph.classes.digraph.DiGraph'>
G.number_of_nodes()=3774769, G.number_of_edges()=16518949
re-running pagerank...
Done in: 0:00:00.878451
new vs. orig nodes: {(3.14159, 'string_in_tuple')}
pad_graph (this mutates the input graph)...
Done in: 0:00:13.069187
<class 'nx_cugraph.classes.digraph.DiGraph'>
G.number_of_nodes()=3774771, G.number_of_edges()=16518950
re-running pagerank...
Done in: 0:00:00.896314
new vs. orig nodes: {3774769, 3774770, (3.14159, 'string_in_tuple')}
Total time: 0:01:56.123361
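To compare the two runs step by step, the timings printed above can be turned into ratios. The sketch below only copies values from the two logs; the `parse` helper and the dictionary layout are illustrative choices.

```python
from datetime import timedelta


def parse(hms):
    """Convert an 'H:MM:SS.ffffff' string from the logs into a timedelta."""
    hours, minutes, seconds = hms.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# (no backend, nx-cugraph backend), copied from the two runs above.
timings = {
    "from_pandas_edgelist": ("0:00:50.219987", "0:00:00.664462"),
    "pagerank": ("0:01:10.388206", "0:00:03.741143"),
    "coloring": ("0:00:13.802888", "0:01:11.706015"),
    "coloring (again)": ("0:00:13.793485", "0:00:11.752219"),
    "adding a node": ("0:00:00.000018", "0:00:13.415563"),
    "total": ("0:04:50.379710", "0:01:56.123361"),
}

# Dividing two timedeltas yields a float; values above 1 mean the
# nx-cugraph run was faster for that step.
speedups = {
    name: parse(baseline) / parse(backend)
    for name, (baseline, backend) in timings.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

The total works out to roughly 2.5x in favor of the nx-cugraph run, while the first coloring call and the graph mutation are slower with the backend because they include conversion work.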
Also note that, with debug logging enabled, you can see calls made from within networkx functions being dispatched appropriately:
pad_graph (this mutates the input graph)...
DEBUG:networkx.utils.backends:no backends are available to handle the call to `pad_graph` with graph types {'cugraph'}
DEBUG:networkx.utils.backends:falling back to backend 'networkx' for call to `pad_graph' with args: (<nx_cugraph.classes.digraph.DiGraph object at 0x7efb84138d60>, 11), kwargs: {}
DEBUG:networkx.utils.backends:using backend 'cugraph' for call to `complete_graph' with args: (2, None), kwargs: {}
DEBUG:networkx.utils.backends:using backend 'cugraph' for call to `relabel_nodes' with args: (<nx_cugraph.classes.graph.Graph object at 0x7efb84139c60>, {0: 3774769, 1: 3774770}, True), kwargs: {}
Done in: 0:00:13.226258
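One way to turn on this output, assuming the messages go through Python's standard `logging` module under the logger name visible in the lines above (`networkx.utils.backends`), is to raise that logger's level before running the workflow:

```python
import logging

# Install a default handler on the root logger. Its default format,
# "%(levelname)s:%(name)s:%(message)s", matches the lines shown above.
logging.basicConfig(level=logging.WARNING)

# Enable DEBUG only for the dispatch logger (name taken from the log
# output above) so other libraries stay quiet.
backend_logger = logging.getLogger("networkx.utils.backends")
backend_logger.setLevel(logging.DEBUG)
```

Placing these lines at the top of the script, before any networkx calls, should be enough for the dispatch decisions to appear on stderr.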