`all_node_cuts()` performance degrades much faster than predicted by Kanevsky's minimum cut algorithm
NetworkX vs igraph. igraph implements Kanevsky's too.
import timeit
import igraph as ig
import networkx as nx
def bigO(g):
return (2 ** nx.node_connectivity(g)) * (len(g) ** 3)
def runit(i):
g = nx.grid_2d_graph(i, i)
O = bigO(g)
h = ig.Graph.from_networkx(g)
t = timeit.timeit(lambda: list(nx.all_node_cuts(g)), number=1)
t2 = timeit.timeit(lambda: ig.GraphBase.minimum_size_separators(h), number=1)
return dict(i=i, k=nx.node_connectivity(g), n=len(g), O=O, nx_t=t, ig_t=t2)
if __name__ == "__main__":
results = []
for i in range(2, 12):
r = runit(i)
results.append(r)
print(r)
Results dataframe w/ percent change calculated for nx_t and ig_t (* 100, b/c humans). Rate of growth is the interesting bit, b/c NetworkX is 100% Python, and igraph is C w/ a python binding. Raw data attached results.json.
i k n O nx_t ig_t nx_t_pct_chg ig_t_pct_chg
0 2 2 4 256 0.004447 0.000238 NaN NaN
1 3 2 9 2916 0.033335 0.000569 649.551704 138.733327
2 4 2 16 16384 0.101915 0.001271 205.725524 123.537768
3 5 2 25 62500 0.143263 0.002748 40.571573 116.194569
4 6 2 36 186624 0.198543 0.006185 38.586098 125.084930
5 7 2 49 470596 0.431762 0.011818 117.465543 91.083902
6 8 2 64 1048576 1.275459 0.022733 195.407745 92.349527
7 9 2 81 2125764 6.076753 0.039182 376.436524 72.357797
8 10 2 100 4000000 32.342393 0.066873 432.231473 70.674632
9 11 2 121 7086244 180.278205 0.109162 457.405282 63.236688
Clearly these implementations are following different growth curves. Kanevsky's algorithm should be growing at $\mathcal{O}(2^{k} n^3)$. NetworkX's growth becomes geometric (or worse).
The cause seems to be for some graph sizes, the number of members of the antichains(L) set explodes---billions of members, in some case---but I have not yet figured out why that happens.
Originally posted by @Cerebus in https://github.com/networkx/networkx/discussions/7989#discussioncomment-12923983
I think this is a bug:
>>> import networkx as nx
>>> import igraph as ig
>>> g = nx.Graph([(1, 2), (2, 3)])
>>> list(nx.all_node_cuts(g))
[{2}]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[[1]]
>>> g = nx.Graph([(1, 2)])
>>> list(nx.all_node_cuts(g))
[{1}, {2}]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[]
There's no vertex cut between neighboring nodes.
@dschult
Also, this seems weird:
>>> g = nx.Graph()
>>> g.add_node(1)
>>> list(nx.all_node_cuts(g))
[set()]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[]
There have been a number of corner case problems with the all_node_cuts function over the years: #6558, #3039, #1976
The PR #3039 represented a substantial change that corrected "multiple incorrect implementation in the original code " and also slowed down the code by quite a bit.
The two cases you mention above (node_cuts for very small graphs) seem like corner cases, but could be indicative of mistakes in the larger implementation. They erode confidence in the rest of the code too. Looks like we need a deep dive into this algorithm and implementation.
I'm not able to reproduce this issue. This test passes:
def test_repro():
G = nx.Graph([(1, 2)])
assert nx.density(G) == 1
assert list(nx.all_node_cuts(G)) == []
all_node_cuts should exit early here:
# Address some corner cases first.
# For complete Graphs
if nx.density(G) == 1:
yield from ()
return
What value do you get when you call nx.density(G)?
I think the first comment above (with maybe bugs) runs OK on my machine -- as also reported by @amcandio
>>> import networkx as nx
>>> import igraph as ig
>>> g = nx.Graph([(1, 2), (2, 3)])
>>> list(nx.all_node_cuts(g))
[{2}]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[[1]]
>>> g = nx.Graph([(1, 2)])
>>> list(nx.all_node_cuts(g))
[]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[]
I think the result [{2}] vs [[1]] actually agrees between nx and ig because, IIUC, they index their nodes by integers starting at 0. So the nx "2" is an ig "1". and a set vs a list is an API difference I assume...
I do see the result in the following comment that a one node graph reports an empty set as a node cut. I'm not sure what a return value of an empty set would even mean -- I guess you might get that when the original graph is not connected... definitely strange...
I put up a property-based test:
def _random_connected_graph(n, prob, seed):
return nx.compose(
nx.random_labeled_tree(n, seed=seed),
nx.erdos_renyi_graph(n, prob, seed=seed),
)
@settings(print_blob=True)
@given(
graph_st(
_random_connected_graph,
n=st.integers(5, 10),
prob=st.floats(0, 1)
)
)
def test_random_connected_graph(G):
for flow_func in flow_funcs:
kwargs = {"flow_func": flow_func}
edge_cut = nx.minimum_edge_cut(G, **kwargs)
H = G.copy()
H.remove_edges_from(edge_cut)
assert not nx.is_connected(H)
node_cut = nx.minimum_node_cut(G, **kwargs)
H = G.copy()
H.remove_nodes_from(node_cut)
assert not nx.is_connected(H), f"Failed for {flow_func.__name__} with {G}"
Hypothesis is able to find counter-examples. We seem to be strugling with complete graphs:
> assert not nx.is_connected(H), f"Failed for {flow_func.__name__} with {G}"
E AssertionError: Failed for boykov_kolmogorov with Graph with 5 nodes and 10 edges
E assert not True
E + where True = <function is_connected at 0x7f969999f9c0>(<networkx.classes.graph.Graph object at 0x7f9699dd5850>)
E + where <function is_connected at 0x7f969999f9c0> = nx.is_connected
E Falsifying example: test_random_connected_graph(
E G=Graph()
E G.add_node(0)
E G.add_node(1)
E G.add_node(2)
E G.add_node(3)
E G.add_node(4)
E G.add_edge(0, 1)
E G.add_edge(0, 2)
E G.add_edge(0, 3)
E G.add_edge(0, 4)
E G.add_edge(1, 2)
E G.add_edge(1, 3)
E G.add_edge(1, 4)
E G.add_edge(2, 3)
E G.add_edge(2, 4)
E G.add_edge(3, 4)
E ,
E )
E Explanation:
E These lines were always and only run by failing examples:
E /home/amcandio/workspace/networkx/networkx/algorithms/connectivity/cuts.py:446
E /home/amcandio/workspace/networkx/networkx/algorithms/connectivity/cuts.py:447
E /home/amcandio/workspace/networkx/networkx/algorithms/connectivity/cuts.py:610
E /home/amcandio/workspace/networkx/networkx/classes/graph.py:436
E /home/amcandio/workspace/networkx/networkx/classes/reportviews.py:532
E
E You can reproduce this example by temporarily adding @reproduce_failure('6.131.9', b'AEEFKD/wAAAAAAAAQQBBAEEA') as a decorator on your test case
For this graph we detect {1, 2, 3, 4} as cut, which is wrong