Weighted directed label propagation gets stuck (or is very slow)
This was found by the fuzzer. Label propagation running on directed weighted graphs can time out even on some small graphs. It gets stuck in some part of the code that is not even interruptible.
https://oss-fuzz.com/testcase-detail/5755444401340416 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=67840
We need to distill this down to a manageable size testcase.
@vtraag Would you be able to look at this if I produce a small testcase that is in a convenient format?
This is a small example that never completes (likely goes in an infinite loop). Weight choice is critical for this to work.
#include <igraph.h>
int main(void) {
igraph_vector_t weights;
igraph_vector_int_t membership;
igraph_t graph;
igraph_small(&graph, 0, IGRAPH_DIRECTED,
0, 1, 2, 3, 4, 5, 6, 7, 5, 8, 1, 4, 3, 5, 9, 1, 7, 4, 8, 0, 9, 5, 6, 1, 2, 4,
-1);
igraph_vector_init_int_end(&weights, -1,
19, 46, 30, 2, 56, 52, 16, 50, 48, 2, 42, 51, 58,
-1);
igraph_vector_int_init(&membership, 0);
igraph_community_label_propagation(&graph, &membership, IGRAPH_OUT, &weights, NULL, NULL, IGRAPH_LPA_DOMINANCE);
igraph_vector_int_destroy(&membership);
igraph_vector_destroy(&weights);
igraph_destroy(&graph);
return 0;
}
A visualization of the graph with edge weights:
SCCs:
I'm guessing that the labels must be running around in circles indefinitely. Something about the graph structure and choice of weights makes it so that the probability of reaching a stable state is very low, or perhaps it is plainly impossible to have a stable state.
@vtraag Do you know if directed label propagation has been studied or whether this is an ad-hoc extension in igraph? Are there any results on whether the propagation procedure will stabilize in the directed case? I fear that fixing this may not be feasible ... Perhaps the best approach is to make the function interruptible and document it as a potential issue? I'll leave it to you to investigate and make a decision since you've recently worked with label propagation.
I'll look at it later. Perhaps @lovre has some useful input?
I'm guessing that the labels must be running around in circles indefinitely.
As far as I know this is a possibility when the labels are updated synchronously, but an easy way out is to update the labels asynchronously and to use a different, randomized update order in every cycle. But it seems like it doesn't work in this case; this is a potential cycle that I'm getting, but there are many others:
3 7 3 3 7 3 7 7 3 10
3 7 3 3 7 10 7 7 3 10
3 7 3 3 7 10 7 7 10 10
10 10 3 3 3 3 7 7 10 10
3 10 3 3 3 3 7 7 3 10
It looks like the algorithm is not able to reach a configuration where each vertex is set to the dominant color of its neighbors. I've let it run for a few seconds and then counted the occurrence of each configuration the output, yielding this (first number is the count), leaving out the initial transient:
49242 10 10 3 3 3 10 7 7 10 10
49198 10 10 3 3 3 3 7 7 10 10
49306 10 10 3 3 3 3 7 7 3 10
49095 10 10 3 3 7 10 7 7 10 10
49558 10 7 3 3 7 10 7 7 10 10
49286 3 10 3 3 3 3 7 7 3 10
49274 3 7 3 3 3 3 7 7 3 10
49166 3 7 3 3 7 10 7 7 10 10
49566 3 7 3 3 7 10 7 7 3 10
49272 3 7 3 3 7 3 7 7 3 10
I've realigned the columns to make it easier to see which vertices are changing.
There are two concerns with the current label propagation implementation:
- It is not interruptible at all -- I think we should simply place an
IGRAPH_ALLOW_INTERRUPTION()call in every 1000 iteration or so. - The implementation seems to call RNG_BEGIN() and RNG_END() for every iteration; I think these should be moved outside the main loop so they get called only once.
- The implementation seems to call RNG_BEGIN() and RNG_END() for every iteration; I think these should be moved outside the main loop so they get called only once.
The problem is that we also call vector_shuffle() within the main loop, which already includes an RNG_BEGIN/END pair. Such pairs shouldn't be nested, as they reduce randomness by resetting the random state unnecessarily.
Ideally, we could get rid of all the RNG_BEGIN/END for release 1.0, and move them to the R interface (CC @krlmlr )
The problem is that we also call vector_shuffle() within the main loop, which already includes an RNG_BEGIN/END pair.
aaah. This is something that should be fixed in 1.0 (if we decide not to get rid of these calls -- it might prove to be more tricky than expected). IMHO if we decide to keep these, then only top-level igraph functions that are not expected to be called by other top-level igraph functions should ever call RNG_BEGIN() and RNG_END(), so igraph_vector_shuffle() should not ever call it.
Perhaps best to wait to merge my fast label propagation PR (#2451) before diving in further?
I propose adding interruptibility for 0.10, then merging @vtraag's PR, and continuing work on the 1.0 branch. If we find a good fix, it can always be backported to 0.10 if need be.
I added interruption support, can you please resolve conflicts and merge #2451, @vtraag, if you think it's ready?
Now that #2451 is merged, this is a good time to come back to this issue. Do you have any ideas here @vtraag ?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
@vtraag, any suggestions on how to address this?
No particular direct suggestion. You did confirm that this particular problem still exists in the most recent version?
You did confirm that this particular problem still exists in the most recent version?
Yes. See https://github.com/igraph/igraph/issues/2561#issuecomment-2038169153, now updated for develop.