SetReplace Performance: Slow matching for large rules

Describe the bug The following takes more than 9 seconds even though there are only 8! = 40320 cases to enumerate:

In[] := WolframModel[{{1, 2}, {3, 3}, {2, 2}, {3, 1}, {1, 2}, {3, 3}, {2, 
    1}, {2, 3}} -> {}, {{0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 
   0}, {0, 0}, {0, 0}}, Method -> "LowLevel"]

Expected behavior It should take at least 10 rule elements to make it run on the order of seconds.

Version (please complete the following information):

OS: macOS 10.15.2 (19C57)
Wolfram Language Version: "12.1.0 for Mac OS X x86 (64-bit) (December 21, 2019)"
SetReplace git sha: ba9234cd62794ef55cd9473320ec7eeb9473e21e

Jan 06 '20 04:01 maxitg

I experience slowness with this rule:

WolframModel[{
    {
     {5,6,9,8,5,6},
     {1,2},{2,1},{2,4},{4,2},{4,3},{3,4},{3,1},{1,3},
     {4,6},{6,4},{6,5},{5,6},{5,3},{3,5},
     {6,9},{9,6},{9,8},{8,9},{8,5},{5,8},
     {6,7},{7,6},{7,10},{10,7},{10,9},{9,10},
     {7,11},{11,7},{11,12},{12,11},{12,10},{10,12}
    } -> {
     {4,111,7,6,4,111,4},
     {7,111,4,6,7,111,7},
     {4,111,7,6,4,111},
     {1,2},{2,1},{2,4},{4,2},{4,3},{3,4},{3,1},{1,3},
     {4,6},{6,4},{6,5},{5,6},{5,3},{3,5},
     {6,9},{9,6},{9,8},{8,9},{8,5},{5,8},
     {6,7},{7,6},{7,10},{10,7},{10,9},{9,10},
     {7,11},{11,7},{11,12},{12,11},{12,10},{10,12},
     {4,111},{111,4},{111,7},{7,111}
    },
    {
     {6,7,10,9,6,7,6},
     {1,2},{2,1},{2,4},{4,2},{4,3},{3,4},{3,1},{1,3},
     {4,6},{6,4},{6,5},{5,6},{5,3},{3,5},
     {6,9},{9,6},{9,8},{8,9},{8,5},{5,8},
     {6,7},{7,6},{7,10},{10,7},{10,9},{9,10}
    } -> {
     {4,111,7,6,4,111,4},
     {1,2},{2,1},{2,4},{4,2},{4,3},{3,4},{3,1},{1,3},
     {4,6},{6,4},{6,5},{5,6},{5,3},{3,5},
     {6,9},{9,6},{9,8},{8,9},{8,5},{5,8},
     {6,7},{7,6},{7,10},{10,7},{10,9},{9,10},
     {4,111},{111,4},{111,7},{7,111}
    },
    {{1,2,3,4,1},{1,2},{2,1},{2,3},{3,2},{3,4},{4,3}}->
        {{11,12,2,1,11},{1,2},{2,1},{2,3},{3,2},{3,4},{4,3},{11,1},{1,11},{11,12},{12,11},{12,2},{2,12}}
},{
    {1,2,3,4,1},{2,3,4,1,2},{3,4,1,2,3},{4,1,2,3,4},
    {1,2,3,4,1,2},{2,3,4,1,2,3},{3,4,1,2,3,4},{4,1,2,3,4,1},
    {1,2},{2,1},{2,3},{3,2},{3,4},{4,3},{4,1},{1,4}
}
, 32, "FinalStatePlot"]

Apr 20 '20 01:04 freemin7

This runs in 2.5 seconds on my dual core laptop. Should this issue be closed?

Oct 12 '20 14:10 aokellermann

@aokellermann What version of Mathematica are you running? Also which of the example runs in 2.5 seconds?

Oct 12 '20 19:10 freemin7

@aokellermann What version of Mathematica are you running?

I'm on "12.1.0 for Linux x86 (64-bit) (February 19, 2020)"

Also which of the example runs in 2.5 seconds?

the one in the issue description. yours runs in 150 seconds on my computer

Oct 12 '20 20:10 aokellermann

This runs in 2.5 seconds on my dual core laptop. Should this issue be closed?

Do we completely understand the time complexity here? It seems to me it should be something like

In[] := 8! * (* there are 8! matches *)
        8 * (* we try matching each expression to each pattern in the rule input, see below *)
        8 * (* each match has length 8  *)
        Log[8!] (* each newly created match needs to be attempted to be put in the map *) // Round
Out[] = 27364966

So, if the above is correct, 2.5 seconds seems reasonable. However, I think we can get rid of the second line in the above, at least during the initial parsing of the init.

Note that we encounter each match 8 times here. That's because when we start match-finding recursions, we consider 8 * 8 combinations, each rule input * each init expression.

We enumerate rule inputs here:

https://github.com/maxitg/SetReplace/blob/2278ed1d916d6e07be5365a8191538878166169c/libSetReplace/Match.cpp#L325-L334

And expressions here:

https://github.com/maxitg/SetReplace/blob/2278ed1d916d6e07be5365a8191538878166169c/libSetReplace/Match.cpp#L336-L351

However, when processing the initial condition (and perhaps in some other cases), we don't need to consider each rule input. Since we are enumerating all expressions anyway, and each match will have to match some of these expressions to the first rule input, we will find all matches if we replace the first function above to:

void addMatchesForRule(const std::vector<ExpressionID>& expressionIDs, 
                       const RuleID& ruleID, 
                       const std::function<bool()>& shouldAbort) { 
  const auto& ruleInputExpressions = rules_[ruleID].inputs; 
  const Match emptyMatch{ruleID, std::vector<ExpressionID>(ruleInputExpressions.size(), -1)}; 
  completeMatchesStartingWithInput( 
      emptyMatch, ruleInputExpressions, rules_[ruleID].eventSelectionFunction, 0, expressionIDs, shouldAbort); 
}

This should make this code ~8 times faster.

Oct 12 '20 20:10 maxitg