alt-ergo Unfair matching

While looking for a minimal example that triggers a regression introduced by my patch in #1102, I ran into another issue (as always...) with our instantiation strategy. A relatively short example:

type 'a t

type a

logic h : 'a t -> ('a t) t

logic id : 'a t

logic rename : a t, a t -> a t

logic f : a t -> a t

logic p : a t, a t, a t -> a t

(* axiom bad : h(id) = id *)

axiom good1 :
  forall x:a t. rename(x, id) = x

axiom good2 :
  forall x:a t.
  forall i:a t.
  p(x, i, x) = f(p(x, i, rename(x, id)))

logic x : a t

goal g : p(x, id, x) = f(p(x, id, x))

(It's not easy to simplify further this example, I tried hard).

If you run AE on this file, you'll got a Valid answer quickly, and indeed, the goal is valid. But if you uncomment the bad axiom, AE loops infinitely, producing new monomorphization of the term id with types a t, a t t, a t t t, ....

It's a bit sad that a completely useless axiom produces an infinitely loop so easily.

While generating new multi-triggers, Alt-Ergo filters the multi-triggers to limit their number. In particular, AE produces at most nb_triggers multi-triggers per axiom where nb_triggers is a field of the matching environment (defined in Util.ml). The initial value of nb_triggers is given by Options.get_nb_triggers () and its default value is 2.
The limit nb_triggers is adaptive and can be increased with a factor depending of the matching mode (normal, greedy and greedier). In fact, this limit increases if the solver cannot progress after some matching rounds.

Before any filtering, the multi-triggers generated by AE for the axioms are:

For bad, we got the multi-triggers id and h(id).
For good1, we got x and rename(x, id).
For good2, we got i, x, rename(x, id), p(x, i, x), f(p(x, i, rename(x, id))).

Now AE removes the variables and keep only two multi-triggers, so we got:

For bad, we got the multi-triggers h(id) and id.
For good1, we got rename(x, id).
For good2, we got rename(x, id), f(p(x, i, rename(x, id))).

Unfortunately, AE threw the crucial multi-trigger p(x, i, x). Since it keeps the multi-trigger id, the matching will progress with useless new terms and so the adaptive limit nb_triggers will never increase. If you call AE with the option --nb_triggers 3, it succeeds!

Another way to solve the issue consists in removing the ite in the Auto strategy (see Satml_frontend).

    | Auto ->
      List.fold_left
        (fun updated (mconf, debug, greedy_round, frugal) ->
           if updated then updated
           (* TODO: stop here with an exception *)
           else
             let sa, inst_quantif =
               instantiation_context env ~greedy_round ~frugal in
             do_instantiation env sa inst_quantif mconf debug ~dec_lvl
        )
        false
        (match Options.get_instantiation_heuristic () with
         | INormal ->
           [ frugal_mconf (), "frugal-inst", false, true ;
             normal_mconf (), "normal-inst", false, false ]
         | IAuto ->
           [ frugal_mconf (), "frugal-inst", false, true ;
             normal_mconf (), "normal-inst", false, false;
             greedier_mconf (), "greedier-inst", true, false]
         | IGreedy ->
           [ greedy_mconf (), "greedy-inst", true , false;
             greedier_mconf (), "greedier-inst", true, false])

May 02 '24 17:05 Halbaroth

In my opinion there are two issues here:

The fact that a polymorphic constant like id can flood the SAT solver with useless monomorphization of itself isn't a desired behaviour.
We limit the number of multi-triggers to limit the number of instantiations per axiom during a matching round. I believe we should replace it by a limitation on the number of instantiations in the module Matching. So we could keep all the triggers generated by Alt-Ergo, sort them with an appropriate weight function and try to match them in this order until we reach the maximal number of instantiations. I don't know if we should use a different limitation per axiom or the same one for all of them.

May 03 '24 06:05 Halbaroth

Does the problem also happen with the Tableaux solver? I remember that it seemed to behave better wrt instantiation in some tests I made while looking at the differences a while back.

Limiting the number of triggers per axiom is indeed a bit weird, and your proposed solution of limiting the number of instantiations per axiom looks better, but the devil's in the details (for instance we should probably do some sort of round-robin strategy on the triggers we use for a given axiom, to ensure all triggers ultimately have a chance to be checked in some later matching round). It is likely a deep rabbit hole that'll lead to a game of whack-a-mole with regressions, so it's probably safer to explore post-release (realistically as part of the matching/instantiation cleanup that is on the TODO list).

May 03 '24 07:05 bclement-ocp

for instance we should probably do some sort of round-robin strategy on the triggers we use for a given axiom, to ensure all triggers ultimately have a chance to be checked in some later matching round

It's exactly what I had in mind but I didn't find a way to explain it. Basically, we have to be fair with both triggers and axioms.

realistically as part of the matching/instantiation cleanup that is on the TODO list

Don't worry. I don't plan to work on it before the release. I open this issue as a remainder. Actually, I use the mechanism of GitHub to pin issues to releases.

Does the problem also happen with the Tableaux solver? I remember that it seemed to behave better wrt instantiation in some tests I made while looking at the differences a while back.

Good catch! It works with the option --sat-solver Tableaux.

May 03 '24 08:05 Halbaroth

Good catch! It works with the option --sat-solver Tableaux.

Ha! That confirms an intuition I had that we should be looking precisely at the difference in the way the Tableaux and CDCL solver do matching/instantiation and see if we can reproduce the behavior of the Tableaux solver in the CDCL solver.

(For context, after #1041 I expected that disabling dynamic variable ordering in the CDCL-Tableaux solver would provide identical behavior to the Tableaux solver, but it was not the case in particular due to different instantiation strategies. I have hopes we can reduce the gap between the two solvers by looking at the difference between their behavior wrt instantiation and having the CDCL-Tableaux solver match the behavior of the Tableaux solver — unless it drastically slows the solver, which is a possibility)

May 03 '24 08:05 bclement-ocp