Unfair matching
While looking for a minimal example that triggers a regression introduced by my patch in #1102, I ran into another issue (as always...) with our instantiation strategy. A relatively short example:
type 'a t
type a
logic h : 'a t -> ('a t) t
logic id : 'a t
logic rename : a t, a t -> a t
logic f : a t -> a t
logic p : a t, a t, a t -> a t
(* axiom bad : h(id) = id *)
axiom good1 :
forall x:a t. rename(x, id) = x
axiom good2 :
forall x:a t.
forall i:a t.
p(x, i, x) = f(p(x, i, rename(x, id)))
logic x : a t
goal g : p(x, id, x) = f(p(x, id, x))
(It's not easy to simplify further this example, I tried hard).
If you run AE on this file, you'll got a Valid answer quickly, and indeed, the goal is valid.
But if you uncomment the bad axiom, AE loops infinitely, producing new monomorphization of the term id with types a t, a t t, a t t t, ....
It's a bit sad that a completely useless axiom produces an infinitely loop so easily.
-
While generating new multi-triggers, Alt-Ergo filters the multi-triggers to limit their number. In particular, AE produces at most
nb_triggersmulti-triggers per axiom wherenb_triggersis a field of the matching environment (defined inUtil.ml). The initial value ofnb_triggersis given byOptions.get_nb_triggers ()and its default value is2. -
The limit
nb_triggersis adaptive and can be increased with a factor depending of the matching mode (normal,greedyandgreedier). In fact, this limit increases if the solver cannot progress after some matching rounds.
Before any filtering, the multi-triggers generated by AE for the axioms are:
- For
bad, we got the multi-triggersidandh(id). - For
good1, we gotxandrename(x, id). - For
good2, we goti,x,rename(x, id),p(x, i, x),f(p(x, i, rename(x, id))).
Now AE removes the variables and keep only two multi-triggers, so we got:
- For
bad, we got the multi-triggersh(id)andid. - For
good1, we gotrename(x, id). - For
good2, we gotrename(x, id),f(p(x, i, rename(x, id))).
Unfortunately, AE threw the crucial multi-trigger p(x, i, x). Since it keeps the multi-trigger id, the matching will progress with useless new terms and so the adaptive limit nb_triggers will never increase. If you call AE with the option --nb_triggers 3, it succeeds!
Another way to solve the issue consists in removing the ite in the Auto strategy (see Satml_frontend).
| Auto ->
List.fold_left
(fun updated (mconf, debug, greedy_round, frugal) ->
if updated then updated
(* TODO: stop here with an exception *)
else
let sa, inst_quantif =
instantiation_context env ~greedy_round ~frugal in
do_instantiation env sa inst_quantif mconf debug ~dec_lvl
)
false
(match Options.get_instantiation_heuristic () with
| INormal ->
[ frugal_mconf (), "frugal-inst", false, true ;
normal_mconf (), "normal-inst", false, false ]
| IAuto ->
[ frugal_mconf (), "frugal-inst", false, true ;
normal_mconf (), "normal-inst", false, false;
greedier_mconf (), "greedier-inst", true, false]
| IGreedy ->
[ greedy_mconf (), "greedy-inst", true , false;
greedier_mconf (), "greedier-inst", true, false])
In my opinion there are two issues here:
- The fact that a polymorphic constant like
idcan flood the SAT solver with useless monomorphization of itself isn't a desired behaviour. - We limit the number of multi-triggers to limit the number of instantiations per axiom during a matching round. I believe we should replace it by a limitation on the number of instantiations in the module
Matching. So we could keep all the triggers generated by Alt-Ergo, sort them with an appropriate weight function and try to match them in this order until we reach the maximal number of instantiations. I don't know if we should use a different limitation per axiom or the same one for all of them.
Does the problem also happen with the Tableaux solver? I remember that it seemed to behave better wrt instantiation in some tests I made while looking at the differences a while back.
Limiting the number of triggers per axiom is indeed a bit weird, and your proposed solution of limiting the number of instantiations per axiom looks better, but the devil's in the details (for instance we should probably do some sort of round-robin strategy on the triggers we use for a given axiom, to ensure all triggers ultimately have a chance to be checked in some later matching round). It is likely a deep rabbit hole that'll lead to a game of whack-a-mole with regressions, so it's probably safer to explore post-release (realistically as part of the matching/instantiation cleanup that is on the TODO list).
for instance we should probably do some sort of round-robin strategy on the triggers we use for a given axiom, to ensure all triggers ultimately have a chance to be checked in some later matching round
It's exactly what I had in mind but I didn't find a way to explain it. Basically, we have to be fair with both triggers and axioms.
realistically as part of the matching/instantiation cleanup that is on the TODO list
Don't worry. I don't plan to work on it before the release. I open this issue as a remainder. Actually, I use the mechanism of GitHub to pin issues to releases.
Does the problem also happen with the Tableaux solver? I remember that it seemed to behave better wrt instantiation in some tests I made while looking at the differences a while back.
Good catch! It works with the option --sat-solver Tableaux.
Good catch! It works with the option
--sat-solver Tableaux.
Ha! That confirms an intuition I had that we should be looking precisely at the difference in the way the Tableaux and CDCL solver do matching/instantiation and see if we can reproduce the behavior of the Tableaux solver in the CDCL solver.
(For context, after #1041 I expected that disabling dynamic variable ordering in the CDCL-Tableaux solver would provide identical behavior to the Tableaux solver, but it was not the case in particular due to different instantiation strategies. I have hopes we can reduce the gap between the two solvers by looking at the difference between their behavior wrt instantiation and having the CDCL-Tableaux solver match the behavior of the Tableaux solver — unless it drastically slows the solver, which is a possibility)