Enable prelude by default
Or solve the perf issues.
The performance issue is as follows:
The v2.5.x branch is slightly slower on problems from the ae-format dataset when the fpa theory is enabled. More precisely, for the unsat output:
| Prover | Alt-Ergo v2.5.x with `--enable-theories fpa` | Alt-Ergo v2.5.x |
|---|---|---|
| number | 51561 | 51548 |
| mean time | 0.347s | 0.335s |
| deviation time | 0.792s | 0.801s |
| total time | 4h29m14s | 4h16m55s |
We also observe that Alt-Ergo proved 14 additional tests with `--enable-theories fpa` and 4 additional tests without this flag.
The total time above is for all problems, not only unsat problems, I believe?
~~It's a bug in my PR for benchpress. I'll fix it right now :)~~
It's a bit misleading, but the total_time currently output by benchpress is the total time for unsat + sat problems only. In our case, the total time for unsat and total_time agree, since we never output sat.
EDIT: I made a dumb typo. The averages didn't look right, but the slowdown is still around 4%.
Ah, I seemed to recall different numbers. But there is still a discrepancy:
4h29m14s is 16154s, and 16154/51561 = 0.313 (not 0.303 as I initially wrote). 4h16m55s is 15415s, and 15415/51548 = 0.300.
The total time figure shows around a 4% slowdown (0.313/0.300 ≈ 1.043; a similar calculation on the rather meaningless ratio of raw totals gives 16154/15415 ≈ 1.048). The mean time figure shows around a 3.6% slowdown (0.347/0.335 ≈ 1.036; a similar calculation on the rather meaningless (0.347×51561)/(0.335×51548) ≈ 1.036) ~~which is significantly larger~~.
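For reference, the arithmetic above can be double-checked with a throwaway script (the problem counts and wall-clock totals are copied from the table):

```python
# Throwaway check of the timing arithmetic; the problem counts and
# wall-clock totals are copied from the benchmark table above.
def to_seconds(h: int, m: int, s: int) -> int:
    """Convert an h/m/s duration to seconds."""
    return h * 3600 + m * 60 + s

with_fpa = to_seconds(4, 29, 14)     # 4h29m14s -> 16154s
without_fpa = to_seconds(4, 16, 55)  # 4h16m55s -> 15415s

avg_with = with_fpa / 51561          # ~0.313s per unsat problem
avg_without = without_fpa / 51548    # ~0.300s per unsat problem

print(with_fpa, without_fpa)                      # 16154 15415
print(f"slowdown: {avg_with / avg_without:.3f}")  # slowdown: 1.048
```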
Which one should we trust? In both cases, though, the slowdowns are on the very high side of what I'd consider acceptable.
(Although in both cases we are somewhat comparing apples to oranges: we should probably either compare the median rather than the mean, or the mean over the problems solved by both solvers, because we don't want to advantage a solver that solves very few problems quickly. Here we have so few problems solved by only one of the two that it probably doesn't matter much, however.)
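A sketch of what the mean-over-commonly-solved-problems comparison could look like, assuming we have per-problem runtimes for each prover (the data layout here is made up for illustration; benchpress output would need to be parsed into it):

```python
import statistics

# Hypothetical per-problem runtimes in seconds, keyed by problem name;
# a problem is absent when that prover did not solve it.
times_fpa = {"p1": 0.30, "p2": 0.35, "p3": 4.8}
times_plain = {"p1": 0.28, "p2": 0.33}  # did not solve p3

# Restrict both statistics to the problems solved by *both* provers,
# so a prover is not rewarded for solving only the easy instances.
common = sorted(times_fpa.keys() & times_plain.keys())
for name, times in [("fpa", times_fpa), ("plain", times_plain)]:
    solved = [times[p] for p in common]
    print(name, statistics.mean(solved), statistics.median(solved))
```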