Enable prelude by default
Or solve the perf issues.
The performance issue is as follows:
The v2.5.x branch is slightly slower on problems from the ae-format dataset when the fpa theory is enabled. More precisely, for the unsat output:
| Prover | Alt-Ergo v2.5.x with `--enable-theories fpa` | Alt-Ergo v2.5.x |
|---|---|---|
| number | 51561 | 51548 |
| mean time | 0.347s | 0.335s |
| deviation time | 0.792s | 0.801s |
| total time | 4h29m14s | 4h16m55s |
We also observe that Alt-Ergo proved 14 additional tests with `--enable-theories fpa` and 4 additional tests without this flag.
The total time above is for all problems, not only unsat problems, I believe?
~~It's a bug in my PR for benchpress. I'll fix it right now :)~~
It's a bit misleading, but the total_time currently output by benchpress is the total time for unsat + sat problems only. In our case, the total time for unsat and total_time agree, since we never output sat.
EDIT: I made a dumb typo. The averages didn't look right, but the slowdown is still around 4%.
Ah, I seemed to recall different numbers. But there is still a discrepancy:
4h29m14s is 16154s, and 16154/51561 = 0.313 (not 0.303 as I initially wrote). 4h16m55s is 15415s, and 15415/51548 = 0.300.
The total time figure shows around a 4% slowdown (0.313/0.300 ≈ 1.043; a similar calculation on the rather meaningless ratio of raw totals gives 16154/15415 ≈ 1.048). The mean time figure shows around a 3.6% slowdown (0.347/0.335 ≈ 1.036; a similar calculation on the rather meaningless (0.347×51561)/(0.335×51548) ≈ 1.036) ~~which is significantly larger~~.
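For reference, the arithmetic above can be double-checked with a throwaway script (the problem counts and wall-clock totals are copied from the table):

```python
# Throwaway check of the timing arithmetic; the problem counts and
# wall-clock totals are copied from the benchmark table above.
def to_seconds(h: int, m: int, s: int) -> int:
    """Convert an h/m/s duration to seconds."""
    return h * 3600 + m * 60 + s

with_fpa = to_seconds(4, 29, 14)     # 4h29m14s -> 16154s
without_fpa = to_seconds(4, 16, 55)  # 4h16m55s -> 15415s

avg_with = with_fpa / 51561          # ~0.313s per unsat problem
avg_without = without_fpa / 51548    # ~0.300s per unsat problem

print(with_fpa, without_fpa)                      # 16154 15415
print(f"slowdown: {avg_with / avg_without:.3f}")  # slowdown: 1.048
```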
Which one should we trust? In both cases, though, the slowdowns are on the very high side of what I'd consider acceptable.
(Although in both cases we are somewhat comparing apples to oranges: we should probably either compare the median rather than the mean, or the mean over the problems solved by both solvers, because we don't want to advantage a solver that solves very few problems quickly. Here we have so few problems solved by only one of the two that it probably doesn't matter much, however.)
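A sketch of what the mean-over-commonly-solved-problems comparison could look like, assuming we have per-problem runtimes for each prover (the data layout here is made up for illustration; benchpress output would need to be parsed into it):

```python
import statistics

# Hypothetical per-problem runtimes in seconds, keyed by problem name;
# a problem is absent when that prover did not solve it.
times_fpa = {"p1": 0.30, "p2": 0.35, "p3": 4.8}
times_plain = {"p1": 0.28, "p2": 0.33}  # did not solve p3

# Restrict both statistics to the problems solved by *both* provers,
# so a prover is not rewarded for solving only the easy instances.
common = sorted(times_fpa.keys() & times_plain.keys())
for name, times in [("fpa", times_fpa), ("plain", times_plain)]:
    solved = [times[p] for p in common]
    print(name, statistics.mean(solved), statistics.median(solved))
```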