alt-ergo icon indicating copy to clipboard operation
alt-ergo copied to clipboard

Enable prelude by default

Open bclement-ocp opened this issue 2 years ago • 4 comments

Or solve the perf issues.

bclement-ocp avatar Oct 24 '23 14:10 bclement-ocp

The performance issue is as follow: The branch v2.5.x is slightly slower on problems of the ae-format dataset when we enable the theory fpa. More precisely for the unsat output:

Prover Alt-Ergo v2.5.x with --enable-theories fpa Alt-Ergo v2.5.x
number 51561 51548
mean time 0.347s 0.335s
deviation time 0.792s 0.801s
total time 4h29m14s 4h16m55s

We also observe that Alt-Ergo proved +14 tests with --enable-theories fpa and +4 tests without this flag.

Halbaroth avatar Oct 25 '23 11:10 Halbaroth

The total time above is for all problems, not only unsat problems I believe?

bclement-ocp avatar Oct 25 '23 12:10 bclement-ocp

~~It's a bug in my PR for benchpress. I'll fix it right now :)~~ It's a bit misleading but the current total_time outputs by benchpress is the total time for unsat + sat problems only. In our case, total time for unsat and total_time agree as we never output sat.

Halbaroth avatar Oct 25 '23 13:10 Halbaroth

EDIT: I made a dumb typo. The average don't look right, but the slowdown is always around 4%.

Ah, I seemed to recall different numbers. But there is still a discrepancy:

  • 4h29m14s is 16154s and 16154/51561 = 0.313 (not 0.303 as I initially wrote).
  • 4h16m55s is 15415s and 15415/51548 = 0.300.

The total time figure shows around 4% slowdown (0.313/0.300 ≈ 1.043, similar calculation on the rather meaningless 15614/15415 ≈ 1.048). The mean time figure shows around 3.6% slowdown (0.347/0.335 ≈ 1.036, similar calculation on the rather meaningless 0.347*51561/0.335*51548 ≈ 1.036)~, which is significantly larger~.

Which one should we trust? Although in both cases the slowdowns are on the very high side of what I'd consider acceptable.

(Although in both cases we are kind of comparing apples to oranges — we should probably either compare the median rather than the mean, or the mean over the problems solved by both solvers, because we don't want to advantage a solver that solves very few problems quickly. Here we have so few problems solved by a single one that it probably doesn't matter much, however)

bclement-ocp avatar Nov 13 '23 08:11 bclement-ocp