apalache
apalache copied to clipboard
Non-deterministic solver runtimes
When running @nano-o's spec for 6 processes, I noticed considerable fluctuations (up to 4x slowdown) in Apalache runtime.
Command:
apalache-mc check --run-dir=out --debug --init=Correctness_ --inv=Correctness --length=1 ~/src/Distributed-termination-detection/Termination.tla
Spec diff for 6 processes:
diff --git a/Termination.tla b/Termination.tla
index 5ebcf77..28f6a20 100644
--- a/Termination.tla
+++ b/Termination.tla
@@ -48,8 +48,8 @@ EXTENDS Integers, FiniteSets, Apalache, Sequences
\* @type: Set(P);
\* P == {"P1_OF_P", "P2_OF_P", "P3_OF_P"}
-P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P"}
-\* P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P", "P5_OF_P"}
+\* P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P"}
+P == {"P1_OF_P", "P2_OF_P", "P3_OF_P", "P4_OF_P", "P5_OF_P", "P6_OF_P"}
When calling Z3 through the API, it's internal statistics (z3solver.getStatistics
) show non-deterministic behavior in both :time
and :restarts
.
On 5 Apalache runs, Z3 :restarts
range from 26 to 41. :time
for the second-to-last SMT call ranges from 327 to 1811 seconds:
apalache run 1 | apalache run 2 | apalache run 3 | apalache run 4 | apalache run 5 | |
---|---|---|---|---|---|
:time | |||||
sat call 1 | 144.80 | 143.26 | 143.26 | 144.91 | 142.78 |
sat call 2 | 8.24 | 8.29 | 8.29 | 8.37 | 8.21 |
sat call 3 | 73.59 | 42.20 | 42.20 | 42.36 | 71.57 |
sat call 4 | 2.91 | 1.80 | 1.80 | 1.83 | 2.90 |
sat call 5 | 2.13 | 1.30 | 1.30 | 1.92 | 91.86 |
sat call 6 | 1811.14 | 417.34 | 417.34 | 327.39 | 784.24 |
sat call 7 | 12.74 | 10.71 | 10.71 | 7.41 | 8.93 |
:restarts | |||||
sat call 1 | 10 | 10 | 10 | 10 | 10 |
sat call 2 | 10 | 10 | 10 | 10 | 10 |
sat call 3 | 14 | 13 | 13 | 13 | 14 |
sat call 4 | 15 | 13 | 13 | 13 | 15 |
sat call 5 | 16 | 13 | 13 | 13 | 17 |
sat call 6 | 29 | 25 | 25 | 25 | 40 |
sat call 7 | 29 | 26 | 26 | 26 | 41 |
OTOH, running Z3 from the shell on the dumped log0.smt
looks to be completely deterministic:
$ for i ($(seq 1 14)) ; do
z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 ~/log0.smt | egrep 'time|restarts'
done
:restarts 31
:time 11.35
:total-time 585.41)
:restarts 31
:time 11.16
:total-time 584.54)
:restarts 31
:time 11.10
:total-time 581.18)
:restarts 31
:time 11.08
:total-time 568.50)
:restarts 31
:time 11.10
:total-time 571.24)
z3 binary matches our z3-turnkey dependency version 4.8.17
.
Even the offline solver context suffers from this issue.
Also, looks like the random seeds are set by default, so I doubt that is the issue:
$ python3 -c "import z3; print(z3.get_param('smt.random_seed'))"
0
I verified that the SMT constraints produced (using Z3Solver.toString()
) are identical across different runs of Apalache.
Can it be that the memory layout of different runs affects some heuristics of z3? I guess, when you run it in CLI, you always get the same memory allocator. When z3 is run as a library inside JVM, malloc
is probably less deterministic.
I think it's down to tactics or something related.
AFAICT, the Z3 shell constructs the solver instance lazily only once it encounters a (sat)
, (push)
or (reset)
:
https://github.com/Z3Prover/z3/blob/e2f4fc23076032935daa85e305d0453dc1470d73/src/cmd_context/cmd_context.cpp#L787-L788
The API invokes the solver factory much earlier, as early as the first Z3_solver_assert()
:
https://github.com/Z3Prover/z3/blob/78eaefe5a8be7b24761ad829682c603865458a66/src/api/api_solver.cpp#L473-L477
https://github.com/Z3Prover/z3/blob/78eaefe5a8be7b24761ad829682c603865458a66/src/api/api_solver.cpp#L142-L147
Still present if we use mkSimpleSolver
instead of mkSolver
.
Seems like it's our constraint output after all! We sometimes permute disjunctions:
--- bench.smt 2022-08-31 08:51:17.000000000 +0200
+++ _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt 2022-09-02 19:12:59.000000000 +0200
; [START] Caching equality constraints for a sequence: List(($C$14,$C$246), ($C$13,$C$246), ($C$12,$C$246), ($C$16,$C$246), ($C$11,$C$246), ($C$15,$C$246))
; [DONE] Caching equality constraints
-;; assert (IF (($C$243 = $C$215) ∨ ($C$243 = $C$209) ∨ ($C$243 = $C$233) ∨ ($C$243 = $C$239) ∨ ($C$243 = $C$221) ∨ ($C$243 = $C$227)) THEN ($C$246 = $C$14) ELSE TRUE) ∧ (IF (($C$243 = $C$220) ∨ ($C$243 = $C$208) ∨ ($C$243 = $C$214) ∨ ($C$243 = $C$226) ∨ ($C$243 = $C$238) ∨ ($C$243 = $C$232)) THEN ($C$246 = $C$13) ELSE TRUE) ∧ (IF (($C$243 = $C$225) ∨ ($C$243 = $C$219) ∨ ($C$243 = $C$231) ∨ ($C$243 = $C$207) ∨ ($C$243 = $C$213) ∨ ($C$243 = $C$237)) THEN ($C$246 = $C$12) ELSE TRUE) ∧ (IF (($C$243 = $C$235) ∨ ($C$243 = $C$241) ∨ ($C$243 = $C$223) ∨ ($C$243 = $C$229) ∨ ($C$243 = $C$217) ∨ ($C$243 = $C$211)) THEN ($C$246 = $C$16) ELSE TRUE) ∧ (IF (($C$243 = $C$206) ∨ ($C$243 = $C$236) ∨ ($C$243 = $C$230) ∨ ($C$243 = $C$218) ∨ ($C$243 = $C$212) ∨ ($C$243 = $C$224)) THEN ($C$246 = $C$11) ELSE TRUE) ∧ (IF (($C$243 = $C$240) ∨ ($C$243 = $C$216) ∨ ($C$243 = $C$222) ∨ ($C$243 = $C$228) ∨ ($C$243 = $C$210) ∨ ($C$243 = $C$234)) THEN ($C$246 = $C$15) ELSE TRUE)
-(assert (and (ite (or (= $C$243 $C$215)
+;; assert (IF (($C$243 = $C$233) ∨ ($C$243 = $C$209) ∨ ($C$243 = $C$239) ∨ ($C$243 = $C$215) ∨ ($C$243 = $C$227) ∨ ($C$243 = $C$221)) THEN ($C$246 = $C$14) ELSE TRUE) ∧ (IF (($C$243 = $C$232) ∨ ($C$243 = $C$220) ∨ ($C$243 = $C$208) ∨ ($C$243 = $C$226) ∨ ($C$243 = $C$238) ∨ ($C$243 = $C$214)) THEN ($C$246 = $C$13) ELSE TRUE) ∧ (IF (($C$243 = $C$237) ∨ ($C$243 = $C$219) ∨ ($C$243 = $C$207) ∨ ($C$243 = $C$213) ∨ ($C$243 = $C$225) ∨ ($C$243 = $C$231)) THEN ($C$246 = $C$12) ELSE TRUE) ∧ (IF (($C$243 = $C$217) ∨ ($C$243 = $C$229) ∨ ($C$243 = $C$241) ∨ ($C$243 = $C$235) ∨ ($C$243 = $C$223) ∨ ($C$243 = $C$211)) THEN ($C$246 = $C$16) ELSE TRUE) ∧ (IF (($C$243 = $C$218) ∨ ($C$243 = $C$212) ∨ ($C$243 = $C$236) ∨ ($C$243 = $C$224) ∨ ($C$243 = $C$230) ∨ ($C$243 = $C$206)) THEN ($C$246 = $C$11) ELSE TRUE) ∧ (IF (($C$243 = $C$228) ∨ ($C$243 = $C$222) ∨ ($C$243 = $C$240) ∨ ($C$243 = $C$216) ∨ ($C$243 = $C$234) ∨ ($C$243 = $C$210)) THEN ($C$246 = $C$15) ELSE TRUE)
+(assert (and (ite (or (= $C$243 $C$233)
(= $C$243 $C$209)
- (= $C$243 $C$233)
(= $C$243 $C$239)
- (= $C$243 $C$221)
- (= $C$243 $C$227))
+ (= $C$243 $C$215)
+ (= $C$243 $C$227)
+ (= $C$243 $C$221))
(= $C$246 $C$14)
true)
- (ite (or (= $C$243 $C$220)
+ (ite (or (= $C$243 $C$232)
+ (= $C$243 $C$220)
(= $C$243 $C$208)
- (= $C$243 $C$214)
(= $C$243 $C$226)
(= $C$243 $C$238)
- (= $C$243 $C$232))
+ (= $C$243 $C$214))
(= $C$246 $C$13)
true)
So far I've only seen this happen with Caching equality constraints for a sequence: List(...)
.
Wow. Good find!
Also, latest z3 does not seem to have this issue! (at least on this particular spec) I've requested a turnkey version bump here: https://github.com/tudo-aqua/z3-turnkey/issues/14
$ time z3-4.8.17-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 bench.smt
577.04s user 2.61s system 99% cpu 9:40.59 total
$ time z3-4.8.17-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt
2516.66s user 8.20s system 99% cpu 42:08.02 total # <---- !!!!
$ time z3-4.11.0-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 bench.smt
696.23s user 2.63s system 99% cpu 11:40.47 total
$ time z3-4.11.0-arm64-osx-10.16/bin/z3 -st sat.random_seed=0 smt.random_seed=0 fp.spacer.random_seed=0 sls.random_seed=0 _apalache-out/Termination.tla/2022-09-02T15-49-04_14311296710629792029/log0.smt
510.98s user 2.01s system 99% cpu 8:34.06 total
This is amazing! We should probably learn from this how to prioritize constraints to improve performance!
Do you have a complete log? I think it is something right after the equality constraints were cached.
Likely caused by this line:
https://github.com/informalsystems/apalache/blob/a2e967e31684f958af0492a5387f21c6abac1b11/tla-bmcmt/src/main/scala/at/forsyte/apalache/tla/bmcmt/rules/aux/CherryPick.scala#L129
#2149 resolves the non-deterministic SMT output, but there's still quite some spread in overall runtime:
10:46 11:45 13:03 15:16 15:43 16:00 16:25 16:43 17:44 18:05 20:43 21:26 31:47 32:39
I will test whether it's fixed by just bumping Z3 as indicated in https://github.com/informalsystems/apalache/issues/2120#issuecomment-1235771144.
Still present with #2149 and Z3 bumped to the lastest release. I have a few remaining ideas to test:
- [x] We should print the constraints directly from
Solver.toString()
to ensure they are not jumbled internally. ensured they're equivalent - [x] Maybe it's just caused by running with
--debug
, which is IO heavy? nope - [x] Maybe it's caused by Z3 multi-threading. We can limit this via parameters. also no
- [x] We can set all parameters to the default values reported by
z3 -pd
. still all over the place
Fixing ZipOracle
(#2149) made the offline solver deterministic.
So the remaining issue must be in the incremental solver.
There's a few Z3 parameters for this in the combined_solver
module:
- [x] Try setting
combined_solver.ignore_solver1=true
in offline mode. still deterministic - [x] Try setting the logic in incremental mode. Setting to
QF_IDL
didn't help.
I'm out of ideas, so I will leave this be for the moment. Maybe it can be fixed by using assumptions (#2137).
I thought #2569 might've fixed this, but it's still present.