analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Optimize and refactor "Hoare" domains

Open sim642 opened this issue 1 year ago • 11 comments

Related to issues #101 and #803.

TODO

  • [x] Benchmark on sv-benchmarks.
  • [x] Check performance on SQLite.
  • [x] Fix termination issue when SetDomain.Joined is used inside AddressDomain?
  • [x] Reimplement HoareDomain.MapBot for witness generation?
  • [x] Adapt to #808.
  • [x] Rename SensitiveDomain to BucketSetDomain or something?

sim642 avatar Aug 02 '22 07:08 sim642

SQLite

Analyzing https://www.sqlite.org/2022/sqlite-amalgamation-3370200.zip for ~1 minute with

goblint sqlite3.c sqlite3.h sqlite3ext.h shell.c -v --set pre.cppflags[+] -DSQLITE_DEBUG --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --set pre.cppflags[+] -DGOBLINT_NO_BSEARCH --set pre.cppflags[+] -DGOBLINT_NO_ASSERT --set result json-messages

(Same version and arguments as in #803.)

Before

At 2f5afc2c23fa12bbc1572f3fef43b5fa8cc0fa5e:

vars = 42983    evals = 216906  

Timings:
TOTAL                          63.343 s
  parse                           0.449 s
  convert to CIL                  0.586 s
  mergeCIL                        0.156 s
  analysis                       62.154 s
    createCFG                       0.964 s
      handle                          0.467 s
      iter_connect                    0.433 s
        computeSCCs                     0.217 s
    global_inits                    0.363 s
    solving                        60.793 s
      S.Dom.equal                     0.235 s
Timing used
Memory statistics: total=191160.42MB, max=534.03MB, minor=191085.51MB, major=1361.80MB, promoted=1286.89MB
    minor collections=91130  major collections=20 compactions=0

After

At 6f886157414c99ec292998efe45cb7d5b7219ec5:

vars = 59925    evals = 299226  

Timings:
TOTAL                          64.871 s
  parse                           0.482 s
  convert to CIL                  0.574 s
  mergeCIL                        0.158 s
  analysis                       63.660 s
    createCFG                       0.983 s
      handle                          0.487 s
      iter_connect                    0.433 s
        computeSCCs                     0.217 s
    global_inits                    0.376 s
    solving                        62.257 s
      S.Dom.equal                     0.296 s
Timing used
Memory statistics: total=182390.45MB, max=614.13MB, minor=182315.17MB, major=1476.96MB, promoted=1401.67MB
    minor collections=86947  major collections=20 compactions=0

Change

This is an improvement of 39.4% for vars and 38.0% for evals!

sim642 avatar Aug 02 '22 13:08 sim642

Impressive! Could you check if a short-circuiting equal still leads to further improvements here? If yes, one probably wants both.

michael-schwarz avatar Aug 02 '22 13:08 michael-schwarz

Impressive! Could you check if a short-circuiting equal still leads to further improvements here? If yes, one probably wants both.

Prepending x == y || to equal of both Projective and Pairwise gives vars = 58865 evals = 290541, so looks like no clear difference.

sim642 avatar Aug 02 '22 13:08 sim642

Prepending x == y || to equal of both Projective and Pairwise gives vars = 58865 evals = 290541, so looks like no clear difference.

It might be interesting to leave it running for a while longer, one would expect the shortcut to start paying off as soon as address sets become large.

michael-schwarz avatar Aug 02 '22 16:08 michael-schwarz

It might be interesting to leave it running for a while longer, one would expect the shortcut to start paying off as soon as address sets become large.

It could also be the AD.fold/AD.add loops that end up copying and not using a physically equal instance.

sim642 avatar Aug 02 '22 17:08 sim642

sv-benchmarks

Before vs after with SetDomain.Joined for AddressDomain buckets

image (Full results)

Overall it's a 6% speedup. Most of the noisy blobs that show some slowdown are ReachSafety-ECA tasks.

ldv

Filtering for ldv tasks, the speedup is 26% without any slowdown outliers: image

SetDomain.Joined vs HoareDomain.Set2 for AddressDomain buckets

Overall, using a true Hoare set for address buckets instead of force-joining them all together doesn't make a difference to performance and precision (Comparison plot). In a handful of cases (~46), it changes a true result to TIMEOUT (Table, last two columns). So it's somehow possible to saturate the Hoare sets without widening taking care of it. Thus, it's probably better to stay with Joined now, which also matches the old behavior and avoids compatibility problems with #808.

sim642 avatar Aug 03 '22 07:08 sim642

I ran Goblint with short-circuit evalution for the SensitiveDomain.Projective.equal (branch master_hoaredomain-new_short_circuit_equal) and without it (branch master_hoaredomain-new) on sqlite. For the diff, see here.

Goblint command line
time ./goblint_master_hoaredomain-new_short_circuit_equal ../../benchmarks/sqlite-amalgamation-3370200/sqlite3.c ../../benchmarks/sqlite-amalgamation-3370200/sqlite3.h ../../benchmarks/sqlite-amalgamation-3370200/sqlite3ext.h ../../benchmarks/sqlite-amalgamation-3370200/shell.c -v --set pre.cppflags[+] -DSQLITE_DEBUG --disable ana.base.context.non-ptr --disable ana.int.def_exc --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --set pre.cppflags[+] -DGOBLINT_NO_BSEARCH --set pre.cppflags[+] -DGOBLINT_NO_ASSERT --set result json-messages --set ana.activated "['base', 'mallocWrapper']" --set ana.ctx_insens[+] base --set ana.ctx_insens[+] mallocWrapper

The runtime with with the short-circuit evaluation was 42955.290s, and 43475.717s without it. That makes a difference of 520seconds = 8min40s. The version without short-circuit evaluation was 1.2% slower than the version with short-circuiting.

jerhard avatar Aug 05 '22 08:08 jerhard

The runtime with with the short-circuit evaluation was 42955.290s, and 43475.717s without it. That makes a difference of 520seconds = 8min40s. The version without short-circuit evaluation was 1.2% slower than the version with short-circuiting.

That's nice, I'll add it here then. The difference only becomes apparent at such long scale instead of a 1 minute benchmark.

sim642 avatar Aug 05 '22 09:08 sim642

Also adapting WitnessConstraints.PathSensitive3 to a similar construction from 8d6b8cff1 to cde436f39 gives a huge speedup in some cases (mostly those ECA tasks): image Full results

Somehow it also makes us unsound on 3 of our own tasks (fixed in 9e52524ddd82f80badfb2ac967c5794c9bbc4cf2):

  • [x] goblint-regression/28-race_reach_06-cond_racing1
  • [x] goblint-regression/28-race_reach_23-sound_unlock_racing
  • [x] goblint-regression/28-race_reach_24-sound_lock_racing

sim642 avatar Aug 09 '22 07:08 sim642

To avoid hard problems related to #816, it's possible to just use a partitioning which keeps a, a[0] and a.fst strictly separate. This implies a constant number of additional elements and work in the rare cases an address set indeed contains multiple of these. According to an sv-benchmarks run, there's no impact to performance or results. If anything, this makes it possible to implement AddressDomain just using the very effective ProjectiveSet and without any (nested) PairwiseSet, since the partitions have unique computable representatives (same address but with all indices replaced with top).

sim642 avatar Aug 11 '22 13:08 sim642

I hunted down the reason for these new unsoundnesses in our own race_reach benchmarks. It was a very subtle misplacement of try. The latest run on sv-benchmarks fixes those three, but still leaves one additional unsound benchmark (in addition to the handful already present on master) which I believe has an incorrect expected verdict: https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks/-/merge_requests/1364. In SV-COMP 2022 we answered unknown for it and on master we get the same unreachability but an ERROR (verify) hid our unsound claim.

sim642 avatar Aug 22 '22 14:08 sim642

Does this PR also attempt to fix tests 02/{91, 92, 92} that are also closely related to the address domain?

michael-schwarz avatar Sep 01 '22 16:09 michael-schwarz

Does this PR also attempt to fix tests 02/{91, 92, 92} that are also closely related to the address domain?

Not at all. On the contrary, it makes the problem even worse since previously a and a[0] were more-or-less handled equivalently but now they're split into separate buckets. Same with the a and a.fst pairing. So this makes #822 even more obvious and urgent.

sim642 avatar Sep 02 '22 06:09 sim642

On the contrary, it makes the problem even worse since previously a and a[0] were more-or-less handled equivalently but now they're split into separate buckets. Same with the a and a.fst pairing.

I just added the tests for these easier cases as well, which aren't listed in #822. Turns out that the first field one actually accidentally still works due to some of those special cases in lval domain still being present (and thus considering the two buckets, which are represented by themselves, equal). It doesn't work for the zero index since there the index gets turned into top for the bucket representative.

sim642 avatar Sep 02 '22 08:09 sim642