
Improve HoarePO.equal and leq performance

Open jerhard opened this issue 3 years ago • 7 comments

Findings

@michael-schwarz and I did some runs of Goblint (master) on the SQLite amalgamation and investigated performance using perf and the firefox-profiler. We looked at the high runtime (on my system) of BaseDomain.fun_6129, which we conjectured to be BaseDomain.equal_basecomponents_t.

Further investigation with Stats.time on a ~1 minute run of SQLite showed that a significant amount of time is used in HoarePO.equal:

```
TOTAL                          55.185 s
  parse                           0.463 s
  convert to CIL                  0.568 s
  mergeCIL                        0.136 s
  analysis                       54.019 s
    createCFG                       0.836 s
      handle                          0.293 s
      iter_connect                    0.479 s
        computeSCCs                     0.217 s
    global_inits                    0.403 s
      HoareDomain.equal               0.001 s
    solving                        52.741 s
      S.Dom.equal                     0.199 s
      HoareDomain.equal              18.238 s
```
Command line used for running Goblint on sqlite:

```
/goblint ../sqlite-amalgamation-3370200/sqlite3.c ../sqlite-amalgamation-3370200/sqlite3.h ../sqlite-amalgamation-3370200/sqlite3ext.h ../sqlite-amalgamation-3370200/shell.c -v --set pre.cppflags[+] -DSQLITE_DEBUG --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --set pre.cppflags[+] -DGOBLINT_NO_BSEARCH --set pre.cppflags[+] -DGOBLINT_NO_ASSERT --set result json-messages
```
We tried this a few times, and in the first minute of analyzing SQLite, 30-50% of the runtime is spent in `HoarePO.equal`.

Problems With Current Implementation

The current implementation of HoarePO.equal relies on checking HoarePO.leq in both directions. The implementation of leq in turn uses

- `for_all`, which transforms the argument Hoare-map to a list,
- and `mem`, which also transforms its Hoare-map argument to a list. The function `mem` may be called once for each element of a Hoare-map.

It thus seems like the current implementation could be optimized considerably.
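To make the cost pattern concrete, here is a hypothetical simplification with plain element lists standing in for Goblint's Hoare-map buckets (module and function names are illustrative only, not Goblint's actual API):

```ocaml
(* An element type with only the lattice order we need for the sketch. *)
module type Elem = sig
  type t
  val leq : t -> t -> bool
end

module HoareLeq (E : Elem) = struct
  (* xs <= ys iff every element of xs is below some element of ys.
     With n elements per set this performs O(n^2) E.leq calls,
     on top of materializing the lists on every query. *)
  let leq xs ys =
    List.for_all (fun a -> List.exists (fun b -> E.leq a b) ys) xs

  (* equal via two leq checks doubles the work again. *)
  let equal xs ys = leq xs ys && leq ys xs
end
```

The quadratic inner loop is the `mem`-per-element pattern described above; `equal` as two `leq` checks then traverses everything twice.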

Goals

The implementation of HoarePO.equal and HoarePO.leq should perform a bucket-wise comparison, as all comparable elements have the same hash and are contained in the same bucket. One may also consider implementing equal directly, instead of relying on leq.
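A minimal sketch of the bucket-wise idea, under the assumption stated above that comparable elements share a hash (hypothetical types and names, not Goblint's actual HoarePO representation):

```ocaml
module IntMap = Map.Make (Int)

(* A Hoare set as a map from element hash to the bucket of elements
   with that hash. *)
type 'a buckets = 'a list IntMap.t

(* Only elements within the same bucket need pairwise comparison;
   maps with different key sets differ immediately. The length check
   plus for_all/exists is enough here because buckets of an antichain
   contain pairwise distinct elements. *)
let equal_buckets elem_equal (x : 'a buckets) (y : 'a buckets) =
  IntMap.equal
    (fun bx by ->
      List.length bx = List.length by
      && List.for_all (fun a -> List.exists (elem_equal a) by) bx)
    x y
```

Instead of comparing every element against the whole other set, each element is only compared within its (typically small) bucket.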

Edit: Changed HoareDomain to HoarePO, as that's what I was talking about.

jerhard avatar Jul 28 '22 16:07 jerhard

@michael-schwarz suggested adding a short-circuiting physical equality check to HoarePO.equal, i.e.

```ocaml
let equal x y = x == y || leq x y && leq y x
```

and that already increases the number of evals after about a minute from ~110000 to ~190000 on SQLite.
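For illustration, the effect of the fast path can be observed with a call counter; `expensive_leq` here is a stand-in for HoarePO.leq, not the actual implementation:

```ocaml
let calls = ref 0

(* Stand-in for an expensive structural leq; counts its invocations. *)
let expensive_leq x y = incr calls; x <= y

(* (==) is OCaml's physical (pointer) equality: an O(1) check that is
   sound as a fast path, since physically equal values are always
   structurally equal. *)
let equal x y = x == y || (expensive_leq x y && expensive_leq y x)
```

Whenever the solver compares a value against itself (e.g. an unchanged binding), the structural check is skipped entirely.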

jerhard avatar Jul 28 '22 17:07 jerhard

There's quite an extensive description of all our Hoare-(like) domains in #101, so maybe I'll have to think about this Hoare and partitioning stuff again.

> The implementation of HoareDomain.equal and HoareDomain.leq should perform a bucket-wise comparison, as all comparable elements have the same hash and are contained in the same bucket.

This is a nasty hack used by AddressDomain, which only works due to the particular way hash is implemented for individual addresses. Ideally this should be a separate function, not implicitly tied to hash, but that's actually not the biggest problem. Rather, path-sensitivity is defined by the analyses' should_join pairwise operator, which isn't amenable to such bucketing.

> The current implementation of HoareDomain.equal relies on checking HoareDomain.leq in both directions.

I don't think that's true. It's how HoarePO does it, but HoarePO is only used for AddressDomain due to the aforementioned reason and it implements a very particular kind of Hoare domain with a hand-rolled functional hashmap that doesn't have a should_join that can keep elements in buckets separate by force.

Actually, now I'm confused: which of the two did you time? Because one's exclusively for addresses and the other is for paths (and some privatizations).

The normal HoareDomain.Set just uses the default Set equality and probably just assumes that such sets are kept and compared only in their normal form. I guess HoarePO doesn't make that assumption, but I think it very well could, which would take all the complexity out of its equal as well and allow using the default Map equality.

sim642 avatar Jul 28 '22 17:07 sim642

And another thing to look at (depending on which of the two is the culprit): how many elements do the Hoare sets in question actually have? It could be that for some reason they collect a massive number of unjoinable elements and that's the reason for slowdown.

sim642 avatar Jul 28 '22 17:07 sim642

> Actually, now I'm confused: which of the two did you time?

HoarePO, because our suspicion was that the slowdown is due to addresses of some sort.

michael-schwarz avatar Jul 28 '22 17:07 michael-schwarz

> The normal HoareDomain.Set just uses the default Set equality and probably just assumes that such sets are kept and compared only in their normal form. I guess HoarePO doesn't make that assumption, but I think it very well could, which would take all the complexity out of its equal as well and allow using the default Map equality.

Here's another reason why it could easily work: HoarePO's compare and hash already operate directly on the Map and list structures, without performing any leq checks. Since those two must be compatible with equal, it seems like equal could simply do the same. Otherwise the hash and compare implementations are invalid with respect to equal, which may itself be causing major issues.
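A sketch of what that could look like, assuming maps are kept in a normal form where semantically equal sets always have identical structure (hypothetical bucket type, not Goblint's actual one):

```ocaml
module IntMap = Map.Make (Int)

(* Hypothetical representation: element hash -> bucket of elements. *)
type 'a t = 'a list IntMap.t

(* Under the normal-form assumption, plain structural equality of the
   underlying structure suffices, mirroring how compare and hash already
   operate on it. List.equal requires OCaml >= 4.12. *)
let equal elem_equal (x : 'a t) (y : 'a t) =
  IntMap.equal (List.equal elem_equal) x y
```

This does no leq checks at all, so equal, compare, and hash would all agree by construction.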

sim642 avatar Jul 28 '22 17:07 sim642

> It could be that for some reason they collect a massive number of unjoinable elements and that's the reason for slowdown.

As an aside, we are also locally experimenting with not tracking strings (or, in this case, as a debugging hack, fixing all strings to be equal). That has kind of promising results on SQLite (after #802). Every unknown seems to have been uncovered now, and the majority of things are stable:

```
runtime: 00:38:03.064
vars: 62969, evals: 3666907

|rho|=62969
|called|=415
|stable|=59688
|infl|=62717
|wpoint|=562
Found 1806 contexts for 1806 functions. Top 5 functions:
```

michael-schwarz avatar Jul 28 '22 17:07 michael-schwarz

I suspect there's a reason why these operations have been implemented so inefficiently: address offsets don't actually have a well-defined partitioning structure (https://github.com/goblint/analyzer/issues/101#issuecomment-1200996946).

sim642 avatar Aug 01 '22 10:08 sim642

Completed by #809.

sim642 avatar Sep 20 '22 11:09 sim642