analyzer Eager destab, cheap abort

Eager destab, cheap abort

Open michael-schwarz opened this issue 2 years ago • 15 comments

This WIP is the Munich idea for aborting. It's still far from polished, just thought I'd open the PR in case someone wants to take a look.

TODOs:

[x] Cleanup code
[x] Test impact
[x] Decide if we want to have it
[x] Integrate with interactive solver

May 18 '22 10:05 michael-schwarz

How does this relate to your aborting @sim64? It somehow seems less involved?

May 18 '22 11:05 michael-schwarz

Somehow, the benefit seems unclear at best: For ./goblint --conf conf/traces.json --set ana.base.privatization write+lock ../bench/pthread/smtprc_comb.c, we actually observe a slowdown by a factor of 10(!).

Timings:
TOTAL                           1.327 s
  parse                           0.038 s
  convert to CIL                  0.028 s
  mergeCIL                        0.006 s
  analysis                        1.256 s
    createCFG                       0.028 s
      handle                          0.012 s
      iter_connect                    0.015 s
        computeSCCs                     0.012 s
    global_inits                    0.005 s
    solving                         1.223 s
      S.Dom.equal                     0.010 s
      cheap_full_reach                0.002 s
      postsolver                      0.380 s
        postsolver_iter                 0.375 s
    warn_global                     0.000 s
      access                          0.000 s
Timing used
Memory statistics: total=3680.82MB, max=37.52MB, minor=3678.05MB, major=44.41MB, promoted=41.64MB
    minor collections=1757  major collections=9 compactions=0

Timings:
TOTAL                           9.385 s
  parse                           0.050 s
  convert to CIL                  0.028 s
  mergeCIL                        0.006 s
  analysis                        9.303 s
    createCFG                       0.028 s
      handle                          0.012 s
      iter_connect                    0.015 s
        computeSCCs                     0.013 s
    global_inits                    0.004 s
    solving                         9.266 s
      S.Dom.equal                     0.010 s
      cheap_full_reach                0.002 s
      postsolver                      0.366 s
        postsolver_iter                 0.361 s
    warn_global                     0.001 s
      access                          0.001 s
Timing used
Memory statistics: total=32637.48MB, max=351.13MB, minor=32634.65MB, major=418.49MB, promoted=415.67MB
    minor collections=15568  major collections=17 compactions=0

May 18 '22 11:05 michael-schwarz

Just to have this here, my earlier take on this as a constraint system functor and actually using exceptions for aborting right-hand sides is here: https://github.com/goblint/analyzer/commit/531b8a60fce17a99a3fc1a6eee49ca757539e295 (also in caching-constrsys branch).

May 18 '22 15:05 sim642

@jerhard and I managed to solve the performance degradation: It seems like this dep_vals was populated to eagerly!

May 18 '22 16:05 michael-schwarz

After this change, we save some 1000 evaluations, and end up with comparable performance in both cases.

May 18 '22 16:05 michael-schwarz

@jerhard and I managed to solve the performance degradation: It seems like this dep_vals was populated to eagerly!

Looks like after pulling that dep_vals populating code from eval to around S.system in solve, it's all very localized as a wrapper around S.system, like my caching-constrsys attempt. If it's like that, then it's still my opinion that it's nicer to pull it out as a constraint system functor in order to keep TD3 itself as clean as possible. But that can be future discussion when this becomes fast enough to make a difference.

May 19 '22 07:05 sim642

This probably only matters if this PR provides meaningful speedup and we want to have it, but currently this is on top of #391, where all aborting was now reverted since it wasn't part of the paper. Since it's separate, then it might be more general to have directly on top of master. Or maybe this aborting stuff becomes a bigger venture itself, where we want to integrate all the different approaches for benchmarking comparison.

May 19 '22 07:05 sim642

If it's like that, then it's still my opinion that it's nicer to pull it out as a constraint system functor in order to keep TD3 itself as clean as possible.

Since it's separate, then it might be more general to have directly on top of master.

The reason i did it here is that we were hoping it could be an generalization of all the reluctant destabilization business that clutters TD3 now. I think having this in there is much less clutter than all of that business.

I also don't think it will stay as separate as it currently is, as my next ambition would be to somehow integrate this with superstable such that warnings etc can be re-used for things that were destabilized, but their RHS never needed to be evaluated. This would IMO bridge the distance between reluctant destab and what we have here.

I think as a name for this nascent feature, skip-unchanged-rhs might be better than aborting as there really isn't anything being aborted here.

May 19 '22 08:05 michael-schwarz

my next ambition would be to somehow integrate this with superstable such that warnings etc can be re-used for things that were destabilized, but their RHS never needed to be evaluated.

This is an interesting point, but maybe it doesn't have to be directly related to this even. Maybe our definition of superstable should be different, i.e. things only get removed when values are actually changed in solve/side, not when they are immediately destabilized. That would automatically cover the case when the RHS was never re-evaluated and the value remained the same.

May 19 '22 08:05 sim642

Maybe our definition of superstable should be different, i.e. things only get removed when values are actually changed in solve/side, not when they are immediately destabilized.

Hmm, that's true. There's no need to consider things that became destabilized once no longer superstable now that we have the cheap reachability and unreachable things being superstable is no longer a concern to us.

May 19 '22 08:05 michael-schwarz

The performance story is still a bit mucky: sometimes we have big savings, sometimes it doesn't matter, and sometimes it is worse.

abort

May 19 '22 09:05 michael-schwarz

I now added marshalling to see how this works together with the incremental solver: Currently, many incremental tests still fail because the fix-point is not reached (9 test(s) failed: ["00/00 local", "00/09 unreach", "00/11 unreach-reusesuper", "03/03 reluctant-int-annotation-dyn", "13/01 mutex-simple", "13/05 race-call-remove", "13/06 mutex-simple-reluctant", "13/07 mutex-simple-reluctant2", "13/08 mutex-simple-reluctant3"]).

Any idea what could be wrong here @stilscher @jerhard? It would seem that we have to clear this hash-table for some unknowns if the constraints for them now changed or something like this?

May 19 '22 11:05 michael-schwarz

The performance story is still a bit mucky: sometimes we have big savings, sometimes it doesn't matter, and sometimes it is worse.

It would be a good idea to also do SV-COMP runs. It's a much bigger sample size and can very easily see speedup/slowdown trends overall instead of eyeballing.

May 19 '22 11:05 sim642

Apparently, this will be included in a paper we're writing on the Munich side (The Cousot Festschrift), so we should attempt to get it merged.

Jul 28 '22 13:07 michael-schwarz

Given the renowned activity on this, I guess you're intending to polish this into a working state.

If so, then we should really think about this point I had in some of my earlier comments:

Just to have this here, my earlier take on this as a constraint system functor and actually using exceptions for aborting right-hand sides is here: 531b8a6 (also in caching-constrsys branch).

Looks like after pulling that dep_vals populating code from eval to around S.system in solve, it's all very localized as a wrapper around S.system, like my caching-constrsys attempt. If it's like that, then it's still my opinion that it's nicer to pull it out as a constraint system functor in order to keep TD3 itself as clean as possible. But that can be future discussion when this becomes fast enough to make a difference.

If we still want to achieve #545 (which was started in #596), then hard-coding this into TD3 specifically is a step in the wrong direction. Coupling more features into the already monolithic TD3 solver just creates more future work to maintain the even larger solver and more things to decouple. So it would be highly preferable, if we can avoid adding more coupling then there already is. I recall there being an email chain with Helmut regarding how this should be presented and I think we realized at the time it would've been equivalent to a lifting of the right-hand sides. Of course presenting such technique on slides is easier if it's inlined into the solver, but that's quite different from the real-world implementation of TD3, which is over 1000 lines by now.

Therefore, what we should think about is whether any part of this (as it is by now) necessarily needs to be coupled into TD3 or could it still be achieved as a constraint-system lifting functor.

Sep 15 '22 07:09 sim642

I think the reason we are opposed to having it as a functor around the constraint system on the Munich side is that any such data structures that are not maintained in the solver a prone to later breakage, and other unpredictable results. I don't think that this wrapping of TD3 adds much to the complexity.

I agree that, in principle, decoupling things as far as possible is beneficial. However, there are easier wins (such as all the incremental pre-processing).

Oct 23 '22 12:10 michael-schwarz

I think the reason we are opposed to having it as a functor around the constraint system on the Munich side is that any such data structures that are not maintained in the solver a prone to later breakage, and other unpredictable results. I don't think that this wrapping of TD3 adds much to the complexity.

Wrapping the constraint system and wrapping TD3 are essentially the same thing: a functor between the solver and the constraint system. As it is right now, it's not really wrapping TD3, but simply inlining the wrapper code right in the middle of TD3.

Also, there's no reason the transfer function wrapping cannot be decoupled and still be in TD3 at the same time. The incremental warning handling does precisely that: https://github.com/goblint/analyzer/blob/abc2c404658b68084641220d55a9ff5f1601320f/src/solvers/td3.ml#L85-L86 It does a very similar thing: wraps all right-hand side executions with some current constraint variable tracking without having to insert all that right in the middle of TD3, where it's very self-contained instead of blurring the lines between what's TD3 itself and what's the right-hand side aborting feature. And it can be very close to the particular solver: the solver instantiates the functor and has access to the dep_vals data structure used in the functor (for example to marshal it).

I agree that, in principle, decoupling things as far as possible is beneficial. However, there are easier wins (such as all the incremental pre-processing).

These are two completely unrelated wins. Incremental preprocessing is an optimization, modularizing TD3 is a matter of maintenance. The high degree of coupling was a primary reason for reverting #364, despite it providing up to 50% speedup with no regressions, which was more than this, if I recall correctly. Although there haven't been recent sv-benchmarks runs with this recently, only non-relational traces benchmarks.

Ultimately I will not stop this from being included into TD3 inlined instead of modularized. But in that case I expect to not hear any complaints about the size of the TD3 implementation either.

Oct 24 '22 07:10 sim642

Let's puts this onto the agenda for Wednesday :smile:

However, there are easier wins (such as all the incremental pre-processing)

What I meant here was removing the pre-processing of all the solver data structures for the incremental mode from the solver itself and putting it somewhere else.

Oct 24 '22 08:10 michael-schwarz

What I meant here was removing the pre-processing of all the solver data structures for the incremental mode from the solver itself and putting it somewhere else.

#545 does that to an extent it's still possible. These data structures are solver-specific and the concrete type only defined within the solver, so they can't be arbitrarily elsewhere, only extracted out of the solve function.

Oct 25 '22 07:10 sim642

I did the refactoring currently in a separate branch: https://github.com/goblint/analyzer/tree/eager_destab_cheap_abort-refactor. Since it depends on #909, I didn't yet merge it here in order to not clutter the diff with changes that need to be reviewed separately first.

As far as I can see, the last status of benchmarking this was on the tiny set of traces benchmarks:

The performance story is still a bit mucky: sometimes we have big savings, sometimes it doesn't matter, and sometimes it is worse.

If this is to become enabled by default, we need comprehensive performance comparison.

Nov 17 '22 09:11 sim642

If this is to become enabled by default, we need comprehensive performance comparison.

Yes! I would suggest to postpone this until after the sv-comp deadline.

Nov 17 '22 13:11 michael-schwarz

I now ran this on sv-benchmarks and there it doesn't help, but has ~3% slowdown: (Full table)

Dec 05 '22 15:12 sim642

Even if there is no speedup everywhere, I think it would make sense to merge this, as it also something we describe in our TD0 to TD3 paper...

Dec 18 '22 14:12 michael-schwarz

Sure, I just disabled it by default for now.

Dec 19 '22 10:12 sim642

analyzer analyzer copied to clipboard

Eager destab, cheap abort

analyzer
analyzer copied to clipboard