z3 icon indicating copy to clipboard operation
z3 copied to clipboard

Longer proofs with recent versions of Z3

Open clairedross opened this issue 3 years ago • 6 comments

Hello Z3 team,

while maintaining a SPARK codebase from a few years ago (red black trees) I noticed a notable regression in the provability. In fact, most checks in the code were proved by Z3 which is notably more efficient than other solvers on these checks. Now some of them are not proved/take longer to prove (I had more than 50 new unproved checks with a timeout of 2 minutes per check). Attached is one such VC which was proved by Z3 4.8.6 in around 1 minute and takes more than 4 minutes to prove with Z3 4.8.10. I know this is not a bug, but I was wondering if somebody knew what could have caused the change, and if there were some options we could try to get back to the previous behavior.

Thanks in advance, Claire post3_smt2.txt

clairedross avatar Jul 09 '21 08:07 clairedross

Thanks for the sample. It will be useful for investigation. Note that it will take me some time have an opportunity to run a bisect to narrow down which changes matter (after July 18th).

NikolajBjorner avatar Jul 10 '21 21:07 NikolajBjorner

Thanks!

On Sat, Jul 10, 2021 at 11:04 PM Nikolaj Bjorner @.***> wrote:

Thanks for the sample. It will be useful for investigation. Note that it will take me some time have an opportunity to run a bisect to narrow down which changes matter (after July 18th).

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/Z3Prover/z3/issues/5395#issuecomment-877704121, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB6NCMMV6FGDM5K3NDLBWGDTXCYUPANCNFSM5ACKCX3A .

-- Claire Dross Senior Software Engineer, AdaCore

clairedross avatar Jul 12 '21 06:07 clairedross

bisection suggested that the commit that changed the behavior on this example was a small change to a hash function. This shouldn't have a material negative effect on performance, rather it should result is slightly better average case behavior. Guided by this circumstance I tried a few different random seeds on your example. Indeed it solves before and after the change with widely different runtimes 20s - 1100s depending on the random seed. There could be other regressions introduced after 38ad66ce170050d114050c3ad4f8c8e843aea85e, but this particular example isn't ideal as performance fluctuates so widely at the point where it is possible to observe a runtime difference.

NikolajBjorner avatar Jul 19 '21 23:07 NikolajBjorner

Hi Nikolaj,

On Tue, Jul 20, 2021 at 1:36 AM Nikolaj Bjorner @.***> wrote:

bisection suggested that the commit that changed the behavior on this example was a small change to a hash function. This shouldn't have a material negative effect on performance, rather it should result is slightly better average case behavior. Guided by this circumstance I tried a few different random seeds on your example. Indeed it solves before and after the change with widely different runtimes 20s - 1100s depending on the random seed. There could be other regressions introduced after 38ad66c https://github.com/Z3Prover/z3/commit/38ad66ce170050d114050c3ad4f8c8e843aea85e, but this particular example isn't ideal as performance fluctuates so widely at the point where it is possible to observe a runtime difference.

Thanks a lot for taking the time to look into this. We simply got very unlucky then.

Best,

Claire Dross Senior Software Engineer, AdaCore

clairedross avatar Jul 20 '21 07:07 clairedross

I don’t understand the details of the problem but there are hash functions designed to avoid this sort of problem. Could you give more details of the hash function being used?

lesshaste avatar Oct 18 '21 05:10 lesshaste

We hash expression objects and the hash is used for insertion into a hash-table. The insertion order matters when the tables are traversed and nodes get recycled after garbage collection. This affects the order the solver uses when making decisions, the decision order affects how search finds conflicts and that affects runtime. Suppose we have a hash function and change it by adding +1 to the result. This would also affect search ordering whenever traversing hash-tables for that function interacts with search.

NikolajBjorner avatar Oct 18 '21 12:10 NikolajBjorner