kani Verification time increased by 7X on an s2n-quic harness

Noticed today that s2n-quic's vectored_copy_fuzz_test harness takes 30 minutes to prove, even though a few weeks ago, it only took 4 minutes. The culprit commit turned out to be ab0f2add72b9fd49a8410b7de981702cc3e784d6 from Aug 18. With this commit, it takes 30 minutes, but with the previous commit (8b1d18c436353a4428f58be268d1a2df04015faf), it takes 4 minutes.

To reproduce, do:

git checkout 8b1d18c436353a4428f58be268d1a2df04015faf
cargo clean
cargo build --workspace
cd tests/perf/s2n-quic/quic/s2n-quic-core
cargo kani --harness vectored_copy_fuzz_test This produces:

SUMMARY:
 ** 0 of 664 failed (1 unreachable)

VERIFICATION:- SUCCESSFUL

Verification Time: 214.41763s

Do git checkout ab0f2add72b9fd49a8410b7de981702cc3e784d6
Repeat steps 2 - 5 Output:


SUMMARY:
 ** 0 of 664 failed (1 unreachable)

VERIFICATION:- SUCCESSFUL

Verification Time: 1803.3752s

Sep 17 '22 06:09 zhassan-aws

@tedinski, any ideas what may have caused the performance slowdown in https://github.com/model-checking/kani/commit/ab0f2add72b9fd49a8410b7de981702cc3e784d6?

Sep 19 '22 17:09 zhassan-aws

That PR turned assert into assert-assume. Normally we'd think of it as pruning branches, and so I'd assume it'd improve performance if anything.

But perhaps it's complicating the state with new assumptions along the paths it does follow.

Sep 19 '22 22:09 tedinski

Summarizing the current status of the investigation:

There are at least a couple of commits that contributed to the increase in verification time:

https://github.com/model-checking/kani/commit/ab0f2add72b9fd49a8410b7de981702cc3e784d6 (4 minutes to 30 minutes): in particular the change from assert to assert-assume for the assume intrinsic: https://github.com/model-checking/kani/commit/ab0f2add72b9fd49a8410b7de981702cc3e784d6#diff-77bb4928612d2e4c4cd8a6ff43124d05f89d423aaa128656e5a9635dd01e2648R443 and codegen_offset: https://github.com/model-checking/kani/commit/ab0f2add72b9fd49a8410b7de981702cc3e784d6#diff-77bb4928612d2e4c4cd8a6ff43124d05f89d423aaa128656e5a9635dd01e2648R1060

Rewinding back to this commit and changing those two codegen_assert_assume to codegen_assert bring the time down to 4 minutes. However, making those two changes off of more recent commits (e.g. 1f40c6542caea48247dcd38155137cd74b98fbff), only brings down the time from 60 minutes to 30 minutes.

We tried introducing a temporary variable for the condition that is then used in the assert and assume, but that didn't make a difference.

https://github.com/model-checking/kani/commit/07092ded6699cfdcb159138eae1f25b160dca529 (30 minutes to 60 minutes): The rust toolchain update included a change in niche optimization, which is likely the cause of the performance regression.

Nov 10 '22 17:11 zhassan-aws

With kissat, this harness completes in ~1 minute:

$ cargo kani --tests --harness vectored_copy_fuzz_test --no-assertion-reach-checks --enable-unstable --cbmc-args --external-sat-solver kissat
...
SUMMARY:
 ** 0 of 723 failed

VERIFICATION:- SUCCESSFUL
Verification Time: 59.52028s

I had to disable reachability checks though (--no-assertion-reach-checks) because of #1962. However, even if I disable them but use the default SAT solver, it takes close to an hour:

$ cargo kani --tests --harness vectored_copy_fuzz_test --no-assertion-reach-checks
...
SUMMARY:
 ** 0 of 723 failed

VERIFICATION:- SUCCESSFUL
Verification Time: 3266.809s

Dec 21 '22 05:12 zhassan-aws

@tautschnig These are the current results with 9439a540521 and with CBMC https://github.com/diffblue/cbmc/commit/42d5ce201bb12296e703699c384deedd0e56f08b built with MiniSAT 2 and CaDiCaL.

With MiniSAT:

cargo kani --tests --harness vectored_copy_fuzz_test
Verification Time: 2485.4717s

With CaDiCaL:

$ cargo kani --tests --harness vectored_copy_fuzz_test --enable-unstable --cbmc-args --sat-solver cadical
Verification Time: 102.70173s

With Kissat:

$ cargo kani --tests --harness vectored_copy_fuzz_test --enable-unstable --cbmc-args --external-sat-solver kissat
Verification Time: 121.14342s

So CaDiCaL is indeed slightly faster.

All results can be reproduced by doing cd tests/perf/s2n-quic/quic/s2n-quic-core and running the specified command.

Feb 15 '23 23:02 zhassan-aws

Resolving with following rationale:

We understand that it is the assert-followed-by-assume that increased the runtime (https://github.com/model-checking/kani/commit/ab0f2add72b9fd49a8410b7de981702cc3e784d6), but we do still want to keep that change.
The increased runtime is only observed with MiniSat, which no longer is the default with Kani (as of #2654).
The s2n-quic harness has long ago been enabled by first choosing Kissat (https://github.com/aws/s2n-quic/pull/1628) and later CaDiCaL (https://github.com/aws/s2n-quic/pull/1771). It's now part of our performance benchmarking that we do in CI.

Aug 18 '23 07:08 tautschnig

kani kani copied to clipboard

Verification time increased by 7X on an s2n-quic harness

kani
kani copied to clipboard