
Improve wp_pre tracing

Open · corlewis opened this issue 11 months ago · 1 comment

This makes two improvements for tracing rules applied by wp_pre.

The first is that the resulting trace is now shown when wp_pre is called directly, not only when it is called from wp. The current implementation does mean that this output is labelled "Theorems used by wp" even when wp_pre is called in other contexts, such as corres_pre. This could be fixed by tagging every use of the WP_Pre.pre_tac method with a string and passing it through to where the trace is output, but that did not seem worth doing just to avoid mislabelled debugging output.
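The tagging idea can be sketched abstractly. The real code is Isabelle/ML; the following Python model is only an illustration of the design, and all names in it (`pre_tac`, `trace_rules`, the `label` parameter) are hypothetical, not the actual l4v API:

```python
# Hypothetical model of threading a caller label through to the trace
# output, so it is no longer always attributed to "wp".

def trace_rules(rules, label="wp"):
    """Print each applied rule, attributed to the calling method."""
    for rule in rules:
        print(f"Theorems used by {label}: {rule}")

def pre_tac(goal, rules, label="wp"):
    """Apply precondition-weakening rules, tracing under the given label."""
    applied = [r for r in rules if r in goal]  # stand-in for real rule matching
    trace_rules(applied, label)
    return applied

# A caller such as corres_pre would pass its own label:
pre_tac("corres_goal with rule_a", ["rule_a"], label="corres_pre")
```

The cost of this design is exactly what the paragraph above describes: every call site of `pre_tac` has to be updated to pass a label.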

The second improvement avoids a rule being incorrectly reported as used. wp_pre tests the resulting goals after applying a rule and weakening the precondition; if the test fails, it discards that work and returns the goal state unchanged. The previous implementation was unaware of this, so the applied rule was added to the trace even when the result was discarded.

The new implementation is more careful and only adds the rule to the trace if the test passes. This technically has worse performance, as the resulting goals are evaluated twice: once without tracing for the test, and then again with tracing if the test passes. An alternative would be to evaluate the goals only once and remove the last traced rule if the test fails. I decided against that approach because it felt more complicated and could lead to strange edge cases.
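The test-then-trace strategy described above can be modelled as follows. This is a hedged Python sketch of the control flow, not the Isabelle/ML implementation; `apply_rule`, `goals_ok`, and the toy string-based "goal state" are all invented stand-ins:

```python
# Illustrative model: only record a rule in the trace once the test on
# the resulting goals has passed, at the cost of evaluating twice.

def apply_rule(goal, rule):
    """Stand-in for applying a rule and weakening the precondition."""
    return goal.replace(rule, "")  # toy transformation on a string "goal"

def goals_ok(goal):
    """Stand-in for the test wp_pre runs on the resulting goals."""
    return "bad" not in goal

trace = []

def pre_tac(goal, rule):
    """New behaviour: trace only on success."""
    candidate = apply_rule(goal, rule)   # first evaluation, untraced
    if not goals_ok(candidate):
        return goal                      # discard; trace stays clean
    result = apply_rule(goal, rule)      # second evaluation, with tracing
    trace.append(rule)
    return result
```

The rejected alternative would append the rule during the single evaluation and pop it from the trace on failure; in this model that is a one-line change, but in a real tactic the trace output may already have been emitted by the time the test fails, which is the kind of edge case mentioned above.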

corlewis avatar Mar 11 '24 06:03 corlewis

This looks good to me. I'm not too worried about the performance impact -- it's not as if this runs in a tight inner loop somewhere. We do use it quite a lot, but a few thousand calls at a millisecond each is still only a few seconds in an overall runtime of many hours.
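The back-of-envelope reasoning above checks out; the concrete figures below are illustrative assumptions, not measurements:

```python
# Rough check: a few thousand calls at ~1 ms extra each, against a
# multi-hour proof run. All numbers are assumed for illustration.
calls = 5000
overhead_per_call_s = 0.001          # assume ~1 ms extra per call
total_runtime_s = 10 * 3600          # assume a 10-hour run

added = calls * overhead_per_call_s  # seconds of added runtime
fraction = added / total_runtime_s   # share of the total
print(f"{added:.1f}s extra, {fraction:.4%} of total runtime")
```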

I just remembered that I wrote that benchmarking script. I agree that the performance impact should be negligible, but now I'm curious whether we'll even be able to see it in the noise of the tests' usual runtimes. I'll report back tomorrow.

corlewis avatar Mar 14 '24 07:03 corlewis

> I just remembered that I wrote that benchmarking script. I agree that the performance impact should be negligible but now I'm curious if we'll be able to even see it in the noise of the tests' usual runtimes. I'll report back tomorrow.

As expected, the difference isn't noticeable.

master:
  N = 50
  Average RISCV64 AInvs:  (00:06:10 real, 00:38:13 cpu, 13.28GB)
  Standard Deviation:     (00:00:08 real, 00:01:06 cpu, 2.49GB)

wp_pre_tracing:
  N = 50
  Average RISCV64 AInvs:  (00:06:08 real, 00:38:01 cpu, 13.37GB)
  Standard Deviation:     (00:00:10 real, 00:01:18 cpu, 2.44GB)

corlewis avatar Mar 14 '24 23:03 corlewis