owi Trim explored paths and enable collections-c for owi conc

This PR brings #351 along with a few changes to concolic execution to address infinite loops. The code might be a bit messy, so we may want to consider if there's a simpler way to incorporate the changes I've made. Below is a summary:

Assertions

To fix the issue of infinite loops caused by trying to disprove valid assertions, assertions now have a status that can be: Unknown, Valid, or Invalid.

By default, assertions start with an Unknown status and change to Invalid if an Assertion_fail is encountered at the end of a trace.
If an assertion holds during a trace, we leave its status as Unknown.
In find_model_to_run only Unknown assertions are considered to be disproved.
In find_model_to_run, when attempting to disprove an assertion, we set the status to Valid if no model can disprove the assertion, or Invalid if one can.

Not_explored Paths

To resolve the issue of exploring the same path multiple times, leading to an infinite loop, we introduce two additional nodes: Explored and Being_explored, both of which have an unexplored score of Unexplored.zero.

`Explored`

A concrete trace that passes through multiple Select nodes will inevitably end on a Not_explored node. These Not_explored nodes can safely be changed into Explored nodes, as a concrete execution of this path is unique and will not lead to any further paths.

`Being_explored`

When calling find_model_to_run, satisfiable paths are changed from Not_explored to Being_explored. This might be helpful in the future when running find_model_to_run in parallel, as it could prevent threads from running the same concrete path more than once. (I may remove this change from this PR to reduce diff size)

Collections-c

I also enabled collections-c for concolic execution to see if it was equivalent to symbolic execution. There seems to be a difference in path counts, which is expected since we are decrementing paths that ended in an Assume_fail from the path count. Something that is not done in symbolic execution.

Notes

There are a couple of things that warrant further debugging:

It seems that select_i32 might not be branching correctly in cases where value != condition?
I'm still not completely certain that the paths are being marked correctly. I'll compare path counts with wasp on collections to confirm.

Aug 10 '24 10:08 filipeom

This looks good to me. Did you check if we're still returning empty models on TestComp with this?

Aug 13 '24 12:08 redianthus

Last run of testcomp:

Benchmark results (commit hash=18e6fc28) :octopus:

Using:

Tool: owi_w8_O3_sZ3_concolic
Timeout: 30.
Output dir: results-testcomp-owi_w8_O3_sZ3_concolic-2024-08-11_11h29m49s/

Results:

Nothing	Reached	Timeout	Other	Killed	Total
34	600	499	83	0	1216

Time stats (in seconds):

Total	Mean	Median	Min	Max
15426.81	12.69	0.35	0.09	30.25

Aug 13 '24 13:08 filipeom

This looks good to me. Did you check if we're still returning empty models on TestComp with this?

Seems like the reached results are ok. It is expected that some models will be empty, as some tests don't actually create symbolic variables (but only a few benchmarks do this).

In this PR I still want to fix the 83 that are in Other and remove the path counts from the output

Aug 13 '24 13:08 filipeom

some tests don't actually create symbolic variables (but only a few benchmarks do this)

Oh really ? I did not know about that and I was thinking it was a mistake. But, why isn't it the case in the symbolic mode ? (I checked there and IIRC there was no empty model...).

Aug 13 '24 13:08 redianthus

some tests don't actually create symbolic variables (but only a few benchmarks do this)

Oh really ? I did not know about that and I was thinking it was a mistake. But, why isn't it the case in the symbolic mode ? (I checked there and IIRC there was no empty model...).

It should be, check the product-lines/* benchmarks. Most of them will be empty (Starting in 855 for example)

Aug 13 '24 14:08 filipeom

OK great! Should I merge this then?

Aug 17 '24 11:08 redianthus

OK great! Should I merge this then?

We can yes. select_i32 is not branching, but this should not impact the the analysis to much, at most we will get some Nothing's or Other's but the Reached should be correct.

We should address select_i32 in another PR

Aug 17 '24 14:08 filipeom

Thanks!

Aug 18 '24 20:08 redianthus