analyzer Unsound address set must-not-equality check in base

The use of AD.meet to check address set must-not-equality: https://github.com/goblint/analyzer/blob/3a189a9e85e9f494f40e44012f864e663274181b/src/analyses/base.ml#L331-L334 is unsound in at least the following cases:

[ ] First struct field and zero index: https://github.com/goblint/analyzer/blob/3a189a9e85e9f494f40e44012f864e663274181b/tests/regression/02-base/91-ad-meet.c#L6-L16
[ ] Union fields: https://github.com/goblint/analyzer/blob/3a189a9e85e9f494f40e44012f864e663274181b/tests/regression/02-base/92-ad-union-fields.c#L6-L17
[ ] Non-first struct field and non-zero index with alignment: https://github.com/goblint/analyzer/issues/822#issuecomment-1212175339.

For both programs we report that the assert definitely fails, but when compiled and run, it succeeds.

Using meet would only work if addresses were represented in their canonical form using a byte offset from the variable.

Aug 11 '22 11:08 sim642

The unsoundness of comparing addresses with a struct field and an index as offsets is also exhibited for non-zero offsets:

https://github.com/goblint/analyzer/blob/712c6cef810aa3c6635a677bc9eef7dd8e7cc599/tests/regression/02-base/93-ad-struct-offset.c#L9-L27

The alignment of fields within objects is implementation defined, thus one generally cannot produce definite answers when comparing integer offsets with field offsets on structs.

Aug 11 '22 15:08 jerhard

Since #809 makes this problem even more urgent, we probably need a quicker fix than completely refactoring lvalue representations to carry a canonical form or whatnot. A simpler implementation for just AD.may_equal a1 a2 would be to pairwise check all lvalues from a1 and a2, and check their may equality. That could be implemented by calculating the canonical offset numerically, a la Cil.bitsOffset and checking those integers for may equality. The only problem is that Cil.bitsOffset works only with constant indices, but we'll have abstract ones, so it'll have to be partially reimplemented in Goblint. However, CIL can still provide us with the byte/bit offsets of struct fields (using Machdep), so it shouldn't be too difficult.

Sep 02 '22 08:09 sim642

A remark from Zulip on optimizing the pairwise may alias checking. Checking all pairs is overly wasteful because addresses may only alias if their varinfos (without offsets) are equal, so it suffices to do pairwise checking within each varinfo.

The projective bucketing from #809 doesn't directly allow that because the ProjectiveSet buckets are per varinfo and representative offset. However, that could be replaced with a two nested ProjectiveSets: outer split by varinfo only and inner split by representative offset in each varinfo. Then may alias checking can be performed pairwise within each outer bucket, but over all the elements in the corresponding inner buckets. Currently ProjectiveSet completely abstracts away the map structure, so an additional special function (maybe bucket_exists2?) would have to be implemented to do that.

Sep 22 '22 07:09 sim642

analyzer analyzer copied to clipboard

Unsound address set must-not-equality check in base

analyzer
analyzer copied to clipboard