analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Effective type for dynamically allocated memory

Open sim642 opened this issue 3 years ago • 0 comments

Problem

Infamously, our varinfos for dynamically allocated memory are created with void type, because at the point of malloc/etc. we don't yet know, which type the values assigned to it will have. This is only determined at the point of assignment.

This creates the need for a notorious pile of hacks, including but not limited to the following:

  1. Address casts copying the varinfo with updated type (problem for nice implementation of #731): https://github.com/goblint/analyzer/blob/35f5b9e9ce00222e2221605df4262fe3ab6158a8/src/cdomains/valueDomain.ml#L332-L333
  2. Calculating types of offsets of alloc variables fails (because you cannot have offsets on void values), so we have fallbacks like this: https://github.com/goblint/analyzer/blob/35f5b9e9ce00222e2221605df4262fe3ab6158a8/src/analyses/base.ml#L1141-L1152 There are other places with TypeOfError as well.
  3. Calculating types of address domain elements has a particularly weird fallback: https://github.com/goblint/analyzer/blob/35f5b9e9ce00222e2221605df4262fe3ab6158a8/src/cdomains/lval.ml#L257-L263 get_type_addr ((alloc@...), Field(x, NoOffset)) gives void, which gets into Type_offset from v.vtype instead of making some kind of assumption and using the field's type instead. That is, the type of an address with offsets might be the type of the variable (without the offsets). This can only cause further mismatches between types and abstract values, yielding to other type lookups also failing.

It's somewhat of a miracle how all these hacks manage to wipe the problem under multiple layers of carpet and have anything reasonable come out for blobs.

Solution

The standard defines (N1570 6.5.6):

The effective type of an object for an access to its stored value is the declared type of the object, if any.[^87] If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

[^87]: Allocated objects have no declared type.

Given that the C standard has the notion of effective type exactly to describe how dynamically allocated memory gets its type from a subsequent assignment, we should adopt the same notion into the analyzer to avoid all the weird hacks above. Blobs could simply have an extra field for the effective type and any offset type lookups should take the effective type into count on Blobs, instead of erroring and falling back to whatnot.

sim642 avatar May 10 '22 07:05 sim642