lisa
lisa copied to clipboard
[FEATURE REQUEST] Pointer arithmetic
Memory model
To add support for pointers, we first have to reshape the memory model. Instead of having only the heap, the memory is now partitioned in heap and stack.
Heap | Stack | ||||||
h1 | h2 | h3 | h4 | s1 | s2 | s3 | s4 |
h5 | h6 | h7 | h8 | s5 | s6 | s7 | s8 |
Identifiers
To reflect this change, the hierarchy of the Identifier
symbolic expression has to change:
-
Identifier
-
Variable
: a real variable of the program -
MemoryLocation
: a generic location in the program memory (hn
orsn
)-
StackElement
: a location on the stack (sn
) -
HeapLocation
: a location in the heap (hn
)
-
-
MaskedIdentifier
: a compound identifier referencing another one-
Address
: the address of another identifier -
OutOfScopeIdentifier
: an identifier that has been scoped by a call or similar construct, that the code cannot directly access
-
-
Memory Expressions
The HeapExpression
s must now accommodate also operations on other parts of the memory:
-
MemoryExpression
-
Allocation
: an expression that allocates a new memory region. Corresponds tonew
/malloc
/.... Always rewrites to aMemoryLocation
-
HeapAllocation
: an expression that allocates a new heap region. Always rewrites to aHeapLocation
-
StackAllocation
: an expression that allocates a new stack region. Always rewrites to aStackElement
-
-
GetAddress(e)
: gets the address of an expressione
. Corresponds to&e
. Always rewrites to anAddress
.e
can be an arbitrary expression. -
Dereference(e)
: dereferences an address to get to the memory location. Corresponds to*e
.e
must be anAddress
. Always rewrites to anIdentifier
-
Traverse(r, e)
: gets from one location to another one. Corresponds tor.e
/r[e]
/.... Always rewrites to aMemoryLocation
.r
must be anAddress
.e
can be a string, aGlobal
, or an arbitrary expression.
-
Having a generic expression in Traverse
and GetAddress
enables usage of pointer arithmetics and tracking of fields in dynamic objects. A MemoryDomain
can use its SemanticOracle
to evaluate the expression, obtaining an instance of Lattice
that represents the location accessed.
Memory Domains
The current memory abstractions are Monolithic
, ProgramPointBased
, and FieldSensitiveProgramPointBased
. Monolithic
is fine as-is, since everything is the same locations. PogramPointBased
does not distinguish between different sub-locations, so it does not require modifications for Traverse
. Instead, for GetAddress
the evaluation can be reduced to the evaluation of their root location (e.g., &(x+1)
is the same of &x
). FieldSensitiveProgramPointBased
can instead track sub-locations, but at the current state can only track syntactic ones: x.f
works if f is a field, but x[n]
does not work if n
is a variable. This can still be acceptable on some programs that only manipulate objects with statically known fields. While the obvious endgame is to implement a shape analysis, a middle-ground implementation can be this one, that is basically a field sensitive implementation of ProgramPointBased
that uses lattice instances for fields.
Still to discuss
- aliasing vs shallow copy: how do we distinguish?
- by-ref vs by-val parameter passing