liballocs
liballocs copied to clipboard
sizeofness analysis isn't sufficiently general
Currently, we do an intraprocedural analysis of the flow of 'sizeofness', so that e.g. in the following example we can infer that p points to a struct Foo.
size_t s = sizeof (struct Foo);
unsigned t = 2 * s;
void *p = malloc(t);
However, sizeofness is not tracked interprocedurally. A consequence is that it's necessary to declare a wrapper such as in this variation:
void *p = mymalloc(t);
and also a 'helper' such as in this variation:
unsigned t = get_size_of_struct_foo();
and some seen-in-the-wild cases are just not supported, such as (the 'perlbench case'):
struct blah {
...
size_t size_of_a_struct_foo;
...
} the_blah = { ..., sizeof (struct Foo), ...};
The 'helper' case is handled by a hack, where helpers can be declared much like a malloc wrapper, and we grossly assume that the preceding helper call executed in a given thread is the one that a given otherwise-unclassified malloc call should be associated with. This allows a simple implementation using a thread-local variable, but is easily foiled e.g. by computing two sizes but doing the allocations in the other order.
One idea for doing better: we can easily make 'sizeofness-returningness' an annotation whose completeness we check by a simple static analysis, i.e. warn if you return an expression having sizeofness but are not marked as a helper. Then instead of the thread-local hack, generate sizeofness at calls to helpers, just as we generate it at 'sizeof' itself... that's if helpers always generate same-type sizeofness.
To handle the perlbench case, use a link-time map for the case of static-constant values with sizeofness.
A full interprocedural sizeofness analysis, including per-function summaries and a link-time composition step, is a further development that I have sketched out elsewhere.
Here is my sketch.
We want to eliminate the need to declare malloc wrappers and the like. (And perhaps even the need to declare suballocators, although that is a bit separate.)
The idea is a simple flow-insensitive analysis that can produce per-function summaries for how sizeofness is propagated by that function.
All parameters received, return values received and not-definitely-local memory reads are input operations. All parameters passed, memory written and returned values are output operations.
The task is to produce a summary which, for each output, gives an approximation of its dependency, if any, on input sizeofnesses.
Suppose we number the input operations. The summary is either T for some type T, or it's _1, _2, _3 etc to say 'the sizeofness received by input _1/_2/_3'. That is clearly making a static approximation of assuming the same input operation, in a given activation of the function, always inputs the same sizeofness. (But it remains parametric, i.e. across different activations it might be instantiated with different sizeofnesses.)
The idea is that instead of just analysing how sizeofness is sourced and propagated intraprocedurally, our summary also captures how sizeofness is propagated interprocedurally.
Then, given a summary for each function in a stack trace, we can work backwards from the allocation function up the stack to identify (approximately, but hopefully accurately in practice) any sizeofness that it is receiving.
So, for a malloc wrapper, we would infer automatically a summary that is equivalent to the spec that is needed currently (e.g. my_malloc(Z)p).
Small extension for perlbench: we also need a sizeofness quasi-summarised for globals, so that it can be sourced by a read from a global. A simple version is just a mapping that says 'this global has this sizeofness'. We could ideally infer it from:
- initializers (the load-time case)
- direct writes to the global (the run-time case)
If all writes we see (hopefully only one) write the same sizeofness into a global, we might infer that it has that sizeofness. That is obviously unsound, thanks to indirect writes.
An experiment could usefully compare whether we can do as well as the current approach that uses __current_allocsite. i.e. this is the "manual" approach using proactively identified wrappers to propagate allocation site information, and proactively latched allocation sites at the base of the allocation call chain. Instead we'd be precomputing only a summary for all functions, then reactively inferring sizeofness as allocation calls come in at run time, by walking up the stack and reasoning retroactively back up the callchain.
One problem with this is that all functions, even ones that don't have any role in passing sizeof around, would need a summary that is represented at run time. For example, a function that takes an int xand returns 2*x would still get a sizeofness summary. We could avoid this using a static call graph calculation, although by itself this would only let us prune functions that definitely never call malloc (or other allocation function) either themselves or transitively in any callee. A few more functions could be pruned if they have a literal known sizeofness. E.g. if I call f(42) but its only callee is g(sizeof (float)), we know that f is not passing a sizeofness to g so that call graph edge can be pruned. In other words, the graph we want is the graph if possibly-sizeofness-passing calls/returns. Summaries are only needed for functions appearing in this graph. That is still potentially a lot of functions... unclear!
Also, since any global non-hidden function symbol is available for dynamic linking, we can't see all its callers and so if we are being maximally conservative, we have to assume that any integer argument might be receiving a value with sizeofness. If it provably does not pass that sizeofness on to its callees then we can prune it from the graph, but that seems unlikely.