analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Proposal: Unify handling of malloc and escaping locals

Open michael-schwarz opened this issue 1 year ago • 2 comments

Currently, malloc blobs are syntactical globals, i.e., they become shared between threads upon their creation. However, additional such variables can be created after having entered multi-threaded mode. Maybe it would be a good idea to not make them syntactic globals, but use some mechanism to escape them as soon as their address is leaked into something that is a global (and emit escape events)?

This would allow unifying the handling of escaping locals and malloc blobs somewhat. It would mean that there is only flavor of introducing new global variables into the global invariant to handle instead of two. Hopefully, one such mechanism would be less fragile than what we currently have.

This may also have some precision benefits in cases where a malloc is thread-local and the block never needs to be escaped, as it can be handled flow-sensitively in this case, (and updates may sometimes even be destructive).

The challenges are:

  • What to do about non-unique threads, and non-unique blobs?
  • What happens if a blob escapes in one context and not the other? Do pointers then point potentially to both?
  • How does this relate to #69, #1069, #1070

Maybe it also time for us to rethink all this Escape, Malloc, ... business, provide sound theoretical foundations for it, and write it up somewhere. I don't think there is much existing work on this, but it is clearly of relevance for thread-modular analyses of globals.

michael-schwarz avatar May 27 '24 15:05 michael-schwarz

This may also have some precision benefits in cases where a malloc is thread-local and the block never needs to be escaped, as it can be handled flow-sensitively in this case, (and updates may sometimes even be destructive).

This indeed is a value-based variation of what mallocFresh tried to do just for races. Since we couldn't correctly nail that down, I suspect it will only be more involved for value analysis (considering how tricky protection-based value-analysis turned out to be compared to protection-based race analysis).

Standard recency analysis might be the way to deal with non-uniqueness, but I don't know a way to easily implement it in our framework. In an allocation loop, the second iteration must turn recent things into non-recent (non-unique) things, so all analyses that use varinfos must know to do a deep substitution before the second allocation at a site. But it might already have propagated into global unknowns that cannot just be substituted like that.

sim642 avatar May 28 '24 07:05 sim642

This may be interesting for the escape analysis: https://dl.acm.org/doi/10.1145/3103111.3104039

michael-schwarz avatar Aug 16 '24 12:08 michael-schwarz