p4c
Avoid copying out/inout args when inlining functions
This is a proposal/work in progress for a way of reducing the number of copies introduced when inlining.
One particularly problematic case is a big struct (e.g., all the headers) that is passed around to functions as an `inout` argument so they can access or modify it. Introducing a copy of all the headers is particularly inefficient, and tough to optimize away later if it turns out to be unnecessary.
This change makes only a minimal check: if the actual argument passed to an `inout` or `out` parameter is local to the caller, the callee can't possibly access it directly, so no copy is needed and the inlined code can just use it directly.
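To make the minimal check concrete, here is a small self-contained sketch. It is a toy model, not p4c's IR or the code in this change: `Caller`, `Direction`, and `needsCopy` are hypothetical names invented for illustration, and an argument is represented only by the name of its root variable.

```cpp
#include <set>
#include <string>

// Hypothetical toy model; none of these types exist in p4c.
enum class Direction { In, Out, InOut };

struct Caller {
    // Names of variables declared inside the caller's own body.
    std::set<std::string> locals;
};

// Returns true when the inliner should still introduce a temporary copy
// for this argument. argRoot is the root variable of the argument
// expression, e.g. "hdrs" for an argument written as hdrs.ipv4.
bool needsCopy(const Caller& caller, Direction dir, const std::string& argRoot) {
    // The check only concerns out/inout arguments; for 'in' arguments this
    // sketch conservatively keeps the copy.
    if (dir == Direction::In) return true;
    // A variable local to the caller is invisible to the callee, so the
    // callee cannot touch it except through the parameter: no copy needed.
    return caller.locals.count(argRoot) == 0;
}
```

With `locals = {"tmp"}`, an `inout` argument rooted at `tmp` needs no copy, while one rooted at a non-local such as `hdrs` still gets one.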
A more general check would actually look at the callee: if it does not access whatever is passed as the argument, directly or indirectly, then no copy is needed. The indirect part is complex, since it requires looking recursively at whatever the callee calls, including any extern functions or methods on instances.
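A rough sketch of what that more general check could look like, again as a toy model rather than anything in p4c: `FunctionInfo`, `Program`, `mayAccess`, and `needsCopyGeneral` are hypothetical, and the sketch assumes each function's direct accesses and callees have already been collected. Externs and unknown functions are treated conservatively as possibly accessing anything.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-function summary; not a p4c data structure.
struct FunctionInfo {
    std::set<std::string> directAccesses;  // non-local names the body reads or writes
    std::vector<std::string> callees;      // functions/methods called from the body
    bool isExtern = false;                 // body unknown to the compiler
};

using Program = std::map<std::string, FunctionInfo>;

// True if fn, or anything reachable from it, may access var.
bool mayAccess(const Program& prog, const std::string& fn,
               const std::string& var, std::set<std::string>& visited) {
    // Already explored (or being explored): adds nothing new.
    if (!visited.insert(fn).second) return false;
    auto it = prog.find(fn);
    // Unknown function or extern: assume the worst.
    if (it == prog.end() || it->second.isExtern) return true;
    const FunctionInfo& info = it->second;
    if (info.directAccesses.count(var) != 0) return true;
    for (const auto& callee : info.callees) {
        if (mayAccess(prog, callee, var, visited)) return true;
    }
    return false;
}

// A copy is needed only if the callee may reach the argument's root
// variable by some path other than the parameter itself.
bool needsCopyGeneral(const Program& prog, const std::string& callee,
                      const std::string& argRoot) {
    std::set<std::string> visited;
    return mayAccess(prog, callee, argRoot, visited);
}
```

The conservative treatment of externs is where a target-specific policy could plug in, since a target knows more about what its externs actually touch.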
Or perhaps there should be a target-specific policy that controls this, which would allow better target-specific understanding of externs and what they do.
The problem is connected to doing too much in the frontend: function inlining happens in the frontend (when arguably it should not) in order to allow inlining into type params (`bit<N>` and `int<N>` in particular). If we remove the general inlining in the frontend (and only do it in the midend), it would greatly help with target-specific policies and tweaks.