
[IR] Add option to allocate placeholders in the activations memory

Open mciprian13 opened this issue 4 years ago • 2 comments

Currently, placeholders are allocated in a separate memory buffer called "mutableWeights". As a further memory optimization, one could instead allocate the placeholders in the "activations" memory pool, so that the memory used for the input placeholders is reused for later intermediate computations. When a model has large inputs (e.g. images), the memory savings can be significant. The downside is that the input data is overwritten, but in most cases this is not a problem. Add a command-line option (e.g. -allocated-placeholders-as-activations) to enable this mechanism (default would be false).
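
To give a rough idea of the potential gain, here is a minimal sketch in plain C++ (not Glow code). The `Buffer` struct and the functions `peakLiveBytes`, `separatePools` and `sharedPool` are hypothetical names made up for this illustration. It assumes an ideal allocator whose footprint equals the peak number of live bytes, and it only models input placeholders (an output placeholder would have to stay live until the end of the run).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical buffer description for illustration; not a Glow type.
struct Buffer {
  std::size_t sizeBytes;
  int firstUse;       // index of the first instruction touching the buffer
  int lastUse;        // index of the last instruction touching the buffer
  bool isPlaceholder; // true for an input placeholder
};

// Peak number of bytes that are live at the same time. This is a lower
// bound on the pool size; a real allocator may need more due to
// fragmentation and alignment.
std::size_t peakLiveBytes(const std::vector<Buffer> &buffers, int numInstrs) {
  std::size_t peak = 0;
  for (int t = 0; t < numInstrs; ++t) {
    std::size_t live = 0;
    for (const Buffer &b : buffers) {
      if (b.firstUse <= t && t <= b.lastUse) {
        live += b.sizeBytes;
      }
    }
    peak = std::max(peak, live);
  }
  return peak;
}

// Current scheme: placeholders sit in their own pool for the whole run.
std::size_t separatePools(const std::vector<Buffer> &buffers, int numInstrs) {
  std::size_t placeholderBytes = 0;
  std::vector<Buffer> activations;
  for (const Buffer &b : buffers) {
    if (b.isPlaceholder) {
      placeholderBytes += b.sizeBytes;
    } else {
      activations.push_back(b);
    }
  }
  return placeholderBytes + peakLiveBytes(activations, numInstrs);
}

// Proposed scheme: placeholders share the activations pool and are live
// from entry (instruction 0) until their last use.
std::size_t sharedPool(const std::vector<Buffer> &buffers, int numInstrs) {
  std::vector<Buffer> all = buffers;
  for (Buffer &b : all) {
    if (b.isPlaceholder) {
      b.firstUse = 0;
    }
  }
  return peakLiveBytes(all, numInstrs);
}
```

For example, with a 600-byte input read only by instruction 0 and two 400-byte activations live over instructions 0-1 and 1-2, `separatePools` gives 1400 bytes while `sharedPool` gives 1000 bytes.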

mciprian13 • Oct 22 '20 10:10

@mciprian13 Yeah, this idea makes sense.

A generalization of this approach would be to allow backends to customize the mapping from a logical buffer to the contiguous memory pool it should be allocated in. This way one could e.g. map all/some mutable weights and activations to the same memory pool. Or one could map each buffer to its own memory pool if needed. This change may affect some places in LLVMIRCodeGen, because the computation of the buffer address would need to be adjusted accordingly.
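
A rough sketch of what such a backend hook could look like, in standalone C++. Every name here (`BufferKind`, `LogicalBuffer`, `PoolMapper`, `DefaultMapper`, `MergedMapper`, `assignOffsets`, `addressOf`) is hypothetical and does not correspond to an existing Glow interface. Offsets are assigned with a simple bump allocator per pool, with no reuse and no alignment handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not part of Glow.
enum class BufferKind { ConstantWeight = 0, MutableWeight = 1, Activation = 2 };

struct LogicalBuffer {
  std::string name;
  BufferKind kind;
  std::size_t sizeBytes;
};

using PoolId = int;

// A backend would override this to decide which pool a buffer lives in.
struct PoolMapper {
  virtual ~PoolMapper() = default;
  virtual PoolId poolFor(const LogicalBuffer &buffer) const = 0;
};

// One pool per buffer kind (mirrors the current three-region layout).
struct DefaultMapper : PoolMapper {
  PoolId poolFor(const LogicalBuffer &buffer) const override {
    return static_cast<PoolId>(buffer.kind);
  }
};

// Mutable weights are placed in the same pool as activations.
struct MergedMapper : PoolMapper {
  PoolId poolFor(const LogicalBuffer &buffer) const override {
    if (buffer.kind == BufferKind::MutableWeight) {
      return static_cast<PoolId>(BufferKind::Activation);
    }
    return static_cast<PoolId>(buffer.kind);
  }
};

struct Placement {
  PoolId pool;
  std::size_t offset; // byte offset inside the pool
};

// Assign each buffer an offset in the pool chosen by the mapper.
std::map<std::string, Placement>
assignOffsets(const std::vector<LogicalBuffer> &buffers,
              const PoolMapper &mapper) {
  std::map<PoolId, std::size_t> nextOffset;
  std::map<std::string, Placement> placements;
  for (const LogicalBuffer &b : buffers) {
    PoolId pool = mapper.poolFor(b);
    placements[b.name] = Placement{pool, nextOffset[pool]};
    nextOffset[pool] += b.sizeBytes;
  }
  return placements;
}

// Code generation would compute a buffer address as pool base + offset.
std::uint8_t *addressOf(const Placement &p,
                        const std::map<PoolId, std::uint8_t *> &poolBases) {
  return poolBases.at(p.pool) + p.offset;
}
```

With `MergedMapper`, a 32-byte mutable weight followed by a 64-byte activation would land at offsets 0 and 32 of the same pool, whereas `DefaultMapper` puts each at offset 0 of its own pool.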

BTW, your approach would probably require changes in IROptimizer.cpp, since it affects how the liveness of "mutableWeights" buffers is defined. Currently they are considered to have a global live range spanning the execution of the whole NN model, IIRC. With your proposed change, these "mutableWeights" buffers should be treated pretty much like ordinary activations that happen to be live upon entry. With the generalization I described above, one should probably restrict those buffer reuse optimizations to buffers from the same memory pool, as reusing memory across memory pools does not make any sense.
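
The liveness rule described here could be sketched roughly as follows. This is standalone C++ with hypothetical names (`BufferInfo`, `Interval`, `liveInterval`, `overlaps`, `canShareMemory`); it is not how IROptimizer.cpp is actually written. It also assumes the mutable weight is an input; an output would presumably have to stay live until the end of the run.

```cpp
// Hypothetical types for illustration; not taken from IROptimizer.cpp.
struct Interval {
  int begin; // first instruction index at which the buffer is live
  int end;   // last instruction index at which the buffer is live
};

struct BufferInfo {
  bool isMutableWeight;
  int firstUse;
  int lastUse;
  int pool; // id of the memory pool the buffer is allocated in
};

// Live interval of a buffer in a program with numInstrs instructions.
Interval liveInterval(const BufferInfo &b, int numInstrs,
                      bool placeholdersAsActivations) {
  if (!b.isMutableWeight) {
    return {b.firstUse, b.lastUse};
  }
  if (placeholdersAsActivations) {
    // Treated like an activation that is already live upon entry.
    return {0, b.lastUse};
  }
  // Current behavior as described above: live for the whole execution.
  return {0, numInstrs - 1};
}

bool overlaps(const Interval &a, const Interval &b) {
  return a.begin <= b.end && b.begin <= a.end;
}

// Two buffers may reuse the same memory only if they are in the same
// pool and their live intervals do not overlap.
bool canShareMemory(const BufferInfo &a, const BufferInfo &b, int numInstrs,
                    bool placeholdersAsActivations) {
  if (a.pool != b.pool) {
    return false;
  }
  return !overlaps(liveInterval(a, numInstrs, placeholdersAsActivations),
                   liveInterval(b, numInstrs, placeholdersAsActivations));
}
```

With the option off, an input's interval covers every instruction, so nothing can reuse its memory; with the option on, an activation that is first used after the input's last use becomes a candidate for reuse.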

opti-mix • Oct 22 '20 16:10

@opti-mix Your generalized proposal makes sense for the long term, but I don't have the bandwidth right now for that kind of refactoring. I will attempt what I proposed here after I investigate the logic in IROptimizer.cpp in greater detail.

mciprian13 • Oct 23 '20 15:10