Enzyme
Custom tape allocator support
Capturing an offline discussion with @wsmoses
Stage 1:
- allow frontend to register a custom allocation function for tapes
- This needs some thought for CUDA device code.
- The address space might differ from that of a normal malloc; the goal is to turn tapes into full Julia objects.
- Frontend should also be able to provide a custom free or indicate that a free is not needed.
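As a rough illustration of Stage 1, the sketch below shows one way a frontend could register an allocation function together with an optional free. Every name here (`TapeAllocFn`, `register_tape_allocator`, `allocate_tape`, `release_tape`, `counting_alloc`) is hypothetical and not part of Enzyme's API; it also ignores address spaces and the CUDA device case entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical sketch: none of these names exist in Enzyme.
using TapeAllocFn = void *(*)(std::size_t bytes);
using TapeFreeFn = void (*)(void *tape);

// Defaults mirror the "normal malloc" behaviour.
static void *default_tape_alloc(std::size_t bytes) { return std::malloc(bytes); }
static void default_tape_free(void *tape) { std::free(tape); }

struct TapeAllocator {
    TapeAllocFn alloc = default_tape_alloc;
    TapeFreeFn free_fn = default_tape_free;  // nullptr means "no free needed"
};

static TapeAllocator g_tape_allocator;

// Called by the frontend. Passing nullptr for free_fn indicates that the
// allocator's memory is reclaimed some other way (e.g. by a GC).
void register_tape_allocator(TapeAllocFn alloc, TapeFreeFn free_fn) {
    g_tape_allocator.alloc = alloc;
    g_tape_allocator.free_fn = free_fn;
}

// What the generated code would call instead of malloc/free.
void *allocate_tape(std::size_t bytes) { return g_tape_allocator.alloc(bytes); }

void release_tape(void *tape) {
    if (g_tape_allocator.free_fn != nullptr) {
        g_tape_allocator.free_fn(tape);
    }
}

// Example frontend allocator (assumption): counts calls and hands out
// malloc'd memory, standing in for an allocator that returns managed objects.
static int g_alloc_calls = 0;
void *counting_alloc(std::size_t bytes) {
    ++g_alloc_calls;
    return std::malloc(bytes);
}
```

A frontend with a garbage collector would call `register_tape_allocator(counting_alloc, nullptr)`, after which `release_tape` becomes a no-op and the tape's lifetime is left to the frontend.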
Stage 2:
- Besides the tape size, also provide a runtime layout descriptor. This is needed for GC support so that Enzyme.jl can find sub-tapes and Julia objects stored on the tape.
- We might need support for emitting write barriers: e.g., when we store a Julia object to the tape, we will have to insert a call to an intrinsic.
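The Stage 2 ideas can be sketched as follows, under heavy assumptions: a layout descriptor is modelled as a list of byte offsets tagged as either a GC-managed object or a sub-tape, and a write barrier is modelled as a callback invoked after each object store. The types `SlotKind`, `TapeSlot`, `TapeLayout` and the functions `gc_roots` and `store_object` are invented for illustration; the real descriptor format and intrinsic are not specified in this issue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sketch: none of these names exist in Enzyme or Enzyme.jl.
enum class SlotKind { GcObject, SubTape };

struct TapeSlot {
    std::size_t offset;  // byte offset of a pointer-sized slot in the tape
    SlotKind kind;       // what the pointer in that slot refers to
};

struct TapeLayout {
    std::size_t size_bytes;       // total tape size
    std::vector<TapeSlot> slots;  // slots a GC would need to inspect
};

// Collect the non-null pointers stored in GcObject slots. A real GC
// integration would also follow SubTape slots using their own descriptors.
std::vector<void *> gc_roots(const unsigned char *tape, const TapeLayout &layout) {
    std::vector<void *> roots;
    for (const TapeSlot &slot : layout.slots) {
        if (slot.kind != SlotKind::GcObject) continue;
        void *ptr = nullptr;
        std::memcpy(&ptr, tape + slot.offset, sizeof ptr);
        if (ptr != nullptr) roots.push_back(ptr);
    }
    return roots;
}

// Stand-in for the intrinsic call: receives the tape and the stored object.
using WriteBarrierFn = void (*)(void *tape, void *stored);

// Store an object pointer into the tape, then run the barrier if one is set.
void store_object(unsigned char *tape, std::size_t offset, void *obj,
                  WriteBarrierFn barrier) {
    std::memcpy(tape + offset, &obj, sizeof obj);
    if (barrier != nullptr) barrier(tape, obj);
}

// Example barrier (assumption): only counts how often it was invoked.
static int g_barrier_calls = 0;
void count_barrier(void *, void *) { ++g_barrier_calls; }
```

The design choice illustrated here is that the descriptor travels alongside the tape size, so the frontend can walk a tape without knowing how Enzyme laid it out.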
We should also have allocators with different alignments for leaf node vector mode.
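For the alignment point, a minimal sketch of an alignment-aware tape allocation is shown below. The function name `allocate_tape_aligned` is hypothetical; it relies on C++17 `std::aligned_alloc`, which expects a supported power-of-two alignment and a size that is a multiple of it, so the size is rounded up first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch: not an Enzyme API. `alignment` is assumed to be a
// power of two supported by the platform (e.g. 16, 32, 64).
void *allocate_tape_aligned(std::size_t bytes, std::size_t alignment) {
    // std::aligned_alloc requires the size to be a multiple of the alignment.
    std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
}
```

Memory obtained this way is released with `std::free`. A vector-mode tape holding, say, 64-byte-wide values could then request `allocate_tape_aligned(bytes, 64)`.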