CoreNEURON memory allocation routines are confusing
Describe the issue
The current codebase uses a wide range of different memory [de]allocation routines, starting from a mix of `new`/`delete` and `malloc`/`free` and extending (via other system routines such as `posix_memalign`) to various homespun methods like `coreneuron::[de]allocate_unified`:
https://github.com/BlueBrain/CoreNeuron/blob/63066908dc4ca12f25b93906910a4aecdb3af12e/coreneuron/utils/memory.h#L33-L39
`alloc_memory`, `calloc_memory`, `free_memory`:
https://github.com/BlueBrain/CoreNeuron/blob/63066908dc4ca12f25b93906910a4aecdb3af12e/coreneuron/utils/memory.h#L33-L39
and helper structs like `MemoryManaged`:
https://github.com/BlueBrain/CoreNeuron/blob/63066908dc4ca12f25b93906910a4aecdb3af12e/coreneuron/utils/memory.h#L142-L167
Some of these names are not very descriptive, and the behaviour of some of them changes according to compile-time and runtime options. This has led to bugs (see https://github.com/BlueBrain/CoreNeuron/issues/594, for example), such as when we end up with mismatched allocations and deallocations.
This should be improved, with more descriptively named methods and better organisation that enforces consistent pairing of allocation and deallocation functions.
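One way to enforce pairing is to have each allocation function return an owning handle whose type encodes the matching deallocation. The sketch below is only an illustration of that idea for aligned host memory: `FreeDeleter`, `aligned_host_ptr` and `allocate_aligned_host` are hypothetical names that do not exist in CoreNEURON, and the 64-byte default alignment is an arbitrary assumption.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical deleter: memory obtained from posix_memalign must be released with free.
struct FreeDeleter {
    void operator()(void* p) const noexcept {
        std::free(p);
    }
};

// Hypothetical owning handle: the deleter is part of the type, so the caller
// cannot accidentally release the buffer with delete[] or another routine.
template <typename T>
using aligned_host_ptr = std::unique_ptr<T[], FreeDeleter>;

// Hypothetical allocation function. `alignment` must be a power of two and a
// multiple of sizeof(void*), as posix_memalign requires.
template <typename T>
aligned_host_ptr<T> allocate_aligned_host(std::size_t count, std::size_t alignment = 64) {
    static_assert(std::is_trivial<T>::value,
                  "this sketch returns raw storage and runs no constructors or destructors");
    void* raw = nullptr;
    if (posix_memalign(&raw, alignment, count * sizeof(T)) != 0) {
        throw std::bad_alloc{};
    }
    return aligned_host_ptr<T>{static_cast<T*>(raw)};
}
```

A caller would write something like `auto buf = allocate_aligned_host<double>(1024);` and never name the deallocation routine at all; the buffer is released when `buf` goes out of scope.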
Discussion
Note that we need to be able to request a few different types of allocation. For example, in GPU builds, we may need to distinguish between:
- Host-only memory.
- Unified memory (accessible from both host and device) even when `CORENRN_ENABLE_CUDA_UNIFIED_MEMORY=OFF`, for things like Random123 state where we require unified memory (https://github.com/BlueBrain/CoreNeuron/pull/595).
- Unified memory if `CORENRN_ENABLE_CUDA_UNIFIED_MEMORY=ON`, otherwise host memory.
Additionally, the last two should probably return host-only memory in GPU builds where the GPU was not enabled at runtime by passing `--gpu` or `coreneuron.gpu = True`.
Ideally we would handle these through a more uniform API that makes these distinctions clear.
Right now, `{alloc,calloc,free}_memory` and `MemoryManaged` provide point 3 (though without a test on `--gpu`), `coreneuron::[de]allocate_unified` provide point 2, and a mix of standard APIs provide point 1.
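To make the three cases concrete, here is a rough sketch of how a single request type could be resolved to an actual backing, assuming the suggestion above that both unified variants fall back to host memory when the GPU is not enabled at runtime. Every name here (`MemoryKind`, `Backing`, `BuildConfig`, `resolve`, `Block`, `allocate`, `deallocate`) is hypothetical, and `malloc`/`free` stand in for the real calls so the example builds without CUDA.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

namespace sketch {

// What the caller asks for (points 1, 2 and 3 in the list above).
enum class MemoryKind { Host, Unified, UnifiedIfEnabled };

// What the allocator actually hands out.
enum class Backing { Host, Unified };

// Compile-time options, modelled as plain values for the sake of the example.
struct BuildConfig {
    bool gpu_build;               // built with GPU support
    bool unified_memory_default;  // CORENRN_ENABLE_CUDA_UNIFIED_MEMORY=ON
};

// Resolve a request to a backing, given build options and the runtime --gpu choice.
constexpr Backing resolve(MemoryKind kind, BuildConfig build, bool gpu_enabled_at_runtime) {
    if (!build.gpu_build || !gpu_enabled_at_runtime) {
        return Backing::Host;  // assumed fallback: no GPU in use, so host-only memory
    }
    switch (kind) {
    case MemoryKind::Host:
        return Backing::Host;
    case MemoryKind::Unified:
        return Backing::Unified;
    case MemoryKind::UnifiedIfEnabled:
        return build.unified_memory_default ? Backing::Unified : Backing::Host;
    }
    return Backing::Host;
}

// The block remembers how it was allocated, so deallocation cannot be mismatched.
struct Block {
    void* ptr;
    std::size_t bytes;
    Backing backing;
};

inline Block allocate(std::size_t bytes, MemoryKind kind, BuildConfig build, bool gpu_enabled_at_runtime) {
    const Backing backing = resolve(kind, build, gpu_enabled_at_runtime);
    // Placeholder: a real GPU build would call cudaMallocManaged for Backing::Unified.
    void* p = std::malloc(bytes);
    if (p == nullptr && bytes != 0) {
        throw std::bad_alloc{};
    }
    return Block{p, bytes, backing};
}

inline void deallocate(Block& block) noexcept {
    // Placeholder: a real GPU build would dispatch on block.backing and call
    // cudaFree for Backing::Unified.
    std::free(block.ptr);
    block.ptr = nullptr;
}

}  // namespace sketch
```

The point of the design is that the decision is made in exactly one place (`resolve`) and recorded in the returned `Block`, rather than being re-derived independently by each allocation and deallocation routine.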