Replace pow2bin allocator with a dlmalloc-based allocator
- import dlmalloc (latest version, MIT licensed)
- create an encapsulation class DLMalloc that exposes only the relevant functionality. To avoid using non-static/templated members of the parent class, we use the mspace variant of dlmalloc
- use DLMalloc in a new ShmemAllocatorStrategy
- replace pow2bin in single_heap
Possible drawbacks/missing features:
- the MORECORE functionality is not implemented (it would create a dependency between non-static class members and static functions) -> we cannot resize the symmetric heap (initial allocation only)
- the mspace allocator stores some metadata in the symmetric heap, which means that metadata in device memory is manipulated from the host during allocation.
- I have set the alignment to 2MB. This may be overkill in some cases; using the memalign functionality of dlmalloc may be a better approach.
- Performance is untested
- Unit testing?
@abouteiller to follow up on the comment on the JIRA ticket: is there a test (or at least visual confirmation by reading the code) that dlmalloc is able to combine two freed memory allocations stemming from separate rocshmem_alloc/free() operations if they end up being consecutive in memory? This was ultimately why we wanted to replace the previous allocator.
Yes, this is visible in mspace_free:5706 and onward: freed chunks are consolidated with the preceding and succeeding free space. It also looks for a best-fit chunk in the gaps when allocating chunks (with 2 different strategies for small and large requests).
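To make the coalescing behavior discussed above concrete, here is a minimal toy sketch. It is not dlmalloc's code and does not use rocSHMEM: `ToyHeap`, `alloc`, `release`, and `coalescing_demo` are hypothetical names invented for illustration, and the toy uses first-fit rather than dlmalloc's best-fit search. It only shows why merging a freed block with its preceding and succeeding free neighbours lets two adjacent frees satisfy one larger request.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <optional>

// Toy offset-based allocator (illustration only, NOT dlmalloc).
// Free blocks are stored as {start offset -> size}, ordered by offset,
// so the neighbours of a freed block are easy to locate.
class ToyHeap {
 public:
  explicit ToyHeap(std::size_t capacity) { free_[0] = capacity; }

  // First-fit allocation. Returns the start offset of the block, or
  // nullopt when no single free block is large enough.
  std::optional<std::size_t> alloc(std::size_t size) {
    for (auto it = free_.begin(); it != free_.end(); ++it) {
      if (it->second < size) continue;
      const std::size_t offset = it->first;
      const std::size_t remaining = it->second - size;
      free_.erase(it);
      if (remaining > 0) free_[offset + size] = remaining;
      return offset;
    }
    return std::nullopt;
  }

  // Return a block to the heap and merge it with adjacent free blocks.
  void release(std::size_t offset, std::size_t size) {
    auto next = free_.lower_bound(offset);
    // Merge with the preceding free block if it ends where this one starts.
    if (next != free_.begin()) {
      auto prev = std::prev(next);
      if (prev->first + prev->second == offset) {
        offset = prev->first;
        size += prev->second;
        free_.erase(prev);  // erasing prev leaves `next` valid in std::map
      }
    }
    // Merge with the succeeding free block if it starts where this one ends.
    if (next != free_.end() && offset + size == next->first) {
      size += next->second;
      free_.erase(next);
    }
    free_[offset] = size;
  }

  std::size_t free_block_count() const { return free_.size(); }

 private:
  std::map<std::size_t, std::size_t> free_;
};

// Two separately allocated, adjacent blocks are freed one after the other;
// the merged space must then satisfy a request neither block could alone.
bool coalescing_demo() {
  ToyHeap heap(300);
  auto a = heap.alloc(100);
  auto b = heap.alloc(100);
  auto c = heap.alloc(100);  // keeps the tail of the heap occupied
  if (!a || !b || !c) return false;
  heap.release(*a, 100);
  heap.release(*b, 100);
  auto big = heap.alloc(200);
  return big.has_value() && *big == *a;
}
```

A unit test for the real allocator could follow the same pattern as `coalescing_demo`: fill the heap, free two consecutive allocations, and check that a request of their combined size succeeds.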
Passes all tests now, ready for review