mgcpp
Add special GPU memory allocators
mgcpp internally allocates a lot of temporary buffers, but cudaMalloc performs very poorly when called this often. Using a low-latency memory allocator should boost performance considerably.
References:
- tcmalloc, Google
- THC allocator, PyTorch
- halloc, slides
- scatteralloc
- CMalloc
- pool allocator, Tensorflow
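
To make the proposal concrete, here is a minimal sketch of a caching allocator, the general idea behind pool-style allocators like those listed above. It is not mgcpp code and not the implementation of any of the referenced projects: `caching_allocator`, `host_alloc`, and `host_free` are hypothetical names, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so the sketch compiles without the CUDA toolkit. The point is only that freed blocks are kept and handed back on later requests instead of going through the slow raw allocator each time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <unordered_map>

// Hypothetical sketch, not part of mgcpp.
// The raw allocator is injected so the caching logic can be tried on the host.
using raw_alloc_fn = void* (*)(std::size_t);
using raw_free_fn  = void (*)(void*);

// Host stand-ins (assumption). A device version would wrap the CUDA runtime:
//   void* p = nullptr;
//   return cudaMalloc(&p, bytes) == cudaSuccess ? p : nullptr;
// and call cudaFree(p) to release.
inline void* host_alloc(std::size_t bytes) { return std::malloc(bytes); }
inline void  host_free(void* p)            { std::free(p); }

class caching_allocator {
public:
    caching_allocator(raw_alloc_fn alloc, raw_free_fn release)
        : alloc_(alloc), release_(release) {}

    caching_allocator(const caching_allocator&) = delete;
    caching_allocator& operator=(const caching_allocator&) = delete;

    // Only cached blocks are released here; blocks still handed out
    // remain the caller's responsibility.
    ~caching_allocator() { release_cache(); }

    void* allocate(std::size_t bytes) {
        // Reuse the smallest cached block that is large enough.
        auto it = free_blocks_.lower_bound(bytes);
        if (it != free_blocks_.end()) {
            void* p = it->second;
            live_blocks_[p] = it->first;   // remember the real block size
            free_blocks_.erase(it);
            ++cache_hits_;
            return p;
        }
        // Cache miss: fall back to the (slow) raw allocator.
        void* p = alloc_(bytes);
        if (p == nullptr) {
            // Out of memory: drop the cache and retry once.
            release_cache();
            p = alloc_(bytes);
        }
        if (p != nullptr) live_blocks_[p] = bytes;
        return p;
    }

    void deallocate(void* p) {
        if (p == nullptr) return;
        auto it = live_blocks_.find(p);
        if (it == live_blocks_.end()) {
            assert(false && "pointer was not handed out by this allocator");
            return;
        }
        // Keep the block for reuse instead of returning it to the driver.
        free_blocks_.emplace(it->second, p);
        live_blocks_.erase(it);
    }

    // Return every cached block to the raw allocator.
    void release_cache() {
        for (auto& entry : free_blocks_) release_(entry.second);
        free_blocks_.clear();
    }

    std::size_t cache_hits() const    { return cache_hits_; }
    std::size_t cached_blocks() const { return free_blocks_.size(); }

private:
    raw_alloc_fn alloc_;
    raw_free_fn  release_;
    std::multimap<std::size_t, void*>      free_blocks_;  // size -> cached block
    std::unordered_map<void*, std::size_t> live_blocks_;  // block -> size
    std::size_t cache_hits_ = 0;
};
```

With this sketch, calling `allocate(1024)`, `deallocate`, then `allocate(512)` returns the same block without touching the raw allocator. A real implementation would also need to bound how oversized a reused block may be, handle CUDA streams, and be thread-safe; none of that is covered here.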