ncnn icon indicating copy to clipboard operation
ncnn copied to clipboard

Memory Pool Improvement For Variadic Sized Inputs

Open LinHeLurking opened this issue 2 years ago • 3 comments

A Better Memory Pool With Simple Greedy Strategy

This pull request contains several modifications on current ncnn pool allocators.

Basic Assumption

The program runs in multiple stages. In each stage, memory sizes the program is asking for are lied in an (relatively small) interval. When program goes into a new stage, the size interval will change. The intervals may or may not overlap.

Double Ended Greedy Strategy

If all cached chunks in the pool cannot satisfy current allocation, the allocator tries to remove an outdated chunk.

  • If this allocation size is larger than any previous chunk, remove the smallest chunk (see below).
                NEW
           ^============^
        ___^_____       ^
       |   ^     |      ^
----------------------------------------
         OLD        ^    
                    |
                    |
                ALLOCATION
  • If this allocation size is smaller than any previous chunk, remove the largest chunk.
  • If this allocation size is between the largest and smallest, do not remove.

Simulated Results

Following codes can be used to simulate a multiple stage running scenario.

int thrd_share_allocator_test(int repeat, ncnn::Allocator* allocator)
{
    // Fix seed for better reproducibility.
    std::mt19937 rng(0xbadf00d);
    std::uniform_int_distribution<size_t> dist;

    std::deque<ncnn::Mat> occupied;
    int batch_size = 30;
    int batch_num = repeat;
    for (int batch_id = 0; batch_id < batch_num; ++batch_id)
    {
        // Centroid and radius absolute size does not matter.
        // You can scale them arbitrarily.
        size_t centroid = std::max(1UL, dist(rng)) % 100000000;              // max centroid is 100 MB
        size_t radius = std::min(centroid, std::min(dist(rng), 30000000UL)); // max radius is 30 MB

        std::uniform_int_distribution<size_t> size_dist(centroid - radius, centroid + radius);
        std::vector<void*> allocated;
        for (int i = 0; i < batch_size; ++i)
        {
            size_t size = size_dist(rng) & ~0x13FFF; // minimal step is 5 KB
            void* ptr = allocator->fastMalloc(size);
            allocated.push_back(ptr);
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
        for (void* ptr : allocated)
        {
            allocator->fastFree(ptr);
        }
    }
    return 0;
}

int test_allocator(int thrd_num = 4, int repeat = int(1e1))
{
    auto allocator = new ncnn::PoolAllocator;
    std::vector<std::future<int> > futures;
    for (int i = 0; i < thrd_num; ++i)
    {
        futures.emplace_back(std::async(thrd_share_allocator_test, repeat, allocator));
    }
    int flag = 0;
    for (auto& future : futures)
    {
        flag |= future.get();
    }
    return flag;
}

int main()
{
    return 0 || test_allocator();
}

Memory usage: (Memory usage is gathered with valgrind: valgrind --tool=massif --time-unit=ms <executable>)

image

LinHeLurking avatar Sep 07 '22 15:09 LinHeLurking

I'm not sure why old commits occur here. Actually only cee9d22 is the thing I did the magic. 😆

LinHeLurking avatar Sep 07 '22 15:09 LinHeLurking

Codecov Report

Merging #4190 (4066294) into master (479a73a) will decrease coverage by 0.18%. The diff coverage is 48.57%.

@@            Coverage Diff             @@
##           master    #4190      +/-   ##
==========================================
- Coverage   94.43%   94.25%   -0.19%     
==========================================
  Files         749      750       +1     
  Lines      179049   179405     +356     
==========================================
+ Hits       169091   169094       +3     
- Misses       9958    10311     +353     
Impacted Files Coverage Δ
src/allocator.h 87.50% <ø> (ø)
src/allocator.cpp 75.11% <47.05%> (-1.94%) :arrow_down:
src/net.cpp 65.37% <100.00%> (ø)
src/command.cpp 72.70% <0.00%> (-14.94%) :arrow_down:
src/pipeline.cpp 58.69% <0.00%> (-2.18%) :arrow_down:
src/layer/vulkan/reshape_vulkan.cpp 92.01% <0.00%> (-2.14%) :arrow_down:
src/layer/vulkan/packing_vulkan.cpp 81.70% <0.00%> (-1.88%) :arrow_down:
src/layer/vulkan/permute_vulkan.cpp 96.99% <0.00%> (-1.60%) :arrow_down:
src/layer/vulkan/reorg_vulkan.cpp 96.35% <0.00%> (-1.57%) :arrow_down:
src/layer/vulkan/pixelshuffle_vulkan.cpp 96.35% <0.00%> (-1.57%) :arrow_down:
... and 24 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov-commenter avatar Sep 07 '22 15:09 codecov-commenter

please investigate sanitizer error

nihui avatar Sep 13 '22 04:09 nihui

I've tried but cannot reproduce the error of test_squeezenet. Is there any way to run CTest with extra checks on?

LinHeLurking avatar Sep 23 '22 11:09 LinHeLurking

cmake -DCMAKE_BUILD_TYPE=debug -DNCNN_ASAN=ON -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..

nihui avatar Sep 23 '22 22:09 nihui

I've tried but cannot reproduce the error of test_squeezenet. Is there any way to run CTest with extra checks on?

May be you can refer to this line about this file linux-x64-cpu-gcc-san.yml :)

LRY89757 avatar Sep 24 '22 01:09 LRY89757

(First of all, sorry for late response.)

It turns out that ASAN switch of CMAKE is ignored if it is set in VS Code setting.json file. (But why ???) After all, I've fixed the sanitizer error. 😆

LinHeLurking avatar Sep 26 '22 08:09 LinHeLurking

Thanks for your contribution !

nihui avatar Oct 09 '22 02:10 nihui