ncnn
ncnn copied to clipboard
Memory Pool Improvement For Variadic Sized Inputs
A Better Memory Pool With Simple Greedy Strategy
This pull request contains several modifications on current ncnn pool allocators.
Basic Assumption
The program runs in multiple stages. In each stage, memory sizes the program is asking for are lied in an (relatively small) interval. When program goes into a new stage, the size interval will change. The intervals may or may not overlap.
Double Ended Greedy Strategy
If all cached chunks in the pool cannot satisfy current allocation, the allocator tries to remove an outdated chunk.
- If this allocation size is larger than any previous chunk, remove the smallest chunk (see below).
NEW
^============^
___^_____ ^
| ^ | ^
----------------------------------------
OLD ^
|
|
ALLOCATION
- If this allocation size is smaller than any previous chunk, remove the largest chunk.
- If this allocation size is between the largest and smallest, do not remove.
Simulated Results
Following codes can be used to simulate a multiple stage running scenario.
int thrd_share_allocator_test(int repeat, ncnn::Allocator* allocator)
{
// Fix seed for better reproducibility.
std::mt19937 rng(0xbadf00d);
std::uniform_int_distribution<size_t> dist;
std::deque<ncnn::Mat> occupied;
int batch_size = 30;
int batch_num = repeat;
for (int batch_id = 0; batch_id < batch_num; ++batch_id)
{
// Centroid and radius absolute size does not matter.
// You can scale them arbitrarily.
size_t centroid = std::max(1UL, dist(rng)) % 100000000; // max centroid is 100 MB
size_t radius = std::min(centroid, std::min(dist(rng), 30000000UL)); // max radius is 30 MB
std::uniform_int_distribution<size_t> size_dist(centroid - radius, centroid + radius);
std::vector<void*> allocated;
for (int i = 0; i < batch_size; ++i)
{
size_t size = size_dist(rng) & ~0x13FFF; // minimal step is 5 KB
void* ptr = allocator->fastMalloc(size);
allocated.push_back(ptr);
}
std::this_thread::sleep_for(std::chrono::milliseconds(500));
for (void* ptr : allocated)
{
allocator->fastFree(ptr);
}
}
return 0;
}
int test_allocator(int thrd_num = 4, int repeat = int(1e1))
{
auto allocator = new ncnn::PoolAllocator;
std::vector<std::future<int> > futures;
for (int i = 0; i < thrd_num; ++i)
{
futures.emplace_back(std::async(thrd_share_allocator_test, repeat, allocator));
}
int flag = 0;
for (auto& future : futures)
{
flag |= future.get();
}
return flag;
}
int main()
{
return 0 || test_allocator();
}
Memory usage:
(Memory usage is gathered with valgrind: valgrind --tool=massif --time-unit=ms <executable>
)
I'm not sure why old commits occur here. Actually only cee9d22 is the thing I did the magic. 😆
Codecov Report
Merging #4190 (4066294) into master (479a73a) will decrease coverage by
0.18%
. The diff coverage is48.57%
.
@@ Coverage Diff @@
## master #4190 +/- ##
==========================================
- Coverage 94.43% 94.25% -0.19%
==========================================
Files 749 750 +1
Lines 179049 179405 +356
==========================================
+ Hits 169091 169094 +3
- Misses 9958 10311 +353
Impacted Files | Coverage Δ | |
---|---|---|
src/allocator.h | 87.50% <ø> (ø) |
|
src/allocator.cpp | 75.11% <47.05%> (-1.94%) |
:arrow_down: |
src/net.cpp | 65.37% <100.00%> (ø) |
|
src/command.cpp | 72.70% <0.00%> (-14.94%) |
:arrow_down: |
src/pipeline.cpp | 58.69% <0.00%> (-2.18%) |
:arrow_down: |
src/layer/vulkan/reshape_vulkan.cpp | 92.01% <0.00%> (-2.14%) |
:arrow_down: |
src/layer/vulkan/packing_vulkan.cpp | 81.70% <0.00%> (-1.88%) |
:arrow_down: |
src/layer/vulkan/permute_vulkan.cpp | 96.99% <0.00%> (-1.60%) |
:arrow_down: |
src/layer/vulkan/reorg_vulkan.cpp | 96.35% <0.00%> (-1.57%) |
:arrow_down: |
src/layer/vulkan/pixelshuffle_vulkan.cpp | 96.35% <0.00%> (-1.57%) |
:arrow_down: |
... and 24 more |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
please investigate sanitizer error
I've tried but cannot reproduce the error of test_squeezenet
. Is there any way to run CTest with extra checks on?
cmake -DCMAKE_BUILD_TYPE=debug -DNCNN_ASAN=ON -DNCNN_BUILD_TESTS=ON -DNCNN_BUILD_TOOLS=OFF -DNCNN_BUILD_EXAMPLES=OFF ..
I've tried but cannot reproduce the error of
test_squeezenet
. Is there any way to run CTest with extra checks on?
May be you can refer to this line about this file linux-x64-cpu-gcc-san.yml :)
(First of all, sorry for late response.)
It turns out that ASAN switch of CMAKE is ignored if it is set in VS Code setting.json
file. (But why ???)
After all, I've fixed the sanitizer error. 😆
Thanks for your contribution !