Performance of deallocation code paths - tree bitset

Open r1viollet opened this issue 2 years ago • 0 comments

What does this PR do?

Implement a radix tree strategy using bitsets
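As a rough illustration of the idea (not the code in this PR), a radix tree with bitset leaves can be sketched as below. The class name `AddressBitsetTree`, the 14/14/16 bit split, and the assumption of 16-byte aligned addresses that fit in 48 bits are all hypothetical choices made for the example. The sketch is single-threaded; a real tracker would likely need atomics or locking.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical sketch: records which addresses are currently live.
// Assumes 16-byte aligned addresses that fit in 48 bits; higher bits
// would alias because the top index is masked.
class AddressBitsetTree {
public:
  AddressBitsetTree() : _root(kFanout) {}

  // Returns true if the address was not tracked before.
  bool add(uintptr_t addr) {
    const Index idx = split(addr);
    auto &mid = _root[idx.top];
    if (!mid) mid = std::make_unique<Mid>();
    auto &leaf = (*mid)[idx.mid];
    if (!leaf) leaf = std::make_unique<Leaf>(); // value-initialized to zero
    uint64_t &word = (*leaf)[idx.word];
    if (word & idx.mask) return false; // already tracked
    word |= idx.mask;
    ++_count;
    return true;
  }

  // Returns true if the address was tracked and has been cleared.
  bool remove(uintptr_t addr) {
    const Index idx = split(addr);
    const auto &mid = _root[idx.top];
    if (!mid) return false;
    const auto &leaf = (*mid)[idx.mid];
    if (!leaf) return false;
    uint64_t &word = (*leaf)[idx.word];
    if (!(word & idx.mask)) return false; // never tracked
    word &= ~idx.mask;
    --_count;
    return true;
  }

  size_t count() const { return _count; }

private:
  static constexpr unsigned kAlignBits = 4;  // drop 16-byte alignment bits
  static constexpr unsigned kLeafBits = 16;  // 65536 bits (8 KiB) per leaf
  static constexpr unsigned kLevelBits = 14; // 16384 children per node
  static constexpr size_t kFanout = size_t{1} << kLevelBits;

  using Leaf = std::array<uint64_t, (size_t{1} << kLeafBits) / 64>;
  using Mid = std::array<std::unique_ptr<Leaf>, kFanout>;

  struct Index {
    size_t top;
    size_t mid;
    size_t word;
    uint64_t mask;
  };

  // Splits an address into per-level indices plus a bit within a leaf word.
  static Index split(uintptr_t addr) {
    const uint64_t key = static_cast<uint64_t>(addr) >> kAlignBits;
    const uint64_t bit = key & ((uint64_t{1} << kLeafBits) - 1);
    Index idx;
    idx.top = (key >> (kLeafBits + kLevelBits)) & (kFanout - 1);
    idx.mid = (key >> kLeafBits) & (kFanout - 1);
    idx.word = bit >> 6;
    idx.mask = uint64_t{1} << (bit & 63);
    return idx;
  }

  std::vector<std::unique_ptr<Mid>> _root; // interior nodes allocated lazily
  size_t _count = 0;
};
```

In this sketch each aligned address maps to exactly one bit, so two distinct addresses never share a slot, and neighbouring addresses land in the same leaf. The cost is memory for the interior nodes, which are only allocated when first touched.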

Motivation

Ensure we account for addresses accurately, and evaluate @richardstartin's idea.

Results

Performance numbers look great. With the reader thread running, some samples cost extra CPU time:

Benchmark                                                Time             CPU   Iterations
BM_ShortLived_NoTracking/process_time/real_time     866340 ns      3245594 ns          758
BM_ShortLived_Tracking/process_time/real_time       999061 ns      4036038 ns          613
BM_LongLived_NoTracking/process_time                340981 ns       679761 ns         1074
BM_LongLived_Tracking/process_time                  376116 ns      1359466 ns          548

Without the reader thread:

Benchmark                                                Time             CPU   Iterations
BM_ShortLived_NoTracking/process_time/real_time     471468 ns       987956 ns         1366
BM_ShortLived_Tracking/process_time/real_time       502947 ns      1171483 ns         1000
BM_LongLived_NoTracking/process_time                338881 ns       185030 ns         4789
BM_LongLived_Tracking/process_time                  330957 ns       281179 ns         2535

Sep 18 '23 10:09 r1viollet