rv32emu icon indicating copy to clipboard operation
rv32emu copied to clipboard

Improve red-black tree implementation in map

Open jserv opened this issue 3 years ago • 3 comments

map.[ch] is the C Implementation for C++ std::map using red-black tree. It works well, but there is room for improvements. We can take some tricks used in Red-black Trees in Linux for potential speedup. In addition, current rbtree can benefit from the introduction of indirect pointers.

jserv avatar Aug 02 '22 21:08 jserv

commit 7fb4ff754e48f09e21f0e21137fd28ce59361de5 integrates color into lowest bit of parent pointer.

jserv avatar Aug 22 '22 07:08 jserv

The expected improvements can be the followings:

  • Combining the color bit with the parent node pointer This is done by alignment, since map_node_t is defined to align with unsigned long (8 Bytes). The least three bits would be left unused, so we can integrate the color bit to the LSB.

  • Indirect pointers Indirect pointers can be used to decrease the number of branches to increase better performance in pipelines.

  • Embed map_node_t into structures for better cache locality Since the pointer we use to manipulate are embedded in the structure variable we are using, there would be one less layer of indirection and resulting to better cache locality.

steven1lung avatar Sep 26 '22 11:09 steven1lung

  • Indirect pointers Indirect pointers can be used to decrease the number of branches to increase better performance in pipelines.

commit ff46dad7571d3d62852ca4b8722804c8324835bb and commit 09f4e653627ebb2bec188fff504e95b91a42fac9 lower the number of branches in red-black tree.

jserv avatar Sep 27 '22 04:09 jserv

rbi is non-recursive, and has very little memory overhead (e.g. no parent pointer within the red-black tree node). Its features:

  • 3 tree implementations: unordered; ordered but unbalanced; balanced
  • memory allocation is done outside the library
  • lookup, iteration and insertion is without recursion and with constant memory usage
  • compact memory representation: parent pointers are not used
  • compact memory representation: option to store the red-black bit of the balanced implementation in the least significant bit of the (right) pointer
  • operations other than lookup, insertion and iteration are not implemented

We might learn from rbi.

jserv avatar Nov 03 '22 19:11 jserv

Below is my proposal for the first change.

  • store the red-black bit of the balanced implementation in the least significant bit of the (right) pointer
  • Removing parent pointes and rewrite corresponding functions

I will also write testing to ensure the correctness of all functions.

What do you think?

TheCloudlet avatar Feb 21 '23 15:02 TheCloudlet

The following result is recorded by the simple benchmark I created. By inserting, searching, and deleting 10e6 random and not duplicated integers.

action type map rb
insert (sec) 1.123425 0.446589
search (sec) 1.000282 0.406894
delete (sec) 0.173655 0.44963

TheCloudlet avatar Mar 17 '23 13:03 TheCloudlet

The following result is recorded by the simple benchmark I created. By inserting, searching, and deleting 10e6 random and not duplicated integers.

You shall provide the corresponding benchmark programs for reference purpose.

In particular, rbi.c lacks of the ability to remove. rb.h is exactly used for comparison.

jserv avatar Mar 17 '23 14:03 jserv

You shall provide the corresponding benchmark programs for reference purpose.

Repository: rbtree_bench

  • [x] 1. mention the machine specifications which you performed the experiments on.
  • [x] 2. provide a top-level CMakeLists.txt that describes the way to build both map and rb;
  • [x] 3. pull Google Tests automatically. See Quickstart: Building with CMake; and
  • [x] 4. rename the repository as rbtree-bench. There is no separation between rb and tree.

TheCloudlet avatar Mar 17 '23 15:03 TheCloudlet

  • [x] 2. provide a top-level CMakeLists.txt that describes the way to build both map and rb;
  • [x] 3. pull Google Tests automatically. See Quickstart: Building with CMake; and

The following changes are required.

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1,4 +1,4 @@
-cmake_minimum_required (VERSION 2.8.11)
+cmake_minimum_required (VERSION 3.14)
 
 project (rbtree_bench)
 
@@ -10,6 +10,7 @@ FetchContent_Declare(
   googletest
   URL https://github.com/google/googletest/archive/03597a01ee50ed33e9dfd640b249b4be3799d395.zip
 )
+FetchContent_MakeAvailable(googletest)
 
 add_subdirectory (map)
 add_subdirectory (rb)
diff --git a/map/CMakeLists.txt b/map/CMakeLists.txt
index 2701980..a67cbb5 100644
--- a/map/CMakeLists.txt
+++ b/map/CMakeLists.txt
@@ -1,4 +1,3 @@
-cmake_minimum_required(VERSION 3.5)
 project(map)
 
 enable_testing()
diff --git a/rb/CMakeLists.txt b/rb/CMakeLists.txt
index 06ab4d7..8bdf88d 100644
--- a/rb/CMakeLists.txt
+++ b/rb/CMakeLists.txt
@@ -1,10 +1,7 @@
-cmake_minimum_required(VERSION 3.5)
 project(rbi)
 
 enable_testing()
 
-find_package(GTest CONFIG REQUIRED)
-
 set (CMAKE_CXX_STANDARD 11)
 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
 set(GCC_FLAGS "-s -Os -W -Wall -Werror")

jserv avatar Mar 17 '23 16:03 jserv

The following result is recorded by the simple benchmark I created. By inserting, searching, and deleting 10e6 random and not duplicated integers.

Tested on eMag 8180, 64-bit 32-core Arm server.

action type map rb
insert (sec) 2.341151 1.11663
search (sec) 1.723217 0.851424
remove (sec) 0.352333 1.13698

jserv avatar Mar 17 '23 16:03 jserv

Action items for rbtree_bench:

  • [x] Unify the interface for accessing different rbtree implementations. You may wrap rb.h to fit the functions exposed by rv32emu's map.h.
  • [x] Shrink rb.h by replacing it with much smaller one I provided. --> Done by commit
  • [x] Utilize minibench for benchmarking -- simpler and more informative. --> Done by commit1 and commit2
  • [x] Check existing rbtree benchmark programs and see if we can learn from the test scenarios. e.g., rbt_test.c from bind9 project and rb-bench.
  • [x] Not only average time but also worst-case behavior should be considered.
  • [x] Measure memory consumptions -- check Massif from Valgrind.

TheCloudlet avatar Mar 20 '23 13:03 TheCloudlet

Close via commit 434c466

jserv avatar Jun 14 '23 02:06 jserv