astl
astl copied to clipboard
Improve performance of HashBuckets using native tuing
Outline
Set Containers
Method | HashSet | LightSet | Set |
---|---|---|---|
insert | 223 ms | 207 ms | 35 ms |
has | 563 ms | 555 ms | 307 ms |
iteration | 4 ms | 32 ms | 17 ms |
erase | 114 ms | 96 ms | 36 ms |
Map Containers
Method | HashMap | LightMap | Map |
---|---|---|---|
insert | 282 ms | 229 ms | 37 ms |
has | 611 ms | 587 ms | 319 ms |
iteration | 16 ms | 36 ms | 17 ms |
erase | 123 ms | 102 ms | 35 ms |
Comparing elapsed time with my ASTL hash containers vs. AssemblyScript native hash containers, the AssemblyScript native hash containers are faster than of mine. Although my ASTL hash containers are based on the doubly-linked-list to providing much more features than the AssemblyScript, there're two things that can adapt from the AssemblyScript to enhance the performance.
Performance tuning
The first way is to using not the mod (%
) operator but the bitwise and (&
) operator in the hash buckets. If size of the hash buckets always be multiple of 2, the ^
operation can replace the %
operation and it may lead my ASTL container to be about two times faster.
Modulo and Division vs Bitwise Operations
https://mziccard.me/2015/05/08/modulo-and-division-vs-bitwise-operations/
The second way is to using the manual management system of the AssemblyScript. With the @unmanaged
tag who can deny automated memory management by the garbage collector like the AssemblyScript's implementation code, the performance would be enough enhanced like the AssemblyScript.
assemlyscript/std/Map.ts
https://github.com/AssemblyScript/assemblyscript/blob/master/std/assembly/map.ts
Set Containers
Method | HashSet | LightSet | Set |
---|---|---|---|
insert | 203 ms | 175 ms | 34 ms |
has | 567 ms | 548 ms | 322 ms |
iteration | 4 ms | 29 ms | 16 ms |
erase | 111 ms | 90 ms | 33 ms |
Map Containers
Method | HashMap | LightMap | Map |
---|---|---|---|
insert | 263 ms | 188 ms | 32 ms |
has | 636 ms | 585 ms | 321 ms |
iteration | 6 ms | 28 ms | 15 ms |
erase | 122 ms | 98 ms | 32 ms |
After adapting the first method using &
operator instead of the %
, the performance has been improved a little bit.
I'm just wondering how Map#has
and Set#has
in 10 times slower than Map#set
and Map#delete
due to all this method use the same lookup method and Map#set / Map#delete
also do more extra work? Or has
benchmark use much larger iterations?
Upd: it seems yes. Map#has
executed in 10 times frequent than other benches
Btw "iteration" benchmark is not the same as for HashMap
/ LightMap
:
{
const keys: string[] = native.keys();
for (let i: i32 = 0; i < keys.length; ++i)
native.get(keys[i]);
}
It should be:
{
const keys: string[] = native.keys();
for (let i: i32 = 0, len = keys.length; i < len; ++i) {
keys[i];
}
}
or
{
const values = native.values();
for (let i: i32 = 0, len = values.length; i < len; ++i) {
values[i];
}
}
@MaxGraey After editing benchmark code following of yours, elapsed time has been changed like below:
Set Containers
Method | HashSet | LightSet | Set |
---|---|---|---|
insert | 190 ms | 171 ms | 36 ms |
has | 556 ms | 558 ms | 327 ms |
iteration | 4 ms | 27 ms | 5 ms |
erase | 108 ms | 89 ms | 34 ms |
Map Containers
Method | HashMap | LightMap | Map |
---|---|---|---|
insert | 226 ms | 181 ms | 36 ms |
has | 613 ms | 579 ms | 334 ms |
iteration | 12 ms | 32 ms | 6 ms |
erase | 117 ms | 96 ms | 32 ms |
in rehash this part:
const log: f64 = Math.log2(<f64>length);
if (log !== Math.floor(log))
length = <usize>(Math.pow(2, Math.ceil(log)));
could be improved as:
length = 1 << (32 - clz(length - 1)); // align length as 2 ** ceil(log2(len))
or (if length 64-bit):
length = (<u64>1) << (64 - clz<u64>(length - 1));