jerryscript

literal storage: 4x performance improvement using a hashmap

Open ronanj opened this issue 1 year ago • 8 comments

This PR improves literal storage performance.

Currently, literals are stored in a linked list. The drawback is that the time it takes to add a new entry grows in proportion to the number of existing entries: before adding an entry, the storage engine must first iterate over all existing entries to determine whether the "new" entry duplicates an existing one or is genuinely new.
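As a rough illustration of the cost described above, here is a minimal sketch of a linked-list store with a duplicate check. The names `lit_node_t` and `lit_list_intern`, and the `comparisons_p` counter, are hypothetical and are not JerryScript's actual literal storage types or functions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for illustration; not JerryScript's real structure. */
typedef struct lit_node
{
  struct lit_node *next_p; /* next literal in the list */
  char *chars_p; /* owned, zero-terminated copy of the literal */
} lit_node_t;

/*
 * Return the existing node for str_p, or prepend a new one.
 * comparisons_p counts the string comparisons, to make the O(n) cost visible.
 * Returns NULL if allocation fails.
 */
lit_node_t *
lit_list_intern (lit_node_t **head_pp, const char *str_p, size_t *comparisons_p)
{
  /* Duplicate check: walk every existing entry. */
  for (lit_node_t *it_p = *head_pp; it_p != NULL; it_p = it_p->next_p)
  {
    (*comparisons_p)++;

    if (strcmp (it_p->chars_p, str_p) == 0)
    {
      return it_p;
    }
  }

  /* Not found: allocate and link a new entry at the head. */
  lit_node_t *node_p = malloc (sizeof (lit_node_t));

  if (node_p == NULL)
  {
    return NULL;
  }

  size_t size = strlen (str_p) + 1;
  node_p->chars_p = malloc (size);

  if (node_p->chars_p == NULL)
  {
    free (node_p);
    return NULL;
  }

  memcpy (node_p->chars_p, str_p, size);
  node_p->next_p = *head_pp;
  *head_pp = node_p;

  return node_p;
}
```

With 1000 entries already in the list, interning one new literal performs 1000 comparisons before the new node is linked in. Cleanup of the list is omitted to keep the sketch short.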

When running on a 200 MHz CPU with the heap stored in PSRAM, if 1000 entries are already in the literal storage linked list, it takes almost 1 millisecond to add a new entry (i.e. 1 millisecond to iterate over all 1000 existing entries). This makes loading snapshots very slow, especially when many background modules have already been loaded (each background module adds its own literals). Empirically, we measured that jerry_exec_snapshot takes 200 milliseconds to load and execute a snapshot containing about 200 literals when 1000 literals are already in the list before loading (measured on a 200 MHz CPU with the heap in PSRAM). This is consistent with roughly 200 insertions at about 1 millisecond each.

The change introduces an additional hashmap to speed up the lookup of existing literals. This way, it is no longer necessary to iterate through the entire literal linked list in O(n); an average-case O(1) lookup can be used instead. Empirically, the application that took 200 milliseconds to load now takes 50 milliseconds (measured from jerry_exec_snapshot), a 4x performance improvement.
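The idea can be sketched as follows. This is not the code from this PR: the names `lit_entry_t`, `lit_storage_t`, `lit_hash` and `lit_storage_intern`, the fixed bucket count, the separate chaining, and the FNV-1a hash are all assumptions chosen for illustration. The linked list of all literals is kept, and the hashmap only serves as an index for the duplicate check.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LIT_MAP_BUCKETS 256u /* assumed fixed size, must be a power of two */

/* Hypothetical entry type: member of the full list and of one bucket chain. */
typedef struct lit_entry
{
  struct lit_entry *list_next_p; /* linked list of all literals (kept as before) */
  struct lit_entry *bucket_next_p; /* chain of entries sharing a hash bucket */
  char *chars_p; /* owned, zero-terminated copy of the literal */
} lit_entry_t;

typedef struct
{
  lit_entry_t *list_head_p;
  lit_entry_t *buckets[LIT_MAP_BUCKETS];
} lit_storage_t;

/* 32-bit FNV-1a, used here only as an example hash function. */
uint32_t
lit_hash (const char *str_p)
{
  uint32_t hash = 2166136261u;

  for (; *str_p != '\0'; str_p++)
  {
    hash ^= (uint8_t) *str_p;
    hash *= 16777619u;
  }

  return hash;
}

/*
 * Return the existing entry for str_p, or add a new one.
 * Only the entries in one bucket are compared, not the whole list.
 * Returns NULL if allocation fails.
 */
lit_entry_t *
lit_storage_intern (lit_storage_t *storage_p, const char *str_p, size_t *comparisons_p)
{
  lit_entry_t **bucket_pp = &storage_p->buckets[lit_hash (str_p) & (LIT_MAP_BUCKETS - 1u)];

  for (lit_entry_t *it_p = *bucket_pp; it_p != NULL; it_p = it_p->bucket_next_p)
  {
    (*comparisons_p)++;

    if (strcmp (it_p->chars_p, str_p) == 0)
    {
      return it_p;
    }
  }

  lit_entry_t *entry_p = malloc (sizeof (lit_entry_t));

  if (entry_p == NULL)
  {
    return NULL;
  }

  size_t size = strlen (str_p) + 1;
  entry_p->chars_p = malloc (size);

  if (entry_p->chars_p == NULL)
  {
    free (entry_p);
    return NULL;
  }

  memcpy (entry_p->chars_p, str_p, size);

  /* Link into both the full list and the bucket chain. */
  entry_p->list_next_p = storage_p->list_head_p;
  storage_p->list_head_p = entry_p;
  entry_p->bucket_next_p = *bucket_pp;
  *bucket_pp = entry_p;

  return entry_p;
}
```

A zero-initialized `lit_storage_t` is an empty store. Lookup cost depends on the length of one bucket chain rather than on the total number of literals, so it stays close to constant as long as the hash spreads entries evenly and the bucket count is adequate for the number of literals; a real implementation would likely also need resizing and cleanup, which are omitted here.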

This feature can be enabled using the flag JERRY_LIT_HASHMAP. The test configuration has been updated to include this feature flag. Unit and functional tests pass. Testing has also been done on the OpenHarmony JerryScript version.

JerryScript-DCO-1.0-Signed-off-by: Ronan Jezequel [email protected]

ronanj avatar Mar 20 '24 07:03 ronanj

ok, let me rewrite the hashmap library, and at the same time use the standard comment syntax from jerryscript.

ronanj avatar Mar 22 '24 06:03 ronanj

New commit added, with the hashmap completely rewritten from scratch and with better performance.

ronanj avatar Apr 29 '24 08:04 ronanj

Looks like a nice patch! We will check it further soon.

zherczeg avatar Apr 30 '24 10:04 zherczeg

It looks like tests/jerry/arithmetics.js fails. Do you know why?

zherczeg avatar May 03 '24 09:05 zherczeg

@ronanj I apologize for the delay. The CI issues for OSX have been resolved; please squash and rebase your changes, and add the DCO to the commit message.

matetokodi avatar Jun 03 '24 11:06 matetokodi

Close and reopen to re-trigger the CI

LaszloLango avatar Nov 13 '24 11:11 LaszloLango

@ronanj could you please rebase the PR? It will make the CI green (probably).

LaszloLango avatar Nov 13 '24 19:11 LaszloLango

I have an idea about this. Currently, ecma_string_t is very complicated and has many issues. I'd like to take the Atom idea from QuickJS to implement strings in jerryscript. Atom is hash-table based, so the performance issue addressed by this MR would also be resolved.

lygstate avatar Nov 28 '24 12:11 lygstate