Massive insertion (multiple threads/buckets + non unique index) leads to timeout/excessive retries

Open lvca opened this issue 6 months ago • 3 comments

From Discord Channel (user Kolja):

I wrote a small stress test (in Kotlin though) that reproduces the problem 100% of the time on my machine: https://gist.github.com/koljakube/73062a48afe29ea6fff8d6a55b2a3d2d

lvca avatar Jun 14 '25 06:06 lvca

I was able to run the test twice. The first run went out of memory, so I assigned 16GB; the second run got up to 75K of the 100K cycles before hitting lock timeouts, and therefore retries.
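For readers unfamiliar with why lock timeouts turn into retries, here is a minimal sketch of a bounded retry loop in plain Java. It does not use the ArcadeDB API: `RetrySketch`, `ConflictException` and `withRetries` are hypothetical names invented for illustration, and the linear backoff is an assumption, not what the engine does.

```java
import java.util.function.Supplier;

public class RetrySketch {
    /** Hypothetical stand-in for a lock-timeout or concurrent-modification failure. */
    public static class ConflictException extends RuntimeException {
        public ConflictException(String message) {
            super(message);
        }
    }

    /** Runs the work up to maxAttempts times, retrying only on ConflictException. */
    public static <T> T withRetries(int maxAttempts, Supplier<T> work) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (ConflictException e) {
                last = e;
                // Back off a little longer after each failure, but not after the last one.
                if (attempt < maxAttempts) {
                    pause(attempt * 5L);
                }
            }
        }
        throw last; // every attempt conflicted
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated workload: the first two attempts hit a conflict, the third succeeds.
        String result = withRetries(5, () -> {
            if (++calls[0] < 3) {
                throw new ConflictException("simulated lock timeout");
            }
            return "committed";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

When many threads keep colliding on the same structure, each batch can burn through several attempts like this, which is consistent with the slowdown seen late in the run.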

I'm running the test again without the encryption and with this line commented out:

   //System.setProperty("arcadedb.profile", "high-performance");

In your case that profile makes things worse, because you're not using the async API and high-performance keeps your cores constantly at 100% for no reason (I'll have to check later whether we can enable it only when async is used). I'll also run it with YourKit attached to see the amount of conflict between threads. You selected ThreadBucketSelectionStrategy, so it should truly run in parallel.
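To illustrate why a per-thread bucket choice should allow parallel inserts, here is a rough sketch in plain Java. It assumes, as the name suggests, that the strategy routes each inserting thread to its own bucket; `ThreadBucketSketch`, `bucketFor` and the modulo rule are hypothetical and are not ArcadeDB's implementation. A worker index stands in for the thread id so that the mapping is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadBucketSketch {
    /** Maps a thread (or worker) id to one of bucketCount buckets. Hypothetical rule. */
    public static int bucketFor(long threadId, int bucketCount) {
        return (int) Math.floorMod(threadId, (long) bucketCount);
    }

    public static void main(String[] args) throws InterruptedException {
        final int bucketCount = 4;
        final int recordsPerThread = 1_000;

        // One plain list per bucket; no locking is needed because each worker
        // only ever touches the bucket it is mapped to.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int b = 0; b < bucketCount; b++) {
            buckets.add(new ArrayList<>());
        }

        Thread[] workers = new Thread[bucketCount];
        for (int w = 0; w < bucketCount; w++) {
            final int workerId = w;
            workers[w] = new Thread(() -> {
                List<Integer> own = buckets.get(bucketFor(workerId, bucketCount));
                for (int i = 0; i < recordsPerThread; i++) {
                    own.add(i);
                }
            });
            workers[w].start();
        }
        for (Thread t : workers) {
            t.join(); // join makes each worker's writes visible to the main thread
        }

        for (int b = 0; b < bucketCount; b++) {
            System.out.println("bucket " + b + " holds " + buckets.get(b).size() + " records");
        }
    }
}
```

The buckets themselves never contend in this sketch. Any structure shared across buckets, such as an index on the type, can still become a point of conflict between threads.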

lvca avatar Jun 14 '25 06:06 lvca

Found an issue with keeping a page in RAM when the total number of records in the page exceeds the maximum records per page. This caused an excessive slowdown during massive insertion while looking for new space to allocate in pages.

lvca avatar Jun 14 '25 14:06 lvca

After running some tests, I can confirm the issue is in the index compaction. Still working on it.

lvca avatar Jun 16 '25 00:06 lvca

This issue has been fixed. The test now runs without problems, even when forcing compaction at every page.

...
99998/100000 batches completed in 1ms (total 89s) cacheRAM=50.19MB/150.00MB
99999/100000 batches completed in 0ms (total 89s) cacheRAM=50.19MB/150.00MB

Inserting took 89904 ms
There are now 10000000 entries in the database.
[identifier, sum]
a: 3023682806206853
[identifier, sum]
b: 3025820864196671
Query took 17094 ms

Manual aggregation result: {a=3023682806206853, b=3025820864196671}
Manual aggregation took 21754 ms

Process finished with exit code 0

I converted your stress test to Java: https://github.com/ArcadeData/arcadedb/blob/356a9b12d89449d8c706637a0c06ae08c3ec373c/engine/src/test/java/performance/PerformanceInsertMTStressTest.java#L26

lvca avatar Jul 01 '25 12:07 lvca