orientdb
3.1.x WAL writes much more to disk than 3.0.x
OrientDB Version: 3.1.7
Java Version: 1.11
OS: Linux
Expected behavior
WAL writes and OrientDB write throughput are similar or better in 3.1.x than in 3.0.x
Actual behavior
Sending the same data to 3.0.x and 3.1.x results in a 5x slowdown on 3.1.x. I've been debugging the code and measuring performance, and it looks like the amount of data written to the WAL in 3.1.x is much larger than in 3.0.x
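One rough way to compare WAL volume between the two versions is to sample the total size of the WAL segment files in the database directory while the same load runs. The sketch below assumes that WAL segments are the files ending in `.wal`; the class name `WalSize` and the method `walBytes` are hypothetical. Because the WAL is truncated after checkpoints, a sampled size is only a proxy for the bytes actually written, so it is best sampled repeatedly during the run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class WalSize {
    // Sums the sizes of all regular files ending in ".wal" under dir (recursively).
    // Assumption: WAL segments use the ".wal" extension.
    static long walBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".wal"))
                .mapToLong(p -> p.toFile().length())
                .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the database directory as the first argument; defaults to the current directory.
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println(walBytes(dir) + " bytes of WAL under " + dir);
    }
}
```

Running it against the 3.0.x and 3.1.x database directories at the same points in the load gives a first-order comparison.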
Steps to reproduce
Send the same data to a 3.0.x and a 3.1.x OrientDB server. In my use case I'm upserting a high volume of data into a graph that contains around 500k vertices and 4 million edges
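The workload shape described above can be approximated without customer data by a seeded generator that emits the same edge sequence on every run, so both servers receive identical input. This is only a sketch under stated assumptions: edges are drawn uniformly at random, which may not match the real data's distribution, and the names `SyntheticGraph`, `EdgeSink`, and `generate` are hypothetical. The upsert call against each server would go inside the sink.

```java
import java.util.Random;

public class SyntheticGraph {
    // Callback that receives one directed edge (from -> to); the upsert would go here.
    interface EdgeSink { void accept(int from, int to); }

    // Emits edgeCount edges between vertex ids in [0, vertexCount).
    // The same seed always produces the same sequence.
    static void generate(int vertexCount, long edgeCount, long seed, EdgeSink sink) {
        Random rnd = new Random(seed);
        for (long i = 0; i < edgeCount; i++) {
            int from = rnd.nextInt(vertexCount);
            int to = rnd.nextInt(vertexCount);
            sink.accept(from, to);
        }
    }

    public static void main(String[] args) {
        int vertices = 500_000;   // roughly the size reported in this issue
        long edges = 4_000_000L;
        long[] count = {0};
        long start = System.nanoTime();
        // Replace the counting lambda with the upsert against the server under test.
        generate(vertices, edges, 42L, (from, to) -> count[0]++);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(count[0] + " edges generated in " + ms + " ms");
    }
}
```

Timing the same seeded run against 3.0.x and 3.1.x gives a ratio that can be compared with the reported 5x figure.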
Hi @madmac2501, this is a big surprise to me. I will check this, of course. Can you provide a benchmark that I can use to measure the effectiveness of the fix?
Sorry, but I'm testing with customer data and I cannot share it. The overall numbers are a graph that contains around 500k vertices and 4 million edges. I can share the schema with the classes and the indexes privately.
Please send me an email to [email protected] and I can share more info privately.
Thanks a lot.
Any chance you could build the latest 3.1 snapshot and try it?
Sure, I'll test the latest version from git.
"Sorry but I'm testing with customer data and I cannot share the data." - I mean, is it possible for you to create a minimal artificial benchmark that reproduces the issue? I have benchmarked both versions: 3.1 is a bit slower, but only by a few percent, not by a factor of 5.
I got the same performance results with 3.1.x built from commit 342dedeab8b86f2e7ef86bccc4a681f481417484.
I'll try to build a synthetic benchmark, but it will take some time. I'll come back when I can reproduce the performance problem with this new benchmark.
Thanks a lot @laa
I've been trying to build a synthetic benchmark, but I still have not found a dataset that clearly reproduces the problem. Still working on it.
Hi @madmac2501, could you do the following then? Please set up https://github.com/jvm-profiling-tools/async-profiler, run your benchmark on production data against both 3.1 and 3.0, and send the results back to me.