
3.1.x WAL writes much more to disk than 3.0.x

Open madmac2501 opened this issue 4 years ago • 8 comments

OrientDB Version: 3.1.7

Java Version: 1.11

OS: Linux

Expected behavior

WAL write volume and OrientDB write throughput in 3.1.x are similar to or better than in 3.0.x

Actual behavior

Sending the same data to 3.0.x and 3.1.x, 3.1.x is about 5x slower. I've been debugging the code and measuring performance, and it looks like the amount of data written to the WAL in 3.1.x is much larger than in 3.0.x
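One rough way to compare WAL volume between the two versions is to sum the WAL segment sizes in the database directory at the same points of the same upsert run. The sketch below is a hypothetical helper, not part of OrientDB and not the measurement used above. It assumes the WAL segments sit directly in the database directory and end in `.wal`; adjust the filter if your storage layout differs. Segments are removed as the WAL is truncated, so the number is only a snapshot of what is on disk, not the cumulative bytes written.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not an OrientDB API.
class WalSizeProbe {

    // Sums the sizes of regular files ending in ".wal" directly inside dir.
    // Assumption: WAL segments live in the database directory with a ".wal" suffix.
    static long walBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".wal"))
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        // A segment can disappear between listing and sizing; count it as 0.
                        return 0L;
                    }
                })
                .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java WalSizeProbe.java <database-directory>");
            return;
        }
        Path dir = Paths.get(args[0]);
        System.out.println("WAL bytes on disk in " + dir + ": " + walBytes(dir));
    }
}
```

Run it against the 3.0.x and the 3.1.x database directories at matching points of the load (for example every N upserts) and compare the two series rather than single values.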

Steps to reproduce

Send the same data to a 3.0.x and a 3.1.x OrientDB server. In my use case I'm upserting a high volume of data into a graph that contains around 500k vertices and 4 million edges

madmac2501 avatar Jan 28 '21 12:01 madmac2501

Hi @madmac2501, this is a big surprise for me. I will check it, of course. Can you provide a benchmark that I can use to measure the effect of the fix?

andrii0lomakin avatar Jan 28 '21 14:01 andrii0lomakin

Sorry, but I'm testing with customer data and I cannot share it. The overall numbers are a graph that contains around 500k vertices and 4 million edges. I can share the schema with the classes and the indexes privately.

Please send me an email to [email protected] and I can share more info privately.

Thanks a lot.

madmac2501 avatar Jan 28 '21 15:01 madmac2501

Any chance you could build the latest 3.1 snapshot and try it?

andrii0lomakin avatar Jan 28 '21 17:01 andrii0lomakin

Sure, I'll test the latest version from git.

madmac2501 avatar Jan 28 '21 17:01 madmac2501

"Sorry but I'm testing with customer data and I cannot share the data." - I mean, is it possible for you to create a minimal artificial benchmark that reproduces the issue? I have benchmarked both versions. 3.1 is a bit slower, but only by a few percent, not 5 times.

andrii0lomakin avatar Jan 29 '21 06:01 andrii0lomakin

I got the same performance results with 3.1.x built from commit 342dedeab8b86f2e7ef86bccc4a681f481417484.

I'll try to build a synthetic benchmark, but it will take some time. I'll come back once I can reproduce the slowdown with this new benchmark.

Thanks a lot @laa

madmac2501 avatar Jan 29 '21 09:01 madmac2501

I've been trying to build a synthetic benchmark, but I still have not found a dataset that clearly reproduces it. Still working on it.

madmac2501 avatar Feb 23 '21 09:02 madmac2501

Hi @madmac2501, could you do the following then? Set up https://github.com/jvm-profiling-tools/async-profiler, run your benchmark on the production data with both the 3.1 and 3.0 versions, and send the results back to me.

andrii0lomakin avatar Mar 01 '21 11:03 andrii0lomakin