
OpenJ9 performance in SPECjvm2008 is lower than GraalVM/Hotspot

Open kekw333 opened this issue 6 years ago • 3 comments

kekw333 avatar Aug 15 '19 15:08 kekw333

FYI @mstoodle @vijaysun-omr @andrewcraik

DanHeidinga avatar Aug 15 '19 15:08 DanHeidinga

Thanks - I'm going to see if we can recreate the result on one of our systems so we can take a look.

andrewcraik avatar Aug 15 '19 21:08 andrewcraik

Hi @dredwardhyde,

Thank you for raising this issue. We ran the benchmark on our Intel x86 machines, and here is what we observed: in most cases we are on par with, or ahead of, the alternative JVMs. We are somewhat behind on a few of the startup tests and on the xml tests, and we are looking into them.

There are a few things we'd like to point out that may explain why our results differ from yours.

  1. In the repro steps, a single java process ran all the benchmarks. This can be a problem because the profiling data the JVM has collected depends on what has already been run, and the JIT's optimization decisions depend on that profiling data. Run this way, the scores will change if the order of the benchmark names is altered. In our lab test harness, SPECjvm2008 is set up so that each benchmark starts a new java process, i.e.

         java -jar SPECjvm2008.jar scimark.fft.large
         java -jar SPECjvm2008.jar startup.crypto.rsa
         java -jar SPECjvm2008.jar xml.validation
         etc.

     In each test there are "warmup" runs before the "measurement" run. This way the results are more accurate and the order of the tests does not matter, which is what we want to measure: how java behaves under each workload independently.

  2. We also noticed that the repro steps set only the maximum heap size. Fixed versus variable heap size is always an important consideration when tuning the JVM. With a variable heap size, the GC adapts the heap size to keep occupancy within a certain range, which may add some overhead. In our experiments, we found that the OpenJ9 JVM performs better with a fixed heap size, i.e. with -Xms4096m -Xmx4096m. There is a detailed performance tuning cookbook for your reference: https://publib.boulder.ibm.com/httpserv/cookbook/Java-Java_Virtual_Machines_JVMs-OpenJ9_and_IBM_J9_JVMs.html

  3. There is also the CPU configuration to consider. We can't tell the specifics of your environment, but on our machine, which has multiple NUMA nodes, we had to bind the process to a fixed number of CPUs to eliminate NUMA effects. We also limited the number of threads to mimic a real-world scenario (applications are normally limited to a certain number of threads and do not run on all 128 cores).

  4. For the crypto tests, there have been quite a few important bug fixes since the build you were using was published. On my setup, going from jdk8u222-b10-openj9 to jdk8u242-b08-openj9 (the latest version), I'm seeing a 3x improvement on crypto.rsa and a 2x improvement on crypto.signverify in a 4-thread scenario. We strongly suggest using the latest OpenJ9 JDK for measurements.
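
To make points 1 to 3 concrete, here is a minimal sketch of a wrapper that starts one fresh JVM per benchmark with a fixed heap and optional CPU pinning. It is not our lab harness. The variable names (HEAP, JAR, CPUS, DRY_RUN), the helper functions, and the choice of `taskset -c` for pinning are assumptions made for illustration; only the `-Xms`/`-Xmx` values and the benchmark names come from the points above. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: one JVM per benchmark, fixed heap, optional CPU pinning.
# HEAP, JAR, CPUS, DRY_RUN and both functions are hypothetical names.

HEAP="${HEAP:-4096m}"          # used for both -Xms and -Xmx, so the heap is fixed
JAR="${JAR:-SPECjvm2008.jar}"  # path to the benchmark jar
CPUS="${CPUS:-}"               # e.g. "0-3"; empty means no pinning (assumes Linux taskset)
DRY_RUN="${DRY_RUN:-1}"        # 1 = print the commands instead of running them

# Print the command line for a single benchmark name.
bench_cmd() {
    pin=""
    if [ -n "$CPUS" ]; then
        pin="taskset -c $CPUS "
    fi
    printf '%sjava -Xms%s -Xmx%s -jar %s %s\n' "$pin" "$HEAP" "$HEAP" "$JAR" "$1"
}

# Start (or print) a separate java process for each benchmark given.
run_all() {
    for bench in "$@"; do
        cmd=$(bench_cmd "$bench")
        if [ "$DRY_RUN" = "1" ]; then
            printf '%s\n' "$cmd"
        else
            # Word splitting is intentional: cmd is a plain command line.
            $cmd
        fi
    done
}

run_all scimark.fft.large startup.crypto.rsa xml.validation
```

Setting DRY_RUN=0 runs the benchmarks for real, and something like CPUS=0-3 pins every JVM to the same cores. Because each benchmark gets its own process, the order of the names passed to run_all should no longer influence the JIT profile of any individual test.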

That's our analysis so far. Please let us know if you have any questions.

BeverlyXu avatar Feb 06 '20 17:02 BeverlyXu