corretto-11

Garbage Collector G1 fails with EXCEPTION_ACCESS_VIOLATION (0xc0000005)

Open 9sokolov opened this issue 4 years ago • 6 comments

Garbage Collector G1 fails with EXCEPTION_ACCESS_VIOLATION (0xc0000005)

How to reproduce: The crash is reproducible when creating a large number of small, short-lived objects. We hit it during syntax analysis.

Additional details: The crash reproduces very often on Windows 10 and much more rarely on Linux. With OpenJDK 11 GA it is hardly reproducible. Also, with G1 GC the application required 10-15% more RAM than with ParallelGC.

System details:
openjdk version "11.0.7" 2020-04-14 LTS
OpenJDK Runtime Environment Corretto-11.0.7.10.1 (build 11.0.7+10-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.7.10.1 (build 11.0.7+10-LTS, mixed mode)

NOTE: ParallelGC works fine; if you reduce the available memory, it fails with an OOM exception. With G1, however, the JVM always fails with EXCEPTION_ACCESS_VIOLATION.

Suggestion: Please set ParallelGC as the default GC in Corretto 11.

Please see the attached error log.

A fatal error has been detected by the Java Runtime Environment:

SIGSEGV (0xb) at pc=0x00007ff994bad16f, pid=7402, tid=7403

JRE version: OpenJDK Runtime Environment (11.0.7+10) (build 11.0.7+10-LTS)
Java VM: OpenJDK 64-Bit Server VM (11.0.7+10-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
Problematic frame:
J 19716 c2 java.util.HashMap.get(Ljava/lang/Object;)Ljava/lang/Object; java.base@11.0.7 (23 bytes) @ 0x00007ff994bad16f [0x00007ff994baca00+0x000000000000076f]

hs_err_pid1488.log

9sokolov avatar May 06 '20 21:05 9sokolov

Thank you for reporting this issue.

Could you provide sample code that we could use to reproduce this issue?

cliveverghese avatar May 07 '20 00:05 cliveverghese

Looking at the hs_err file, I found the following:

The exception happens at PC 0x0000021e29ea8420:

#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x0000021e29ea8420, pid=1488, tid=9492

while reading address 0x00000000200ffea8:

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x00000000200ffea8

Looking at the "Register to memory mapping" section, we can see that RBP holds a compressed pointer which points to a valid class in the class space:

RBP=537919132=0x200ffe9c is a compressed pointer to class: 0x00000001007ff4e0
org.antlr.v4.runtime.atn.ATNConfig {0x00000001007ff4e8}

and we see that faulting address 0x00000000200ffea8 corresponds to "RBP + 0xc"

When disassembling the instructions at 0x0000021e29ea8420, we get:

0x0000021e29ea8420: 44 8b 45 0c                      mov    r8d,DWORD PTR [rbp+0xc]

which confirms that we are loading from "RBP + 0xc".
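The address arithmetic above can be double-checked with a few lines of Java. This is only a sketch: the three constants are copied from the hs_err excerpts quoted in this comment, while the decoding parameters (zero base, shift of 3) are an assumption, chosen because they reproduce the class address printed by the log. The class and method names are hypothetical.

```java
public class CompressedPointerCheck {
    // Values copied from the hs_err log excerpts quoted above.
    static final long RBP = 0x200ffe9cL;            // register content at the crash
    static final long FAULT_ADDRESS = 0x200ffea8L;  // "reading address" from siginfo
    static final long CLASS_ADDRESS = 0x1007ff4e0L; // class address from the register mapping

    // Generic narrow-pointer decoding: base + (narrow << shift).
    // The concrete base and shift used below are assumptions, not values read from the log.
    static long decode(long narrow, long base, int shift) {
        return base + (narrow << shift);
    }

    public static void main(String[] args) {
        // The faulting address is the raw (undecoded) register value plus the 0xc displacement.
        System.out.println(Long.toHexString(RBP + 0xc));           // prints 200ffea8
        // Assuming base 0 and shift 3, decoding RBP yields the class address from the log.
        System.out.println(Long.toHexString(decode(RBP, 0L, 3)));  // prints 1007ff4e0
    }
}
```

Both results match the log: the load used the narrow value directly as an address, while the decoded value is what the register mapping reports as the class.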

Further down in the assembly we find:

0x0000021e29ea84c3: 44 8b 4d 18                      mov    r9d,DWORD PTR [rbp+0x18]
0x0000021e29ea84c7: 41 85 02                         test   DWORD PTR [r10],eax
0x0000021e29ea84ca: 45 85 c9                         test   r9d,r9d
0x0000021e29ea84cd: 0f 84 a3 eb ff ff                je     0x21e29ea7076
0x0000021e29ea84d3: 49 8b e9                         mov    rbp,r9
0x0000021e29ea84d6: e9 45 ff ff ff                   jmp    0x21e29ea8420

What happens here is that we load from RBP+0x18, and if the result is not zero, store it back in RBP. After that we jump back to 0x0000021e29ea8420 where we want to load from "RBP + 0xc" but crash because RBP is a compressed class pointer.

So the problem is that we try loading from a compressed class pointer in C2-compiled code without "decompressing" the pointer. Why this happens is not clear to me yet.
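To make the control flow easier to follow, here is a toy model of the loop in Java. It is a deliberate simplification and not the actual compiled method: memory is modelled as a map from address to 32-bit value, the instructions between the two quoted fragments are ignored, and the starting object address 0x1000 and its field values are made up for illustration. Only the offsets (0xc, 0x18) and the value 0x200ffe9c come from the log.

```java
import java.util.HashMap;
import java.util.Map;

public class LoopSketch {
    static final long LEFT_LOOP = -1L;

    // Returns the first unmapped address the loop tries to read,
    // or LEFT_LOOP if the loop exits through the "je" branch.
    static long run(Map<Long, Integer> memory, long rbp) {
        while (true) {
            Integer r8 = memory.get(rbp + 0xc);    // mov r8d,DWORD PTR [rbp+0xc]
            if (r8 == null) return rbp + 0xc;      // unmapped: access violation here
            Integer r9 = memory.get(rbp + 0x18);   // mov r9d,DWORD PTR [rbp+0x18]
            if (r9 == null) return rbp + 0x18;
            if (r9 == 0) return LEFT_LOOP;         // test r9d,r9d ; je
            rbp = Integer.toUnsignedLong(r9);      // mov rbp,r9 (no decoding step)
        }
    }

    public static void main(String[] args) {
        Map<Long, Integer> memory = new HashMap<>();
        // Hypothetical object at 0x1000 whose field at +0x18 holds the narrow value from the log.
        memory.put(0x1000L + 0xc, 1);
        memory.put(0x1000L + 0x18, 0x200ffe9c);
        System.out.println(Long.toHexString(run(memory, 0x1000L))); // prints 200ffea8
    }
}
```

In this model the second iteration reads from 0x200ffe9c + 0xc, which is exactly the faulting address reported in the log.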

@9sokolov could you ideally provide the My.jar file you've used to reproduce the problem? This would greatly help us in finding the root cause of the problem. If that's not possible, could you please try to reproduce the problem with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining" and send us that output together with the exact version of org/antlr/v4/runtime/atn/ATNConfig used by your application. But as mentioned before, getting My.jar would be really preferred :)

simonis avatar May 08 '20 18:05 simonis

We've provided all the JARs separately. Hope that helps.

9sokolov avatar May 08 '20 19:05 9sokolov

Is there any progress on fixing this issue?

I have observed it while verifying if https://github.com/aws/aws-iot-device-sdk-java-v2/issues/134 has already been fixed.

volphy avatar May 07 '21 20:05 volphy

Is there a known workaround (i.e. correct configuration for ParallelGC)?

I have tried enforcing ParallelGC in the Java invocation generated by the IntelliJ run configuration:

D:\Tools\amazon-corretto-11.0.11.9.1-windows-x64-jdk\jdk11.0.11_9\bin\java.exe -XX:+UseParallelGC -javaagent:D:\Tools\ideaIU-211.6556.6.win\lib\idea_rt.jar=57523:D:\Tools\ideaIU-211.6556.6.win\bin -Dfile.encoding=UTF-8 -classpath D:\git\krzwil\mqtt-check-java-v2\target\classes;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\iotdevicesdk\aws-iot-device-sdk\1.3.1\aws-iot-device-sdk-1.3.1.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\crt\aws-crt\0.12.4\aws-crt-0.12.4.jar;C:\Users\krzwil\.m2\repository\com\google\code\gson\gson\2.8.5\gson-2.8.5.jar;C:\Users\krzwil\.m2\repository\com\fasterxml\jackson\core\jackson-core\2.12.0\jackson-core-2.12.0.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\auth\2.16.58\auth-2.16.58.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\annotations\2.16.58\annotations-2.16.58.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\utils\2.16.58\utils-2.16.58.jar;C:\Users\krzwil\.m2\repository\org\reactivestreams\reactive-streams\1.0.3\reactive-streams-1.0.3.jar;C:\Users\krzwil\.m2\repository\org\slf4j\slf4j-api\1.7.30\slf4j-api-1.7.30.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\sdk-core\2.16.58\sdk-core-2.16.58.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\metrics-spi\2.16.58\metrics-spi-2.16.58.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\regions\2.16.58\regions-2.16.58.jar;C:\Users\krzwil\.m2\repository\com\fasterxml\jackson\core\jackson-annotations\2.12.3\jackson-annotations-2.12.3.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\profiles\2.16.58\profiles-2.16.58.jar;C:\Users\krzwil\.m2\repository\software\amazon\awssdk\http-client-spi\2.16.58\http-client-spi-2.16.58.jar;C:\Users\krzwil\.m2\repository\com\fasterxml\jackson\core\jackson-databind\2.12.1\jackson-databind-2.12.1.jar;C:\Users\krzwil\.m2\repository\software\amazon\eventstream\eventstream\1.0.1\eventstream-1.0.1.jar MqttCheck

but the same EXCEPTION_ACCESS_VIOLATION has been detected.
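When testing a GC flag as a workaround, it may help to confirm which collector the JVM actually selected. A minimal sketch using the standard `java.lang.management` API is shown below; the class name `ActiveGcCheck` is hypothetical. On HotSpot-based JDK 11 builds the reported names are typically "G1 Young Generation" and "G1 Old Generation" for G1, and "PS Scavenge" and "PS MarkSweep" for ParallelGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveGcCheck {
    // Collects the names of the garbage collectors registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with the same JVM flags as the failing application, e.g. -XX:+UseParallelGC.
        System.out.println(collectorNames());
    }
}
```

If the crash still occurs while the output shows the ParallelGC collectors, that would suggest the fault is not specific to G1.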

volphy avatar May 07 '21 20:05 volphy

This issue has also been encountered in corretto-17#28. Closing that issue as duplicate for now and tracking here.

benty-amzn avatar Apr 29 '22 22:04 benty-amzn