Application crashes after update to 0.102.0 Datadog APM lib

Open dpavlov-smartling opened this issue 3 years ago • 3 comments

Hello,

Today, while updating the Datadog APM lib from 0.68 to 0.102, we noticed a Java application crash. Example of the crash output:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x00007f95bf021582, pid=28928, tid=140270978377472
#
# JRE version: Java(TM) SE Runtime Environment (7.0_80-b15) (build 1.7.0_80-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libzip.so+0x4582]  newEntry+0x62
#
# Core dump written. Default location: //core or core.28928
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

The application starts with the new version, but it goes down after roughly 15 minutes. It looks like this is related to the bug https://bugs.openjdk.org/browse/JDK-8145260 , but we didn't notice this behavior on version 0.68.

It seems this bug can appear if Java notices a jar changing at runtime. We found that the crash happens only when the application is under load; there are no such problems on the application without a high traffic level. Any idea what code in the Datadog APM lib could cause this, and in which version it was introduced?
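To test the "jar changing at runtime" hypothesis, one option is to watch the suspect jar from a separate process while the application is under load. The following is a minimal sketch, not part of the Datadog lib: the class name `JarChangeWatcher`, the `Snapshot` helper, and the 5-second polling interval are all assumptions made for illustration, and it only uses `java.nio.file` calls available since Java 7.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic: polls a jar and reports when its size or mtime changes.
public class JarChangeWatcher {

    /** Size and last-modified time of a file at one point in time. */
    public static final class Snapshot {
        public final long size;
        public final long modifiedMillis;

        public Snapshot(long size, long modifiedMillis) {
            this.size = size;
            this.modifiedMillis = modifiedMillis;
        }

        /** Reads the current size and mtime of the given file. */
        public static Snapshot of(Path file) throws IOException {
            return new Snapshot(Files.size(file),
                                Files.getLastModifiedTime(file).toMillis());
        }

        /** True if either the size or the mtime differs from the other snapshot. */
        public boolean differsFrom(Snapshot other) {
            return size != other.size || modifiedMillis != other.modifiedMillis;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java JarChangeWatcher <path-to-jar>");
            return;
        }
        Path jar = Paths.get(args[0]);
        Snapshot baseline = Snapshot.of(jar);
        while (true) {
            Thread.sleep(5000); // polling interval is an arbitrary choice
            Snapshot current = Snapshot.of(jar);
            if (current.differsFrom(baseline)) {
                System.err.println("Jar changed on disk: " + jar
                        + " size " + baseline.size + " -> " + current.size);
                baseline = current;
            }
        }
    }
}
```

If the watcher reports a change shortly before a crash, that would support the hypothesis; if it stays silent, the jar on disk is probably not what is being modified.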

dpavlov-smartling avatar Jun 16 '22 20:06 dpavlov-smartling

Let me start by stating that JDK 7 is very much obsolete and, for all practical purposes, dead. Apparently, this is a bug in the JDK, and unless you are paying Oracle for support, the bug will not be fixed, I am afraid.

Please, consider upgrading to Java 8. It is mostly painless and, among other things, you will remove a bunch of potential attack vectors due to missing security updates for Java 7 (again, unless you are a paying Oracle customer - in which case it would be better to bring up this crash with them).

Cheers.

jbachorik avatar Jun 16 '22 21:06 jbachorik

Hello @jbachorik,

Thank you for the update. Yeah, I know that Java 7 is old and outdated. Still, are there any ideas about why version 0.68 and, as far as I know, 0.72 don't trigger this bug? Possibly it is related to the approach that was used to fix this issue: https://github.com/DataDog/dd-trace-java/issues/1677 ? Thanks in advance.

dpavlov-smartling avatar Jun 16 '22 23:06 dpavlov-smartling

Hi @dpavlov-smartling - I doubt it's related to #1677 because we removed the need to create those temporary jar files in a subsequent release.

It could be that later versions are triggering a JDK bug which the earlier versions didn't, but there are so many changes between those releases that it would be difficult to narrow down without more information about when it starts failing for your application. You could use a bisect approach to narrow down the first release which fails (i.e. pick a release halfway between the pass/fail versions, then iterate depending on the result).
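The bisect approach can be sketched as a binary search over an ordered release list. This is only an illustration: the `ReleaseBisect` class and `CrashCheck` interface are hypothetical names, the `<release-a>` and `<release-b>` entries are placeholders for real intermediate versions, and in practice `crashes` would mean "deploy that version and run it under load long enough to see the crash". It assumes the oldest entry is known good, the newest is known bad, and every release after the first bad one also fails.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating a bisect over release versions.
public class ReleaseBisect {

    /** Hypothetical check: true if the application crashes with this version. */
    public interface CrashCheck {
        boolean crashes(String version);
    }

    /**
     * Returns the index of the first failing release.
     * Assumes releases are ordered oldest to newest, the first entry passes,
     * the last entry fails, and failures persist once introduced.
     */
    public static int firstBadIndex(List<String> releases, CrashCheck check) {
        int good = 0;                   // known passing release
        int bad = releases.size() - 1;  // known failing release
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2; // halfway between pass and fail
            if (check.crashes(releases.get(mid))) {
                bad = mid;   // failure was introduced at or before mid
            } else {
                good = mid;  // failure was introduced after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // 0.68, 0.72 and 0.102 come from this thread; the others are placeholders.
        final List<String> releases =
                Arrays.asList("0.68", "0.72", "<release-a>", "<release-b>", "0.102");

        // Simulated check for the demo: pretend the crash first appears at index 3.
        CrashCheck simulated = new CrashCheck() {
            @Override
            public boolean crashes(String version) {
                return releases.indexOf(version) >= 3;
            }
        };

        int first = firstBadIndex(releases, simulated);
        System.out.println("First failing release: " + releases.get(first));
    }
}
```

Each iteration halves the remaining range, so even a few dozen releases need only a handful of test deployments.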

mcculls avatar Jun 17 '22 04:06 mcculls

@dpavlov-smartling if this is still an issue, please open up a support ticket at https://www.datadoghq.com/support/

bm1549 avatar Nov 17 '23 22:11 bm1549