dd-trace-java icon indicating copy to clipboard operation
dd-trace-java copied to clipboard

Instrumentation test error visibility

Open deejgregor opened this issue 1 month ago • 2 comments

What Does This Do

This improves visibility of errors in instrumentation tests built on top of InstrumentationSpecification by failing the test with details of the errors, in particular these cases:

  1. Instrumentation blocked due to MuzzleCheck errors.
  2. Throwables caught from instrumentation code and reported to InstrumentationErrors by the Byte Buddy ExceptionHandlers class.
  3. Byte Buddy nstrumentation errors reported to InstrumentationSpecification's AgentBuilder.Listener.onError.

Also, ListWriterAssert.assertTraces will now fail fast when the errors from the first two cases above are reported.

Lastly, the code comment in ExceptionHandlers has been updated to reflect the current implementation and InstrumentationContext exceptions are more explicit about the likely causes of why the methods might not have been rewritten.

Motivation

These changes should reduce discovery and investigation time during development when there are instrumentation errors. Every single one of these issues are ones that I've run while developing instrumentations (most of them I've run into multiple times). ;-) These changes have helped me out enough (particularly for the first two use cases above) that I realized I really should see about contributing this back to help others--especially for those new to developing instrumentation, but also to assist those that have been doing it awhile when the inevitable error slips through.

Previously, the first error case above was not checked in tests using InstrumentationSpecification, and was only visible in log output, and the other two error cases just included error counts. In all of these cases, the developer would have to go digging in log output to find existence and/or details of errors.

When ListWriterAssert.assertTraces was used previously, if there were instrumentation errors that led to traces not being properly generated, tests will often fail due to failed assertions (missing spans, incorrect tags, etc.) and the developer might spend significant time chasing down the cause of the assertion failures to learn it was an instrumentation error. With these changes, if the cause was an underlying instrumentation error, ListWriterAssert.assertTraces will fail fast for the first two use cases above and clearly indicate there was an instrumentation error.

These changes will not only help our human developers by giving better context when there are failures, but also when using AI coding assistants.

Additional Notes

One possible improvement: The fail fast change could easily apply to all of the use cases above including the last use case if InstrumentationSpecification's AgentBuilder.Listener.onError recorded the error to InstrumentationErrors.

Best reviewed commit-by-commit:

Example output

InstrumentationErrors and InstrumentationContext exception

I commented-out the contextStore method on HikariPoolInstrumentation.

Condition not satisfied:

instrumentationErrorCount == 0
|                         |
1                         false

1 instrumentation errors were seen:
java.lang.RuntimeException: Calls to this method will be rewritten by Instrumentation Context Provider (e.g. FieldBackedProvider). If you get this exception, this method has not been rewritten. Ensure instrumentation class has a contextStore method and the call to InstrumentationContext.get happens directly in an instrumentation Advice class. See how_instrumentations_work.md for details.
	at datadog.trace.bootstrap.InstrumentationContext.get(InstrumentationContext.java:26)
	at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:151)
	at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:73)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:72)

Contributor Checklist

Jira ticket: [PROJ-IDENT]

deejgregor avatar Nov 16 '25 07:11 deejgregor

@AlexeyKuznetsov-DD thank you for the feedback. I'll work on the updates in the next day or two.

deejgregor avatar Nov 18 '25 08:11 deejgregor

@AlexeyKuznetsov-DD I made the recommended changes, rebased on latest master and force pushed.

I made one additional update: I realized there could be a case where the assertions in ListWriterAssert would not be checked: if the expected number of traces were received within 1 second. I refactored the asserts out into a new method, assertNoErrors, and used that both inside the for loop and once after to make sure it runs at least once.

deejgregor avatar Nov 19 '25 04:11 deejgregor