opentelemetry-java icon indicating copy to clipboard operation
opentelemetry-java copied to clipboard

Moving the rendering of the exception back into SdkSpan makes exception attribute rendering unavoidable and is expensive

Open jack-berg opened this issue 11 months ago • 5 comments

I wish I had noticed this PR earlier. Unfortunately moving the rendering of the exception back into SdkSpan makes it unavoidable and doing so is incredibly expensive. In our Spring WebFlux services not only are the stack traces absolutely massive, but the rendering of the traceback creates resource contention due to synchronization. It causes a noticeable impact on the performance of our services. I would like to revisit this decision and at least provide a means to make it optional.

Originally posted by @HaloFour in https://github.com/open-telemetry/opentelemetry-java/pull/6795#discussion_r1964158815

jack-berg avatar Feb 20 '25 21:02 jack-berg

Thanks for opening the issue for me.

For a little bit of context, we are using OTel with Spring WebFlux where the stacktraces can be pretty enormous. Plus Reactor automatically adds a suppressed throwable used for rendering tracebacks, which is shared and the getMessage method is synchronized as the state changes, which does cause contention. Because of this we try to avoid rendering the full stack trace as it's quite expensive.

We are using the deferred attribute calculation of ExceptionEventData in order to avoid rendering the stack trace. We have our own custom implementation of SpanExporter which wraps other exporters and intercepts the SpanData. If that SpanData has any ExceptionEventData events it custom renders the captured Throwable using a log4j PatternFormatter to limit the size of rendered exception, which we then export as the exception.stack_trace attribute.

Ideally I'd like to have some control over how that attribute ends up being rendered and to avoid the default rendering. I understand that the deferred calculation didn't play well with SpanLimits and I'm certainly open to exploring other ways to accomplish this. :)

HaloFour avatar Feb 20 '25 23:02 HaloFour

Ideally I'd like to have some control over how that attribute ends up being rendered and to avoid the default rendering.

What do folks (including @open-telemetry/java-approvers) think about adding a SdkTracerProviderBuilder option for customizing the rendering of exceptions?

Could have an interface like:

public interface ExceptionRenderer {

  /** Render the {@code exception.type} attribute value. */
  default String getExceptionType(Throwable throwable) {
    return throwable.getClass().getCanonicalName();
  }

  /** Render the {@code exception.message} attribute value. */
  default String getExceptionMessage(Throwable throwable) {
    return throwable.getMessage();
  }

  /** Render the {@code exception.stacktrace} attribute value. */
  default String getExceptionStacktrace(Throwable throwable) {
    StringWriter stringWriter = new StringWriter();
    try (PrintWriter printWriter = new PrintWriter(stringWriter)) {
      throwable.printStackTrace(printWriter);
    }
    return stringWriter.toString();
  }
}

And a method to customize it where you can selectively override:

SdkTracerProvider.builder()
  .setExceptionRenderer(new ExceptionRenderer() {
     @Override
     String getExceptionStacktrace(Throwable throwable) {
        // ...
      }
  }
   });

The renderer would be used in SdkSpan in place of the hardcoded rendering logic.

The resulting attributes would still be subject to attribute limits.

jack-berg avatar Feb 21 '25 16:02 jack-berg

On the other hand, I can't find a way to actually produce a truncated stack trace besides vendoring in the logic from Throwable#printStackTrace(), so not how useful it would be to give the ability override ExceptionRenderer, when all you can do in practice is skip rendering an attribute like exception.stacktrace altogether.

jack-berg avatar Feb 21 '25 16:02 jack-berg

I was starting down a similar rabbit hole. My facade is a little different, but the path would be able the same:

@FunctionalInterface
public interface SpanExceptionRenderer {
  AttributeKey<String> EXCEPTION_TYPE =
      AttributeKey.stringKey("exception.type");
  AttributeKey<String> EXCEPTION_MESSAGE =
      AttributeKey.stringKey("exception.message");
  AttributeKey<String> EXCEPTION_STACKTRACE =
      AttributeKey.stringKey("exception.stacktrace");

  Attributes renderAttributes(Throwable exception, Attributes additionalAttributes, SpanLimits spanLimits);

  static SpanExceptionRenderer getDefault() {
    return DefaultSpanExceptionRenderer.INSTANCE;
  }
}

HaloFour avatar Feb 21 '25 16:02 HaloFour

On the other hand, I can't find a way to actually produce a truncated stack trace besides vendoring in the logic from Throwable#printStackTrace(), so not how useful it would be to give the ability override ExceptionRenderer, when all you can do in practice is skip rendering an attribute like exception.stacktrace altogether.

This is what we're doing today. We avoid the calculated attributes in ExceptionEventData entirely and then render the Exception ourselves, using a custom pattern API based on log4j2's ThrowableAttributeConverter API.

I understand that our use case is an outlier and I imagine that this proposed solution is quite a bit overkill. Honestly, I'd be happy if only we had a switch to disable the SDK from emitting the exception.stacktrace attribute entirely, that way I can do so myself from the Throwable instance.

HaloFour avatar Feb 21 '25 17:02 HaloFour