[BUG]: Error stacktrace and Exception type not being set properly
Tracer Version(s)
1.70.0
Go Version(s)
go1.22.0
Bug Report
func recordErrorFromCtx(ctx context.Context, err error, markFailure bool, attrs ...attribute.KeyValue) {
	if err == nil {
		return
	}
	span := trace.SpanFromContext(ctx)
	if markFailure {
		span.SetStatus(codes.Error, err.Error())
	}
	span.RecordError(err, trace.WithStackTrace(true), trace.WithAttributes(attrs...))
}
But it does not add any attributes to the span, and I am not getting a proper stack trace either. Judging by the stack trace I do get, the actual stack trace seems to be overwritten by that of OTel's internal middlewares and interceptors: for every error I get the stack trace of the last executed function.
Reproduction Code
No response
Error Logs
No response
Go Env Output
No response
Can someone please guide me on this so I can debug the issue further and try to fix it? It is difficult to debug errors and panics without a proper, relevant stack trace.
Thanks!!
Hi @agentCalculator ,
Thanks for reaching out. It seems we may have overlooked the RecordError API when adding OpenTelemetry drop-in support. I'm currently looking into alternatives and will update you once I have more information.
Hi @mtoffl01
The stacktrace is recorded within the SDK, and we get the stacktrace from the runtime. So it seems like we are getting this meaningless stacktrace because the sdk and the application are not in the same stack
Please let me know what alternatives you are looking into. I am also trying to debug this and find a solution.
Hey @agentCalculator ,
> The stacktrace is recorded within the SDK, and we get the stacktrace from the runtime. So it seems like we are getting this meaningless stacktrace because the sdk and the application are not in the same stack
Just to make sure I'm following, we've identified two problems here, correct?
- The stack trace on a [grpc server?] span is not useful nor is it the stacktrace you'd expect
- Attempts to overwrite this meaningless stacktrace with a custom one via the RecordError API have no effect
Please confirm. Additionally, clarify the kind of error you expect to see -- I assume from your grpc server.
In the meantime, you can try the following to customize the error on the span, instead of the RecordError API:
Pass the span to the EndOptions function together with the tracer.WithError FinishOption carrying your error. For example:
ddotel.EndOptions(child, ddtracer.WithError(err)), where child is the span.
Yes, both of the points you mentioned are correct.
Error messages are recorded correctly, but the type assigned is errors.errorString for every error.
In the meantime I will try the Finish and EndOptions functions, but I expect the stack trace issue to persist, since the runtime stacks of the SDK and my service are different and the stack frames are recorded at runtime.
@agentCalculator It seems the ddotel package creates a new error via tracer.WithError() inside span.End(). When the OTel status code is set to Error, that error is built with the Go standard errors package from the status description passed to span.SetStatus. In your case the status description is err.Error(), which explains why error.message is correct and why the type is always errors.errorString.
https://github.com/DataDog/dd-trace-go/blob/fe9272dcb82745b2fd352f7bb54dd0db4d96d1c1/ddtrace/opentelemetry/span.go#L85-L88
https://github.com/DataDog/dd-trace-go/blob/fe9272dcb82745b2fd352f7bb54dd0db4d96d1c1/ddtrace/tracer/span.go#L424-L432
I have explained the issue in detail here and have proposed a solution.
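The type loss described above can be reproduced with the standard library alone. This is a minimal sketch, not the ddotel code itself: `NotFoundError`, `typeName`, and `rebuildFromMessage` are hypothetical names, and the sketch assumes only that the error is recreated from its message string with errors.New.

```go
package main

import (
	"errors"
	"fmt"
)

// NotFoundError is a hypothetical application error type, used only for
// illustration.
type NotFoundError struct{ Key string }

func (e *NotFoundError) Error() string { return "not found: " + e.Key }

// typeName reports the dynamic type of err.
func typeName(err error) string { return fmt.Sprintf("%T", err) }

// rebuildFromMessage mimics recreating an error from a status description:
// only the message survives, the original type does not.
func rebuildFromMessage(err error) error { return errors.New(err.Error()) }

func main() {
	orig := &NotFoundError{Key: "user-42"}
	rebuilt := rebuildFromMessage(orig)

	fmt.Println(typeName(orig))                  // *main.NotFoundError
	fmt.Println(typeName(rebuilt))               // *errors.errorString
	fmt.Println(rebuilt.Error() == orig.Error()) // true
}
```

The message round-trips unchanged, which matches the observation that error.message is correct while the type is not.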
Given the context propagation issue between libraries/packages and the application, we need to record the runtime stack frames in the application itself to get the relevant stack trace of the actual error site, rather than the stack trace of a middleware or library.
For now, until the issue is actually fixed, we are setting a custom error type, fingerprint, and stack from the application itself by re-wrapping the error.
Please let us know if anyone has a better approach to this.