opentelemetry-go-instrumentation Instrument `RecordError` from the default OTel global impl

Originally posted by @MrAlias in https://github.com/open-telemetry/opentelemetry-go-instrumentation/pull/523#discussion_r1408318459

Dec 07 '23 16:12 MrAlias

There are a few complex problems to address in this issue:

- The first argument passed is a Go `error` interface
- The method accepts a variadic argument of `trace.EventOptions`
- This needs to produce a span event

cc @RonFed any thoughts on how to solve/address some of these?

The first argument passed is a Go `error` interface

Similar to the default Go SDK implementation, we need to convert both the error type and the error content into semantic-convention attributes:

https://github.com/open-telemetry/opentelemetry-go/blob/e8c22e6e7180056cbb229a474c8ee98c7696ce07/sdk/trace/span.go#L457-L460

The error type information should be interpretable from the structure of the Go interface type itself if it has a name (supporting this likely requires a minimum Go version). This intuition comes from the fact that the `reflect` package can determine this information.

It is not obvious how the error string representation can be determined from within the eBPF scope. To satisfy the interface, the implementation needs to have an `Error` method, but I'm not sure it is possible to call that from the eBPF space. Not sure how to solve this(?).

The method accepts a variadic argument of `trace.EventOptions`

This builds on the issue identified above: how do we apply the `trace.EventOptions` to a `trace.EventConfig` in eBPF space to understand the configuration required for the event? For example, how do we do this in eBPF space:

https://github.com/open-telemetry/opentelemetry-go/blob/e8c22e6e7180056cbb229a474c8ee98c7696ce07/sdk/trace/span.go#L462

This needs to produce a span event

Resolving this issue will mean solving most, if not all, of the event pipeline needed to resolve #541.

Jul 10 '24 23:07 MrAlias

@MrAlias Those are great points.

> It is not obvious how the error string representation can be determined from within the eBPF scope. To satisfy the interface, the implementation needs to have an `Error` method, but I'm not sure it is possible to call that from the eBPF space. Not sure how to solve this(?).

There are two approaches to getting the error string:

1. The first and simpler one is reading the string from the error's concrete type. When calling `errors.New` or `fmt.Errorf`, the concrete struct contains the error message as its first field. This probably covers a significant percentage of cases, but it is not a complete solution since it won't support custom errors.
2. Identifying the interface is possible by reading the itab pointer of the interface: https://github.com/golang/go/blob/dfaaa91f0537f806a02ff2dd71b79844cd16cc4e/src/runtime/runtime2.go#L205 Identifying the concrete type behind an interface is very useful in many contexts in our instrumentation, but in the error context it still won't completely help us, since the error struct can be an arbitrary one.

> This builds on the issue identified above: how do we apply the `trace.EventOptions` to a `trace.EventConfig` in eBPF space to understand the configuration required for the event? For example, how do we do this in eBPF space:

This also comes down to identifying the concrete type of each passed option. Assuming we know the concrete type for each option, we can use offsetgen to read the options and apply them to the span created in eBPF.
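To make the idea concrete, here is a simplified model of the functional-options pattern. All types below (`eventConfig`, `eventOption`, `attrOption`, `stackTraceOption`) are hypothetical stand-ins, not the real `trace` package types. `newEventConfig` shows what Go code does through method calls, and `configFromConcreteTypes` shows the same result obtained by branching on each option's concrete type and reading its fields directly, which is closer to what a probe could do once interface identification and field offsets are available.

```go
package main

import "fmt"

// eventConfig is a hypothetical, reduced stand-in for an event configuration.
type eventConfig struct {
	attrs      map[string]string
	stackTrace bool
}

// eventOption is a hypothetical stand-in for an option interface.
type eventOption interface {
	apply(eventConfig) eventConfig
}

type attrOption struct{ key, value string }

func (o attrOption) apply(c eventConfig) eventConfig {
	// Copy the map so each apply call returns an independent config.
	attrs := make(map[string]string, len(c.attrs)+1)
	for k, v := range c.attrs {
		attrs[k] = v
	}
	attrs[o.key] = o.value
	c.attrs = attrs
	return c
}

type stackTraceOption bool

func (o stackTraceOption) apply(c eventConfig) eventConfig {
	c.stackTrace = bool(o)
	return c
}

// newEventConfig folds the options into a config via dynamic dispatch.
func newEventConfig(opts ...eventOption) eventConfig {
	var c eventConfig
	for _, o := range opts {
		c = o.apply(c)
	}
	return c
}

// configFromConcreteTypes builds the same config without calling any
// option method: it identifies each concrete type and reads its fields.
func configFromConcreteTypes(opts ...eventOption) eventConfig {
	c := eventConfig{attrs: map[string]string{}}
	for _, o := range opts {
		switch v := o.(type) {
		case attrOption:
			c.attrs[v.key] = v.value
		case stackTraceOption:
			c.stackTrace = bool(v)
		default:
			// Unknown option types cannot be interpreted and are skipped.
		}
	}
	return c
}

func main() {
	opts := []eventOption{attrOption{key: "user.id", value: "42"}, stackTraceOption(true)}
	a := newEventConfig(opts...)
	b := configFromConcreteTypes(opts...)
	fmt.Println(a.stackTrace == b.stackTrace, a.attrs["user.id"] == b.attrs["user.id"])
}
```

As with errors, the `default` branch is the limitation: an option type the instrumentation does not recognize is silently dropped.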

I think adding the capability to identify interfaces is achievable and will unlock cool functionality.

Jul 11 '24 08:07 RonFed

> When calling `errors.New` or `fmt.Errorf`, the concrete struct contains the error message as its first field. This probably covers a significant percentage of cases, but it is not a complete solution since it won't support custom errors.

I worry that adding a partial solution like this may do more harm than good. Partial support may give users a false sense of security that this feature is supported, when they will end up with missing data. Given that errors are a critical part of monitoring service health, this could be a pretty severe issue.

I'm not aware of a better way to handle it, though, which makes me wonder if we should just document that this functionality is not supported instead.

@open-telemetry/go-instrumentation-approvers thoughts?

Jul 11 '24 20:07 MrAlias
