opentelemetry-go-instrumentation icon indicating copy to clipboard operation
opentelemetry-go-instrumentation copied to clipboard

The SDK should not import `pdata`

Open MrAlias opened this issue 1 year ago • 9 comments

Currently, the SDK module imports pdata, which implicilty imports large modules (i.e. github.com/gogo/protobuf, google.golang.org/grpc, google.golang.org/protobuf, gopkg.in/yaml.v3):

https://github.com/open-telemetry/opentelemetry-go-instrumentation/blob/d0db32bb2bd891cda7a83221dc3f9645158256e2/sdk/go.mod#L7-L30

This SDK is going to be imported into the global OTel Go API. Meaning it needs to be as light as possible. Having these imports is not going to be viable.

A new transport model needs to be explored. Possibly, this could relate to a replacement of the Event type as a project data-model.

Part of #954

MrAlias avatar Sep 16 '24 19:09 MrAlias

It is not possible to use a protobuf definition without included a protobuf library as a dependency. Therefore, it does not seem viable to use any protobuf representation is the long-term

MrAlias avatar Sep 16 '24 19:09 MrAlias

There are two possibilities I am currently looking into:

  • Using encoding/gob
  • Using encoding/json with an OTLP encoding

MrAlias avatar Sep 16 '24 19:09 MrAlias

Comparison of encoding/gob vs encoding/json for serialization of SDK data-model:

$ go test -run="^$" -bench=.
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/auto/internal/pkg/data
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
BenchmarkGOBMarshalUnmarshal-8    	    4609	    266378 ns/op	   35710 B/op	     778 allocs/op
BenchmarkJSONMarshalUnmarshal-8   	   41809	     26202 ns/op	    1793 B/op	      20 allocs/op
PASS

I'm not entirely sure that I am encoding things correctly with gob, but it appears to be significantly less performant.

This also, doesn't represent the issue I had in trying to encode attributes (i.e. AnyValue). I think the data-model would need to be updated so the Value field has a pointer to the interface:

type AnyValue struct {
	Value *isAnyValue_Value
}

Or maybe there is also some equivalent to UnmarshalJSON.

I don't plan to continue evaluation of gob, though, given the performance difference.

MrAlias avatar Sep 18 '24 14:09 MrAlias

It was also pointed out by @pellared that the encoding/gob use would have compatibility issues. If the sender/receiver used different versions of the data-model they would not be compatible.

MrAlias avatar Sep 19 '24 16:09 MrAlias

Is this blocking PRs like #1110, #1111 and #1112?

RonFed avatar Sep 21 '24 08:09 RonFed

No, it shouldn't be. The alternate approach being worked on supports those features just as well as pdata does. Those PRs are independent of this.

This does block the sdk package from being released though given the global API cannot depend on it until this dependency is removed.

MrAlias avatar Sep 21 '24 14:09 MrAlias

Those PRs are using pdata-specific functions, right? Wouldn't that mean we will need to re-do them?

RonFed avatar Sep 21 '24 14:09 RonFed

Yes, they are using pdata functions, and they will need to be updated when alternate data-model is added. However, the overall functionality and testing should remain similar, if not the same, with a replacement of the data-model.

I don't think we need to block those PRs as they can be updated when the alternate is done. Otherwise, progress on the SDK integration testing and examples will be blocked without them.

MrAlias avatar Sep 21 '24 14:09 MrAlias

FWIW, the whole SDK (i.e. probe, start/end func, struct decl) will also need to be updated.

MrAlias avatar Sep 21 '24 14:09 MrAlias

Resolved by #1195

MrAlias avatar Oct 22 '24 14:10 MrAlias