opentelemetry-go icon indicating copy to clipboard operation
opentelemetry-go copied to clipboard

otlp*http modules import grpc

Open MadVikingGod opened this issue 3 years ago • 48 comments
trafficstars

Description

This was surfaced in the discussion:https://github.com/open-telemetry/opentelemetry-go/discussions/2509

The otlptracehttp client and otlptrace depend on grpc. This is a transitive dependency from internal/otlpconfig. This is counter to why we split these packages into separate mods in the first place.

Environment

  • OS: [e.g. iOS]
  • Architecture: [e.g. x86, i386]
  • Go Version: [e.g. 1.15]
  • opentelemetry-go version: [e.g. v0.14.0, 3c7face]

Steps To Reproduce

  1. cd exporters/otlp/otlptrace/otlptracehttp
  2. go mod graph | grep grpc this should be empty
  3. go mod why google.golang.org/grpc this should report that module does not need grpc.

Expected behavior

The HTTP client should not depend on grpc.

Aditional information

otlpconfig has a config struct that is shared between HTTP client and grpc. This has logic around extracting the variables from the environment. Because the config has settings for grpc this means importing otlpconfig will import grpc (even if they are later pruned).

MadVikingGod avatar Feb 03 '22 15:02 MadVikingGod

I dug into this and I don't think we can avoid this.

This is because the opentelemetry-proto-go only defines the services and the request message with the grpc service. So for us to make HTTP separate from GRPC the service request message would need to be moved.

MadVikingGod avatar Feb 15 '22 16:02 MadVikingGod

I dug into this and I don't think we can avoid this.

Should this issue be closed?

MrAlias avatar Feb 15 '22 16:02 MrAlias

I would close this if either we don't think we will ever change how the grpc is generated, or if we decide that this isn't a goal.

If the former, we should put an issue into proto-go to address this. For the latter, I think we should make an issue to reduce the mods we have in the otlptrace directory. If you will have the grpc imported no matter what I don't see the need to break HTTP vs grpc.

MadVikingGod avatar Feb 15 '22 19:02 MadVikingGod

I will move what I have written on the Slack here. Thanks to Damien who has informed me about this.

grpc/protobuf dependency is bringing in lots of packages and increasing the size tremendously for a simple hello world application. Due to this, SDK becomes unreasonable for constraint environments.

I have a helloworld application instrumented with the API layer. When I link it with the SDK layer, the size is going off the charts. I compiled the application with go build -ldflags="-s -w" The SDK is using console exporter, otlphttp exporter, resources, semconv.

2.9M Feb 24 07:44 main.api
11M Feb 24 07:45 main.sdk

It is almost 8MB difference. I profiled both applications with goweight and the top list is for grpc and protobuf stuff.

5.8 MB [google.golang.org/protobuf/internal/impl](http://google.golang.org/protobuf/internal/impl)
4.0 MB [golang.org/x/net/http2](http://golang.org/x/net/http2)
3.2 MB [google.golang.org/grpc](http://google.golang.org/grpc)
2.4 MB [google.golang.org/grpc/internal/transport](http://google.golang.org/grpc/internal/transport)
2.3 MB [google.golang.org/protobuf/internal/filedesc](http://google.golang.org/protobuf/internal/filedesc)
1.9 MB [golang.org/x/sys/unix](http://golang.org/x/sys/unix)
1.5 MB [github.com/grpc-ecosystem/grpc-gateway/v2/runtime](http://github.com/grpc-ecosystem/grpc-gateway/v2/runtime)
1.4 MB text/template/parse
1.4 MB text/template
1.3 MB [go.opentelemetry.io/otel/sdk/trace](http://go.opentelemetry.io/otel/sdk/trace)
1.3 MB [google.golang.org/protobuf/reflect/protoreflect](http://google.golang.org/protobuf/reflect/protoreflect)
1.3 MB [google.golang.org/protobuf/types/descriptorpb](http://google.golang.org/protobuf/types/descriptorpb)
1.2 MB encoding/xml
1.2 MB [github.com/golang/protobuf/proto](http://github.com/golang/protobuf/proto)
1.1 MB html/template
1.1 MB [google.golang.org/protobuf/encoding/protojson](http://google.golang.org/protobuf/encoding/protojson)
994 kB [google.golang.org/protobuf/reflect/protodesc](http://google.golang.org/protobuf/reflect/protodesc)
945 kB [golang.org/x/text/unicode/norm](http://golang.org/x/text/unicode/norm)

I don’t understand why grpc/protobuf dependencies need to be linked in. Am I missing a configuration or is there a bug in bringing in dependencies. Let me know if you want to see the helloworld application.

Exporters are

"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"

utezduyar avatar Feb 24 '23 09:02 utezduyar

@utezduyar protobuf dependencies are needed as otlptracehttp is basically profobuf over HTTP. The issue is that gRPC dependencies should not be present.

EDIT: TBH I am shocked that go build does not get rid of the gRPC code 😨

pellared avatar Feb 24 '23 10:02 pellared

@pellared I am a bit confused. I WireSharked my application and it posted the trace data as JSON to /v1/traces of my local collector. Where is the protobuf then?

utezduyar avatar Feb 24 '23 10:02 utezduyar

@utezduyar The payload send via HTTP is serialized using protobuf .

pellared avatar Feb 24 '23 10:02 pellared

I figured it out. It is the resource information that is posted as json but you are right, I saw the posts as application/x-protobuf on WireShark. Thanks for clarifying.

utezduyar avatar Feb 24 '23 10:02 utezduyar

@MadVikingGod Are your findings from 1 year ago still valid? If so, is it a complex task to move the service request message?

utezduyar avatar Mar 01 '23 21:03 utezduyar

@MadVikingGod Are your findings from 1 year ago still valid? If so, is it a complex tax to move the service request message?

Moving the service request is almost certainly not something the proto team want to do, but ultimately it is question for them.

MrAlias avatar Mar 01 '23 22:03 MrAlias

I am not really understanding the problem or the discussion of moving or not due to my lack of decent understanding of the internals (yet at least). Can service request message not be duplicated instead of moving? If the service request message is requiring grpc, how is HTTP 1.1 trace export going to work? I checked C++ SDK and they don't have this dependency. What is really different with go?

utezduyar avatar Mar 01 '23 22:03 utezduyar

@MrAlias I looked at it a bit more and I think I understand it better. I still think my questions are valid regarding for example how C++ can handle it but not go. Anyways. I am trying to figure out if this is not something it can be solved in the https://github.com/open-telemetry/opentelemetry-proto-go by having 2 generated code for the trace service, one with GRPC one without, as separate modules. Opinion?

utezduyar avatar Mar 04 '23 16:03 utezduyar

@MadVikingGod (CC: @paivagustavo as the author of 2371bb0a0)

I investigated this more and I don't believe it is due to the service message request. I think this is something we can solve it in the opentelemetry-go repo.

otlptracehttp exporter is not using the grpc. https://github.com/open-telemetry/opentelemetry-go/blob/v1.14.0/exporters/otlp/otlptrace/otlptracehttp/client.go#L129 It uses the data type coltracepb.ExportTraceServiceRequest but that is it. The data type is defined in https://github.com/open-telemetry/opentelemetry-proto-go/blob/v0.19.0/otlp/collector/trace/v1/trace_service.pb.go which is also not using grpc.

I don't know if go mod why is listing all the dependencies but what is written here is a culprit.

go mod why google.golang.org/grpc
...
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
go.opentelemetry.io/otel/exporters/otlp/otlptrace/internal/otlpconfig
google.golang.org/grpc

The config data types internal/otlpconfig is bundling GRPC and HTTP data fields together. Both NewHTTPConfig and NewGRPCConfig are returning Config data type which is this:

Config struct {
		// Signal specific configurations
		Traces SignalConfig
		RetryConfig retry.Config
		// gRPC configurations
		ReconnectionPeriod time.Duration
		ServiceConfig      string
		DialOptions        []grpc.DialOption
		GRPCConn           *grpc.ClientConn
	}

utezduyar avatar Mar 06 '23 17:03 utezduyar

go mod why will show a shortest path in the import graph from the module, it does not show all.

There have been prior attempts to separate otlpconfig to address this. Ultimately, it will not address the issue because go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp imports go.opentelemetry.io/proto/otlp and go.opentelemetry.io/proto/otlp imports google.golang.org/grpc. There still exists an indirect dependency on go.opentelemetry.io/proto/otlp after otlpconfig was split.

MrAlias avatar Mar 06 '23 17:03 MrAlias

grpc dependency is in otlp/collector/trace/v1/trace_service.pb.gw.go and otlp/collector/trace/v1/trace_service_grpc.pb.go but not in otlp/collector/trace/v1/trace_service.pb.go which is used by otlptracehttp. If the go compiler cannot figure out that it should not bring in the grpc dependency on an unused code (probably because of interface related decision) then I would say maybe https://github.com/open-telemetry/opentelemetry-proto-go should be split to have generated files to different modules. At least I believe it is more possible than convincing the proto team.

utezduyar avatar Mar 06 '23 17:03 utezduyar

I wanted to see what more things are dependent on grpc and experimented. otlpconfig and proto modules as we discussed before are the only dependencies.

otlpconfig split: go.opentelemetry.io/otel/exporters/otlp/otlptrace/internal/otlpconfig/options.go: Removed all the mentions of grpc (for the sake of moving foward).

proto files: go.opentelemetry.io/proto/otlp/otlp/collector/trace/v1: I deleted both the service and the gateway files (trace_service.pb.gw.go and trace_service_grpc.pb.go).

This way, grpc dependency was dropped and my binary size was down 20%.

I would like to discuss the possibility of splitting service and gateway files in a separate module to get around go compiler downside. Some contributors in this issue are also committers in opentelemetry-proto-go. What is the best way to discuss this topic with opentelemetry-proto-go?

utezduyar avatar Mar 07 '23 19:03 utezduyar

I would like to discuss the possibility of splitting service and gateway files in a separate module to get around go compiler downside. Some contributors in this issue are also committers in opentelemetry-proto-go. What is the best way to discuss this topic with opentelemetry-proto-go?

Can you provide a proposal for how go.opentelemetry.io/proto/otlp will be changed and how the opentelemetry-proto-go will be updated to support the changes?

MrAlias avatar Mar 08 '23 01:03 MrAlias

I think the protobuf definition is already good enough and the code generator is doing it's best at splitting code based on different use cases. Therefore I do NOT believe we need a change from the go.opentelemetry.io/proto/otlp repo.

The collector/ under the opentelemetry-proto-go is having 3 logical go file separation for 3 telemetry types. One file for the data type, one file for the grpc client and one file for the grpc server.

My propose is to have grpc client and grpc server into it's own module in the opentelemetry-proto-go repo.

Again, if the go linker could have been able to figure out that it does not need to link the grpc libraries because the code is not used, we wouldn't need to break the generated go files.

utezduyar avatar Mar 08 '23 09:03 utezduyar

The collector/ under the opentelemetry-proto-go is having 3 logical go file separation for 3 telemetry types. One file for the data type, one file for the grpc client and one file for the grpc server.

My propose is to have grpc client and grpc server into it's own module in the opentelemetry-proto-go repo.

Your proposal is to duplicate the code in go.opentelemetry.io/proto/otlp into new modules? This is not a clear solution to me.

MrAlias avatar Mar 08 '23 16:03 MrAlias

Not duplicate but split the files in the collector/. https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/trace/v1 https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/metrics/v1 and https://github.com/open-telemetry/opentelemetry-proto-go/tree/main/otlp/collector/logs/v1.

Let's take the tracing as example: Module 1: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.go Module 2: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.gw.go and https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service_grpc.pb.go

utezduyar avatar Mar 08 '23 17:03 utezduyar

Let's take the tracing as example: Module 1: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.go Module 2: https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service.pb.gw.go and https://github.com/open-telemetry/opentelemetry-proto-go/blob/main/otlp/collector/trace/v1/trace_service_grpc.pb.go

How do you plan to split go.opentelemetry.io/proto/otlp/collector/trace and have two packages with exactly the same package name? Modules, and packages, are not split by files, they are split by directories.

If you move something out of go.opentelemetry.io/proto/otlp/collector it will be a breaking change to the exiting package. This will be backwards incompatible.

MrAlias avatar Mar 08 '23 20:03 MrAlias

I see your point. If we cannot break backwards compatibility, then not sure what options we have left but to duplicate go.opentelemetry.io/proto/otlp into new modules. It is not going to help to make changes in the proto files since all the generated code goes into one module, go.opentelemetry.io/proto/otlp, that the linker is going to pick up grpc code, even when they are unused.

utezduyar avatar Mar 08 '23 20:03 utezduyar

Does this have a known performance impact, or is it about disk space?

dmathieu avatar Mar 09 '23 10:03 dmathieu

For my use case it is disk space. I would expect minimum to 0 impact on the performance. Maybe extra decompression if UPX is used and wasted memory when UPX is unpacked to the memory. If UPX is not used, I believe the pages of the grpc code will never be brought in to the memory if they are not called and not impact anything.

Wasting some megabytes in persistent space is probably not a problem for cloud services but I believe any megabyte saved in constraint environments is important, especially for something that is not in use.

utezduyar avatar Mar 09 '23 12:03 utezduyar

I would close this if either we don't think we will ever change how the grpc is generated, or if we decide that this isn't a goal.

If the former, we should put an issue into proto-go to address this. For the latter, I think we should make an issue to reduce the mods we have in the otlptrace directory. If you will have the grpc imported no matter what I don't see the need to break HTTP vs grpc.

Even after 1 year, I believe @MadVikingGod 's comments are still relevant (sorry, it took me a while to come to the same point). I have created an issue in the opentelemetry-proto-go repo (above).

If we would like to solve this issue, I cannot come up with a better plan than below. A) Split proto-go files in modules in a backwards incompatible way and publish both major versions. B) Work on the internal/otlpconfig to split grpc dependencies.

If we don't want to split the dependencies, then maybe we should close this ticket and open a new one to reduce the mods in the otlptrace

The last option is to mark the ticket as we will not make any change. If we pick this option, I am considering to carry a downstream patch that removes the grpc dependencies and use the patch.

utezduyar avatar Mar 14 '23 08:03 utezduyar