opentelemetry-go icon indicating copy to clipboard operation
opentelemetry-go copied to clipboard

otlptracehttp: Client that accepts raw PB data

Open klauspost opened this issue 8 months ago • 8 comments
trafficstars

Problem Statement

I am working on an OTLP proxy that forwards traces. It will pull data from several servers and forward it to an OTLP endpoint.

Since I am already sending PB serialized data over the wire, I would like a way to utilize all the great work you have put into making the upload work smoothly. However, I would need to Unmarshal the PB data before sending it to the client for this to work. This is of course quite inefficient compared to just sending the data.

Proposed Solution

Add wrapper to client, that will upload the raw data.

The code above could be split in to two parts that does the marshalling and sending like this...

func (d *client) UploadTraces(ctx context.Context, protoSpans []*tracepb.ResourceSpans) error {
	pbRequest := &coltracepb.ExportTraceServiceRequest{
		ResourceSpans: protoSpans,
	}
	rawRequest, err := proto.Marshal(pbRequest)
	if err != nil {
		return err
	}

	return d.UploadTracesRaw(ctx, rawRequest)
}

func (d *client) UploadTracesRaw(ctx context.Context, rawRequest []byte) error {
	ctx, cancel := d.contextWithStop(ctx)
...

This would allow me to just assert an interface against the exporter and use that instead of having to un-marshal the data before sending.

I have not yet looked at the GRPC functions, but I assume it is the same case where. But if only HTTP could do this, I would be happy.

A secondary bonus is also that it would allow me to re-use the marshal buffer to avoid a big amount of allocs.

This should make no difference to the existing API/functionality.

Alternatives

It could be made part of the Exporter interface, but that seems a step to long to take.

For me the alternative would be to fork/copy over the http client code, which seems rather excessive.

Prior Art

Not sure, but it seems like a pretty nice thing to expose.

klauspost avatar Mar 19 '25 09:03 klauspost

My first feeling here is that this is not what the OTLP exporter was built for, and extending it to support this use case isn't a good idea.

Have you looked at the collector, which does exactly that (with parsing and reencoding of the data in the middle so it can be modified)?

dmathieu avatar Mar 19 '25 10:03 dmathieu

@dmathieu I am really mostly interested in straight up forwarding, and having an HTTP (and possibly GRPC) client that knows how to deal with responses, retries, etc that is expected from OTLP remotes. As I wrote I can copy/fork all of the http client, but it seems like an unintrusive change to allow for this.

I can easily decode the PB data, but it seems very wasteful when I really only want to send the data unmodified.

klauspost avatar Mar 19 '25 10:03 klauspost

For additional context:

I am adding functionality to minio. Other than uploading directly to OTLP servers we also want to support functionality where the servers are airgapped or the user otherwise aren't able to set up servers reachable from the minio servers.

So we allow the servers to aggregate traces as a stream that can then be either saved to disk for later replay or directly forwarded to an OTLP server when received. It seemed very natural to send the traces with PB encoding, that way I "only" need to send the traces.

klauspost avatar Mar 19 '25 12:03 klauspost

@klauspost, this needs to be a proposal in https://github.com/open-telemetry/opentelemetry-specification

Check if there are other similar issues, because I think the issue you are trying to solve is quite common. Here is an example: https://github.com/open-telemetry/opentelemetry-specification/issues/3645

pellared avatar Mar 19 '25 15:03 pellared

The request is for the client which is not a concept of the specification. I don't think this is a specification issue.

@klauspost given the client is unexported, your solution would require a otlptrace.Client to be cast to a locally defined interface with the expected form proposed here?

This seems reasonable to me at a high level.

Are you only interested in traces? There is a different design in other signals that would not support this.

MrAlias avatar Mar 19 '25 15:03 MrAlias

@MrAlias Even if it is on an unexported type, I would be able to check it against an interface:

type RawUploader {
	UploadTracesRaw(context.Context, []byte) error
}

I personally mind this being a privately held interface, but it could of course be an official optional interface. I don't have any opinion on that.

I am not currently looking at metrics or logs - but eventually we would likely want to add it as well. I don't know if this applies there, but I guess it does.

klauspost avatar Mar 19 '25 15:03 klauspost

If we do this, the interface between each signal should be similar. So we should start by cleaning up those differences.

dmathieu avatar Mar 19 '25 15:03 dmathieu

@pellared @dmathieu I honestly don't know enough about your nice packages to add a proposal that would be reasonably consistent with the rest of your API.

I mainly looked for a "hook" into the http client code. I am comfortable enough with this not being official API, as I realize sending []byte slices requires a certain amount of trust, which you may not want to use as the official API.

klauspost avatar Mar 19 '25 15:03 klauspost