
PoC: Custom SDK

Open MrAlias opened this issue 1 year ago • 1 comments

PoC for #954

This is a proof of concept for an SDK implemented entirely by the auto-instrumentation. It supports all span functionality:

  • Sampling (TODO: the sample method needs to be instrumented)
  • Generation of random, valid IDs
  • All Start options
    • WithLinks
    • WithNewRoot
    • WithSpanKind (defaults to probe SpanKind if not set)
    • WithTimestamp
    • WithAttributes
  • The AddEvent method, including all options
    • WithStacktrace
    • WithAttributes
    • WithTimestamp
  • The AddLink method
  • The IsRecording method (TODO: based on sampling support)
  • The SpanContext method
  • The SetStatus method
  • The SetAttribute method
  • The TracerProvider method
  • All End options
    • WithTimestamp

Design

auto.GetTracerProvider

Only one function is exported publicly: GetTracerProvider in go.opentelemetry.io/auto.

This function returns a singleton instance of an opentelemetry-go trace.TracerProvider that is held in the internal/sdk package.
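
The singleton behavior could be sketched roughly as follows. This is only an assumption about the shape of the code: the tracerProvider type here is a hypothetical stand-in, and the real function returns a trace.TracerProvider backed by internal/sdk.

```go
package main

import (
	"fmt"
	"sync"
)

// tracerProvider is a hypothetical stand-in for the provider type held in
// the internal/sdk package.
type tracerProvider struct {
	name string
}

var (
	once     sync.Once
	instance *tracerProvider
)

// GetTracerProvider returns the same provider instance on every call.
// sync.Once makes the lazy initialization safe for concurrent callers.
func GetTracerProvider() *tracerProvider {
	once.Do(func() {
		instance = &tracerProvider{name: "auto"}
	})
	return instance
}

func main() {
	a, b := GetTracerProvider(), GetTracerProvider()
	fmt.Println(a == b)
}
```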

internal/sdk

The go.opentelemetry.io/auto/internal/sdk package is added. This is a "full feature" OTel trace SDK from the perspective of the Tracer and Span.

All data about every Span created is built in userspace. Most of it is stored in the collector's ptrace.Traces type.

When the Span is ended, the ptrace.Traces is marshaled into a proto binary encoding and passed as a buffer to the ended method of the Span. This method does nothing on its own; it exists so that a uprobe can be inserted at its call site.

auto/sdk probe

A simple probe is added to instrument the go.opentelemetry.io/auto/internal/sdk package. This probe does not rely on any offsets from the sdk types and simply routes the encoded span data from ended to the events eBPF map.

From there the ptrace.Traces data is unmarshaled and parsed into a SpanEvent that the Controller processes in the normal fashion.
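
The userspace side of that path might look something like the sketch below. The record layout (a little-endian uint32 size followed by a fixed-capacity payload), the maxData constant, and the trimmed-down SpanEvent are all assumptions for illustration, and encoding/json again stands in for unmarshaling ptrace.Traces.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
)

// maxData is a hypothetical fixed payload capacity of one record.
const maxData = 64

// SpanEvent is a hypothetical, trimmed-down version of the event that the
// Controller consumes.
type SpanEvent struct {
	Name string `json:"name"`
}

// parseRecord decodes one raw record with the assumed layout
// [uint32 little-endian size][up to maxData bytes of payload].
func parseRecord(raw []byte) (*SpanEvent, error) {
	if len(raw) < 4 {
		return nil, errors.New("record too short")
	}
	size := binary.LittleEndian.Uint32(raw[:4])
	if size > maxData || int(size) > len(raw)-4 {
		return nil, errors.New("invalid payload size")
	}
	var ev SpanEvent
	// JSON stands in for the proto decoding of ptrace.Traces.
	if err := json.Unmarshal(raw[4:4+size], &ev); err != nil {
		return nil, err
	}
	return &ev, nil
}

// newRecord builds a raw record for the demo. The payload is assumed to be
// no longer than maxData.
func newRecord(payload []byte) []byte {
	raw := make([]byte, 4+maxData)
	binary.LittleEndian.PutUint32(raw, uint32(len(payload)))
	copy(raw[4:], payload)
	return raw
}

func main() {
	ev, err := parseRecord(newRecord([]byte(`{"name":"demo"}`)))
	fmt.Println(ev, err)
}
```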

Demo

Run Jaeger

$ docker run --rm --name jaeger -e COLLECTOR_OTLP_ENABLED=true -p 16686:16686 -p 4318:4318 jaegertracing/all-in-one:latest
2024/08/28 20:38:34 maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined
# ...

Run the example

$ cd examples/auto-sdk && go build -o $GOPATH/bin/example && $GOPATH/bin/example
outter-0...done
outter-1...

Run the auto-instrumentation

$ cd cli && go build
$ OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
  OTEL_GO_AUTO_TARGET_EXE=$GOPATH/bin/example \
  OTEL_SERVICE_NAME=example \
  sudo -E ./cli
{"level":"info","ts":1724885322.1967607,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:86","msg":"building OpenTelemetry Go instrumentation ...","globalImpl":false}
# ...
{"level":"info","ts":1724885324.8517134,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:115","msg":"instrumentation loaded successfully"}

You can also run with debug logging:

$ OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
  OTEL_GO_AUTO_TARGET_EXE=$GOPATH/bin/example \
  OTEL_SERVICE_NAME=example \
  sudo -E ./cli -log-level=debug

Let this run for a bit, then stop the example. Stopping the example while a span is active produces an error. For example:

$ go build -o $GOPATH/bin/example && $GOPATH/bin/example
outter-0...done
outter-1...^Cdone

(Note that the ^C appears before the second done.)

Review the span

Overview

[Screenshot: 20240903_091218]

Spans with recorded errors (via events)

[Screenshot: 20240903_091307]

Span links

[Screenshot: 20240828_155637]

Open Issues/Questions

  • [ ] Serialized spans are only supported up to a maximum size of 412 bytes
    • Ways to increase eBPF storage past the stack limit (512 bytes) need to be investigated
    • When we know the span is going to be too big, we need to drop attributes, links, and events in userspace
  • [x] Sampling needs to be implemented.
  • [ ] Fix call to bpf_probe_read: https://github.com/open-telemetry/opentelemetry-go-instrumentation/actions/runs/10605550069/job/29394558469?pr=1045
  • [x] Currently the SpanEvent start and end times are relative offsets to the eBPF process time. This is changed in this PR, thereby breaking all other probes.
  • [ ] Do we want to use ptrace from the collector as the serialization format? Do we want to build our own?
  • [ ] This adds more uses of bpf_probe_write_user. Can we use pinned eBPF maps to bypass this and communicate across processes?
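
For the size-limit item in the list above, dropping data in userspace could be sketched as follows. This is not code from the PR: the span struct is hypothetical, encoding/json stands in for the proto encoding when measuring size, and the drop order (events, then links, then attributes) is an arbitrary choice made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maxSize mirrors the 412 limit mentioned in the list above.
const maxSize = 412

// span is a hypothetical, simplified span used only for this sketch.
type span struct {
	Name       string            `json:"name"`
	Attributes map[string]string `json:"attributes,omitempty"`
	Events     []string          `json:"events,omitempty"`
	Links      []string          `json:"links,omitempty"`
}

// size returns the encoded length of s. JSON is a stand-in for proto.
func size(s *span) int {
	b, _ := json.Marshal(s)
	return len(b)
}

// fit drops one event, link, or attribute at a time until the encoded span
// is at most limit. It returns false if the span cannot be made to fit.
func fit(s *span, limit int) bool {
	for size(s) > limit {
		switch {
		case len(s.Events) > 0:
			s.Events = s.Events[:len(s.Events)-1]
		case len(s.Links) > 0:
			s.Links = s.Links[:len(s.Links)-1]
		case len(s.Attributes) > 0:
			// Delete a single, arbitrary attribute.
			for k := range s.Attributes {
				delete(s.Attributes, k)
				break
			}
		default:
			return false // nothing left to drop
		}
	}
	return true
}

func main() {
	s := &span{Name: "demo", Events: make([]string, 200)}
	ok := fit(s, maxSize)
	fmt.Println(ok, size(s) <= maxSize, len(s.Events))
}
```

Re-encoding after every drop is wasteful; a real implementation would more likely track the encoded size incrementally.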

MrAlias, Aug 27 '24 16:08