spanner: panic in recordGFELatencyMetricsOT if EnableOpenTelemetryMetrics called after NewClient
Client
Spanner
Environment
e.g. Alpine Docker on GKE
Go Environment
go1.22.0
Code
When enabling open telemetry after creating the client the configuration set on the client is invalid resulting in potential panic.
e.g.
package main
func main() {
// ...
client, err := gcspanner.NewClient(
ctx,
db,
opts...,
)
if err != nil {
panic(err)
}
initOnce.Do(func() {
if os.Getenv("SPANNER_EMULATOR_HOST") == "" {
spanner.EnableOpenTelemetryMetrics()
}
})
// ...
}
In the above example the new client created would have a default openTelemetryConfig with all instruments being nil possibly leading to a panic during a call to recordGFELatencyMetricsOT (see bellow with added comments)
func recordGFELatencyMetricsOT(ctx context.Context, md metadata.MD, keyMethod string, otConfig *openTelemetryConfig) error {
if !IsOpenTelemetryMetricsEnabled() || md == nil && otConfig == nil { // the global flag is enable thus we don't return nil
return nil
}
attr := otConfig.attributeMap
if len(md.Get("server-timing")) == 0 && otConfig.gfeHeaderMissingCount != nil { // at this point otConfig.gfeHeaderMissingCount is nil
otConfig.gfeHeaderMissingCount.Add(ctx, 1, metric.WithAttributes(attr...))
return nil
}
serverTiming := md.Get("server-timing")[0] // This will panic !
gfeLatency, err := strconv.Atoi(strings.TrimPrefix(serverTiming, "gfet4t7; dur="))
if !strings.HasPrefix(serverTiming, "gfet4t7; dur=") || err != nil {
return err
}
attr = append(attr, attributeKeyMethod.String(keyMethod))
if otConfig.gfeLatency != nil {
otConfig.gfeLatency.Record(ctx, int64(gfeLatency), metric.WithAttributes(attr...))
}
return nil
}
Expected behavior
- The EnableOpenTelemetryMetrics function should mention that it needs to be registered prior to enabling the client
- recordGFELatencyMetricsOT should have appropriate checks to handle reading from metadata while avoiding panics
Actual behavior
- EnableOpenTelemetryMetrics doens't mention that it should be called prior to instantiating a new client
- recordGFELatencyMetricsOT panics in some specific conditions
Additional context
e.g. Started after upgrading to v1.57.0 and using the EnableOpenTelemetryMetrics function as shown above
Thanks @henripqt Its known and expected that sometimes response metadata header is missing, and below lines will fail if that's the case
if len(md.Get("server-timing")) == 0 && otConfig.gfeHeaderMissingCount != nil { // at this point otConfig.gfeHeaderMissingCount is nil
otConfig.gfeHeaderMissingCount.Add(ctx, 1, metric.WithAttributes(attr...))
return nil
}
I have raised a PR which will initialize all the configs related to open telemetry by default irrespective of whether the Opentelemetry is enabled or not that will take care of this scenario
Thanks