opentelemetry-go icon indicating copy to clipboard operation
opentelemetry-go copied to clipboard

otlpmetric with StringSlice produces duplicate metrics

Open mmidzik opened this issue 3 years ago • 7 comments

Description

Metrics added with the same key and a StringSlice value are not deduplicated, resulting in multiple identical output series. If running with a prometheus exporter, this causes the exporter to fail with collected metric "metric_name" … was collected before with the same name and label values from this error.

Environment

Running in golang:1.16 image

go.opentelemetry.io/otel v1.9.0
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v0.31.0
go.opentelemetry.io/otel/sdk/metric v0.31.0

Steps To Reproduce

package example

import (
	"context"
	"fmt"
	"testing"
	"time"

	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/stdout/stdoutmetric"
	controller "go.opentelemetry.io/otel/sdk/metric/controller/basic"
	processor "go.opentelemetry.io/otel/sdk/metric/processor/basic"
	"go.opentelemetry.io/otel/sdk/metric/selector/simple"
)

func TestIssue(t *testing.T) {

	exp, err := stdoutmetric.New(stdoutmetric.WithPrettyPrint())
	require.NoError(t, err)

	pusher := controller.New(
		processor.NewFactory(
			simple.NewWithHistogramDistribution(),
			exp,
		),
		controller.WithExporter(exp),
		controller.WithCollectPeriod(time.Second),
	)
	_ = pusher.Start(context.Background())

	Int64Histogram, err := pusher.Meter("test").SyncInt64().Histogram(
		fmt.Sprintf("test_histogram"),
	)
	require.NoError(t, err)

	for i := 0; i < 3; i++ {
		Int64Histogram.Record(context.Background(), 1, attribute.String("foo", "a"))
		Int64Histogram.Record(context.Background(), 1, attribute.StringSlice("bar", []string{"b"}))
	}
	time.Sleep(time.Second * 3)
}

Currently outputs:

[
	{
		"Name": "test_histogram{service.name=unknown_service:example.test,telemetry.sdk.language=go,telemetry.sdk.name=opentelemetry,telemetry.sdk.version=1.9.0,instrumentation.name=test,foo=a}",
		"Sum": 3
	},
	{
		"Name": "test_histogram{service.name=unknown_service:example.test,telemetry.sdk.language=go,telemetry.sdk.name=opentelemetry,telemetry.sdk.version=1.9.0,instrumentation.name=test,bar=[b]}",
		"Sum": 1
	},
	{
		"Name": "test_histogram{service.name=unknown_service:example.test,telemetry.sdk.language=go,telemetry.sdk.name=opentelemetry,telemetry.sdk.version=1.9.0,instrumentation.name=test,bar=[b]}",
		"Sum": 1
	},
	{
		"Name": "test_histogram{service.name=unknown_service:example.test,telemetry.sdk.language=go,telemetry.sdk.name=opentelemetry,telemetry.sdk.version=1.9.0,instrumentation.name=test,bar=[b]}",
		"Sum": 1
	}
]

Expected behavior

I expect the key-values of StringSlice to deduplicate, with the above snippet producing two metrics with a sum of 3. I expect the values of StringSlice to be deduplicated as-is, without sorting (e.g. the array stringified), as they are input as an ordered array.

[
	{
		"Name": "test_histogram{service.name=unknown_service:example.test,telemetry.sdk.language=go,telemetry.sdk.name=opentelemetry,telemetry.sdk.version=1.9.0,instrumentation.name=test,foo=a}",
		"Sum": 3
	},
	{
		"Name": "test_histogram{service.name=unknown_service:example.test,telemetry.sdk.language=go,telemetry.sdk.name=opentelemetry,telemetry.sdk.version=1.9.0,instrumentation.name=test,bar=[b]}",
		"Sum": 3
	},
]

Note: I recognize this might be intentional behavior, but it surprised me / I wasn't able to find documentation around the behavior. If this is not a bug but rather a documentation issue, I'd be happy to submit a related documentation MR.

mmidzik avatar Aug 24 '22 18:08 mmidzik

Thanks for the bug report.

The duplicate metrics are resulting from attribute sets not rendering a truly comparable Distinct value for a slice. For example:

package main

import (
	"fmt"

	"go.opentelemetry.io/otel/attribute"
)

func main() {
	a := attribute.NewSet(attribute.StringSlice("key", []string{"value"}))
	b := attribute.NewSet(attribute.StringSlice("key", []string{"value"}))
	fmt.Println(a.Equivalent() == b.Equivalent())
}

returns false even though the sets are equivalent.

This means that the map key used by the metric instrument does not retrieve existing metrics when they exist.

https://github.com/open-telemetry/opentelemetry-go/blob/569f7430726278f40e562f81e64d21156cc3edb9/sdk/metric/sdk.go#L161

This is almost certainly going to be a problem with the new SDK as well:

https://github.com/open-telemetry/opentelemetry-go/blob/29c9c40ce4eb8f876e7bbb8ef86860d4cebe85b5/sdk/metric/internal/histogram.go#L94

MrAlias avatar Sep 08 '22 16:09 MrAlias

This test passes when run against the main branch:

func TestSetComparability(t *testing.T) {
	pairs := [][2]attribute.KeyValue{
		{
			attribute.Bool("Bool", true),
			attribute.Bool("Bool", true),
		},
		{
			attribute.BoolSlice("BoolSlice", []bool{true, false, true}),
			attribute.BoolSlice("BoolSlice", []bool{true, false, true}),
		},
		{
			attribute.Int("Int", 34),
			attribute.Int("Int", 34),
		},
		{
			attribute.IntSlice("IntSlice", []int{312, 1, -2}),
			attribute.IntSlice("IntSlice", []int{312, 1, -2}),
		},
		{
			attribute.Int64("Int64", 98),
			attribute.Int64("Int64", 98),
		},
		{
			attribute.Int64Slice("Int64Slice", []int64{12, 1298, -219, 2}),
			attribute.Int64Slice("Int64Slice", []int64{12, 1298, -219, 2}),
		},
		{
			attribute.Float64("Float64", 19.09),
			attribute.Float64("Float64", 19.09),
		},
		{
			attribute.Float64Slice("Float64Slice", []float64{12398.1, -37.1713873737, 3}),
			attribute.Float64Slice("Float64Slice", []float64{12398.1, -37.1713873737, 3}),
		},
		{
			attribute.String("String", "string value"),
			attribute.String("String", "string value"),
		},
		{
			attribute.StringSlice("StringSlice", []string{"one", "two", "three"}),
			attribute.StringSlice("StringSlice", []string{"one", "two", "three"}),
		},
		{
			attribute.Stringer("Stringer", stringer("stringer value")),
			attribute.Stringer("Stringer", stringer("stringer value")),
		},
	}

	for _, p := range pairs {
		s0, s1 := attribute.NewSet(p[0]), attribute.NewSet(p[1])
		m := map[attribute.Set]struct{}{s0: {}}
		assert.Containsf(t, m, s1, "%s not comparable", p[0].Value.Type())
	}
}

I wonder if the inequality is coming from the descriptor:

https://github.com/open-telemetry/opentelemetry-go/blob/569f7430726278f40e562f81e64d21156cc3edb9/sdk/metric/sdk.go#L160

or maybe the sort slice being reused?

https://github.com/open-telemetry/opentelemetry-go/blob/569f7430726278f40e562f81e64d21156cc3edb9/sdk/metric/sdk.go#L155

More investigation is needed.

MrAlias avatar Sep 08 '22 16:09 MrAlias

When a direct map look-up is used, the above test fails:

func TestSetComparability(t *testing.T) {
	pairs := [][2]attribute.KeyValue{
		{
			attribute.Bool("Bool", true),
			attribute.Bool("Bool", true),
		},
		{
			attribute.BoolSlice("BoolSlice", []bool{true, false, true}),
			attribute.BoolSlice("BoolSlice", []bool{true, false, true}),
		},
		{
			attribute.Int("Int", 34),
			attribute.Int("Int", 34),
		},
		{
			attribute.IntSlice("IntSlice", []int{312, 1, -2}),
			attribute.IntSlice("IntSlice", []int{312, 1, -2}),
		},
		{
			attribute.Int64("Int64", 98),
			attribute.Int64("Int64", 98),
		},
		{
			attribute.Int64Slice("Int64Slice", []int64{12, 1298, -219, 2}),
			attribute.Int64Slice("Int64Slice", []int64{12, 1298, -219, 2}),
		},
		{
			attribute.Float64("Float64", 19.09),
			attribute.Float64("Float64", 19.09),
		},
		{
			attribute.Float64Slice("Float64Slice", []float64{12398.1, -37.1713873737, 3}),
			attribute.Float64Slice("Float64Slice", []float64{12398.1, -37.1713873737, 3}),
		},
		{
			attribute.String("String", "string value"),
			attribute.String("String", "string value"),
		},
		{
			attribute.StringSlice("StringSlice", []string{"one", "two", "three"}),
			attribute.StringSlice("StringSlice", []string{"one", "two", "three"}),
		},
		{
			attribute.Stringer("Stringer", stringer("stringer value")),
			attribute.Stringer("Stringer", stringer("stringer value")),
		},
	}

	for _, p := range pairs {
		s0, s1 := attribute.NewSet(p[0]), attribute.NewSet(p[1])
		m := map[attribute.Set]struct{}{s0: {}}
		_, ok := m[s1]
		assert.Truef(t, ok, "%s not comparable", p[0].Value.Type())
	}
}
=== RUN   TestSetComparability
    set_test.go:248:
        	Error Trace:	set_test.go:248
        	Error:      	Should be true
        	Test:       	TestSetComparability
        	Messages:   	BOOLSLICE not comparable
    set_test.go:248:
        	Error Trace:	set_test.go:248
        	Error:      	Should be true
        	Test:       	TestSetComparability
        	Messages:   	INT64SLICE not comparable
    set_test.go:248:
        	Error Trace:	set_test.go:248
        	Error:      	Should be true
        	Test:       	TestSetComparability
        	Messages:   	INT64SLICE not comparable
    set_test.go:248:
        	Error Trace:	set_test.go:248
        	Error:      	Should be true
        	Test:       	TestSetComparability
        	Messages:   	FLOAT64SLICE not comparable
    set_test.go:248:
        	Error Trace:	set_test.go:248
        	Error:      	Should be true
        	Test:       	TestSetComparability
        	Messages:   	STRINGSLICE not comparable
--- FAIL: TestSetComparability (0.00s)
FAIL
FAIL	go.opentelemetry.io/otel/attribute	0.004s
FAIL

MrAlias avatar Sep 08 '22 16:09 MrAlias

I believe this can be solved inside the attribute package.

jmacd avatar Sep 08 '22 18:09 jmacd

I can't prioritize working on this right away. In case someone else wants to pick this up, here's roughly how the solution looks for just one type of slice (out of 5).

https://github.com/jmacd/opentelemetry-go/pull/new/jmacd/sampleslicesupport

diff --git a/attribute/value.go b/attribute/value.go
index 57899f68..1de57075 100644
--- a/attribute/value.go
+++ b/attribute/value.go
@@ -17,6 +17,7 @@ package attribute // import "go.opentelemetry.io/otel/attribute"
 import (
 	"encoding/json"
 	"fmt"
+	"reflect"
 	"strconv"
 
 	"go.opentelemetry.io/otel/internal"
@@ -66,11 +67,11 @@ func BoolValue(v bool) Value {
 
 // BoolSliceValue creates a BOOLSLICE Value.
 func BoolSliceValue(v []bool) Value {
-	cp := make([]bool, len(v))
-	copy(cp, v)
+	cp := reflect.New(reflect.ArrayOf(len(v), reflect.TypeOf(false)))
+	copy(cp.Elem().Slice(0, len(v)).Interface().([]bool), v)
 	return Value{
 		vtype: BOOLSLICE,
-		slice: &cp,
+		slice: cp.Elem().Interface(), // No longer a pointer
 	}
 }
 
@@ -159,10 +160,22 @@ func (v Value) AsBool() bool {
 // AsBoolSlice returns the []bool value. Make sure that the Value's type is
 // BOOLSLICE.
 func (v Value) AsBoolSlice() []bool {
-	if s, ok := v.slice.(*[]bool); ok {
-		return *s
+	rv := reflect.ValueOf(v.slice)
+	if rv.Type().Kind() != reflect.Array {
+		// it's not a slice
+		return nil
 	}
-	return nil
+	correctLen := rv.Len()
+	correctType := reflect.ArrayOf(correctLen, reflect.TypeOf(false))
+	if rv.Type() != correctType {
+		// it's a slice of the wrong kind
+		return nil
+	}
+	// we can't slice it yet, it's not addressable (by design,
+	// otherwise it wouldn't satisfy the requirements of Set).
+	cpy := reflect.New(correctType)
+	_ = reflect.Copy(cpy.Elem(), rv)
+	return cpy.Elem().Slice(0, correctLen).Interface().([]bool)
 }
 
 // AsInt64 returns the int64 value. Make sure that the Value's type is

As you can see, this is difficult to read. Potentially a generics-based solution would be easier to accept?

jmacd avatar Sep 08 '22 21:09 jmacd

Performance note: implied by the above, we have to copy the slice to read it. This isn't great. We could offer a different API to access the i'th element of a slice-valued attribute instead, that would allow the consumer of a Value to avoid copying.

jmacd avatar Sep 08 '22 21:09 jmacd

I can't prioritize working on this right away. In case someone else wants to pick this up, here's roughly how the solution looks for just one type of slice (out of 5).

https://github.com/jmacd/opentelemetry-go/pull/new/jmacd/sampleslicesupport

diff --git a/attribute/value.go b/attribute/value.go
index 57899f68..1de57075 100644
--- a/attribute/value.go
+++ b/attribute/value.go
@@ -17,6 +17,7 @@ package attribute // import "go.opentelemetry.io/otel/attribute"
 import (
 	"encoding/json"
 	"fmt"
+	"reflect"
 	"strconv"
 
 	"go.opentelemetry.io/otel/internal"
@@ -66,11 +67,11 @@ func BoolValue(v bool) Value {
 
 // BoolSliceValue creates a BOOLSLICE Value.
 func BoolSliceValue(v []bool) Value {
-	cp := make([]bool, len(v))
-	copy(cp, v)
+	cp := reflect.New(reflect.ArrayOf(len(v), reflect.TypeOf(false)))
+	copy(cp.Elem().Slice(0, len(v)).Interface().([]bool), v)
 	return Value{
 		vtype: BOOLSLICE,
-		slice: &cp,
+		slice: cp.Elem().Interface(), // No longer a pointer
 	}
 }
 
@@ -159,10 +160,22 @@ func (v Value) AsBool() bool {
 // AsBoolSlice returns the []bool value. Make sure that the Value's type is
 // BOOLSLICE.
 func (v Value) AsBoolSlice() []bool {
-	if s, ok := v.slice.(*[]bool); ok {
-		return *s
+	rv := reflect.ValueOf(v.slice)
+	if rv.Type().Kind() != reflect.Array {
+		// it's not a slice
+		return nil
 	}
-	return nil
+	correctLen := rv.Len()
+	correctType := reflect.ArrayOf(correctLen, reflect.TypeOf(false))
+	if rv.Type() != correctType {
+		// it's a slice of the wrong kind
+		return nil
+	}
+	// we can't slice it yet, it's not addressable (by design,
+	// otherwise it wouldn't satisfy the requirements of Set).
+	cpy := reflect.New(correctType)
+	_ = reflect.Copy(cpy.Elem(), rv)
+	return cpy.Elem().Slice(0, correctLen).Interface().([]bool)
 }
 
 // AsInt64 returns the int64 value. Make sure that the Value's type is

As you can see, this is difficult to read. Potentially a generics-based solution would be easier to accept?

I was able to validate this passes the above test. :+1:

MrAlias avatar Sep 08 '22 22:09 MrAlias

can i take this issue ? @jmacd @MrAlias

fatsheep9146 avatar Sep 21 '22 11:09 fatsheep9146

@fatsheep9146 yes, please!

jmacd avatar Sep 29 '22 17:09 jmacd