Promscale as a Jaeger certified storage backend
Add integration tests for Promscale to upstream Jaeger so that we can claim Promscale meets all the expectations.
EDIT
Summary
Storage integration test failures can be broadly classified into the following categories:
- [x] Get Trace with >10K spans in a single query - https://github.com/timescale/promscale/issues/1621#issuecomment-1238070108 (fixed by https://github.com/jaegertracing/jaeger/issues/3912)
- [x] Empty span name - https://github.com/timescale/promscale/issues/1621#issuecomment-1239182178
- [ ] Binary tags are returned as String tags - Jaeger to Otel translator encodes binary value with base64 - https://github.com/timescale/promscale/issues/1621#issuecomment-1239054206
- [x] Span x Event table join issue - https://github.com/timescale/promscale/issues/1621#issuecomment-1239073939
- [ ] Query a tag which is spread across multiple events within a single trace/span - https://github.com/timescale/promscale/issues/1621#issuecomment-1240612369
Steps to run the integration tests
- Run Promscale
- Apply the patch https://github.com/jaegertracing/jaeger/compare/main...arajkumar:jaeger:promscale-storage-int-tests.diff to the Jaeger repo
- Run the following from the Jaeger repo:
```sh
go clean -testcache && (cd examples/memstore-plugin/ && go build .) && STORAGE=grpc-plugin go test ./plugin/storage/integration -tags grpc_storage_integration -run="TestExternalGRPCStorage" -v
```
With a little bit of tinkering, I was able to run the upstream Jaeger integration tests:
```go
t.Run("GetServices", s.testGetServices)
t.Run("GetOperations", s.testGetOperations)
t.Run("GetTrace", s.testGetTrace)
t.Run("GetLargeSpans", s.testGetLargeSpan)
t.Run("FindTraces", s.testFindTraces)
t.Run("GetDependencies", s.testGetDependencies)
```
In the above list, the first 3 are passing (with a little modification), but the other tests will need some changes in Promscale.
t.Run("GetLargeSpans", s.testGetLargeSpan)
The above test ingests 10K spans under a single trace ID and retrieves them in a single GetTraces operation.
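As a sanity check on the ingestion side, the stored span count for that trace can be inspected directly in the DB. This query is purely illustrative and not part of the test; the trace ID is the one that shows up in the _ps_trace.span output later in this issue.

```sql
-- Hypothetical sanity check: count the rows stored for the large trace.
SELECT count(*)
FROM _ps_trace.span
WHERE trace_id = '00000000-0000-0000-0000-000000000011';
```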
Test execution summary:
--- FAIL: TestExternalGRPCStorage (212.15s)
--- PASS: TestExternalGRPCStorage/GetServices (0.21s)
--- PASS: TestExternalGRPCStorage/GetOperations (0.00s)
--- PASS: TestExternalGRPCStorage/GetTrace (0.01s)
--- PASS: TestExternalGRPCStorage/GetTrace/NotFound_error (0.00s)
--- FAIL: TestExternalGRPCStorage/FindTraces (211.93s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_in_one_spot_-_Tags (10.72s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_in_one_spot_-_Logs (10.70s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_in_one_spot_-_Process (10.64s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_in_different_spots (10.64s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Trace_spans_over_multiple_indices (10.39s)
--- PASS: TestExternalGRPCStorage/FindTraces/Operation_name (0.01s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Operation_name_+_max_Duration (10.43s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Operation_name_+_Duration_range (10.37s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Duration_range (10.37s)
--- FAIL: TestExternalGRPCStorage/FindTraces/max_Duration (10.40s)
--- FAIL: TestExternalGRPCStorage/FindTraces/default (10.40s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_+_Operation_name (10.69s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_+_Operation_name_+_max_Duration (10.74s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_+_Operation_name_+_Duration_range (0.02s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_+_Duration_range (10.81s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Tags_+_max_Duration (10.69s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Operation_name (10.81s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Operation_name_+_max_Duration (10.76s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Operation_name_+_Duration_range (10.70s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Duration_range (10.70s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_max_Duration (10.56s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multiple_Traces (10.39s)
--- SKIP: TestExternalGRPCStorage/GetDependencies (0.00s)
FAIL: TestExternalGRPCStorage/FindTraces/Tags_in_one_spot_-_Tags (10.72s)
The current find-trace query does an INNER JOIN between the span and event tables, which fails to find traces whose spans don't have any events. We must use a LEFT JOIN instead.
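For illustration, a minimal sketch of the join change (the real query is generated by Promscale and selects far more columns):

```sql
-- Sketch only: a LEFT JOIN keeps spans that have no events,
-- whereas the current INNER JOIN drops them from the result.
SELECT s.trace_id, s.span_id, e.name AS event_name
FROM _ps_trace.span s
LEFT JOIN _ps_trace.event e
       ON e.trace_id = s.trace_id
      AND e.span_id  = s.span_id;
```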
EDIT:
Even after the LEFT JOIN fix, this test still fails because of a diff in the JSON output.
The span has an attribute of type "BINARY", which gets translated into a base64-encoded string[1] when stored in the DB; when Jaeger queries it back, it is returned as a string tag.
[1] https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/84ec43271330c6e1bfaf5c1483042de04798fd10/pkg/translator/jaeger/jaegerproto_to_traces.go#L263
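For illustration only, using PostgreSQL's built-in encode function rather than the translator code, this is roughly what the encoding does to a binary value:

```sql
-- A 3-byte binary value becomes the opaque string 'AQID'; once stored this
-- way it is indistinguishable from a tag that was a string to begin with.
SELECT encode('\x010203'::bytea, 'base64');  -- returns 'AQID'
```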
When I use a LEFT JOIN to merge the event and span tables, I see the following tests passing additionally, but the problem listed above remains for the other tests:
TestExternalGRPCStorage/FindTraces/Tags_in_different_spots
TestExternalGRPCStorage/FindTraces/Operation_name_+_Duration_range
TestExternalGRPCStorage/FindTraces/Duration_range
TestExternalGRPCStorage/FindTraces/max_Duration
A few more pass after adding a second of delay in the grpc_test Refresh function. Looks like a bug somewhere in the gRPC plugin implementation (note: tracing.async-acks is not enabled).
--- PASS: TestExternalGRPCStorage/FindTraces/Tags_in_different_spots (0.00s)
--- PASS: TestExternalGRPCStorage/FindTraces/Trace_spans_over_multiple_indices (0.00s)
--- PASS: TestExternalGRPCStorage/FindTraces/Operation_name (0.00s)
--- PASS: TestExternalGRPCStorage/FindTraces/Operation_name_+_max_Duration (0.00s)
--- PASS: TestExternalGRPCStorage/FindTraces/Operation_name_+_Duration_range (0.11s)
--- PASS: TestExternalGRPCStorage/FindTraces/Duration_range (0.11s)
--- PASS: TestExternalGRPCStorage/FindTraces/max_Duration (0.11s)
EDIT:
It uses the streaming span writer, which is async.
--- FAIL: TestExternalGRPCStorage/FindTraces/default (10.40s)
This one fails when the operationName field of the span is empty.
EDIT:
The following error is thrown when OperationName is empty:
ERROR: new row for relation "operation" violates check constraint "operation_span_name_check" (SQLSTATE 23514)
EDIT:
The operation table has a check constraint on the span_name column which prevents inserting empty values.
EDIT:
I can think of 2 ways to fix the problem:
- OTLP doesn't really mandate a non-empty value for the span name; it is only semantically required, and an empty value is equivalent to an unknown span name. Hence we can drop the constraint on the DB side (see the sketch after this list).
- In Promscale, set a default value if the span name is empty and strip that default from the response.
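A minimal sketch of the first option, assuming the table lives in the _ps_trace schema (the constraint name is taken from the error above; the schema name is an assumption):

```sql
-- Sketch only: relax the check so spans with empty names can be ingested.
ALTER TABLE _ps_trace.operation
  DROP CONSTRAINT operation_span_name_check;
```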
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Operation_name (10.81s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Operation_name_+_max_Duration (10.76s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Operation_name_+_Duration_range (10.70s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_Duration_range (10.70s)
--- FAIL: TestExternalGRPCStorage/FindTraces/Multi-spot_Tags_+_max_Duration (10.56s)
These tests have a tag query across span, resource and event. They fail when a span has 2 events with 2 distinct tags and both tags are queried at the same time.
e.g.
| trace_id | span_id | event_name | tag |
|---|---|---|---|
| 0001 | 0007 | event1 | foo=foo |
| 0001 | 0007 | event2 | bar=bar |
When you query foo=foo bar=bar, Promscale generates the following WHERE clause, which returns an empty result because a single event row can never satisfy both conditions:
e.tags @> ANY($3::jsonb[]) AND e.tags @> ANY($5::jsonb[])
It also depends on the fix from https://github.com/timescale/promscale/issues/1621#issuecomment-1239054206
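One possible shape of a fix, sketched with the same bind parameters as the generated clause above (this is not the actual query-generator output): require each queried tag to match on some event of the candidate span, instead of requiring a single event row to contain all of them.

```sql
-- Sketch only: one EXISTS per queried event tag, so foo=foo and bar=bar
-- may be satisfied by two different events of the same span.
SELECT s.trace_id, s.span_id
FROM _ps_trace.span s
WHERE EXISTS (SELECT 1
              FROM _ps_trace.event e
              WHERE e.trace_id = s.trace_id
                AND e.span_id  = s.span_id
                AND e.tags @> ANY($3::jsonb[]))   -- first tag condition (foo=foo)
  AND EXISTS (SELECT 1
              FROM _ps_trace.event e
              WHERE e.trace_id = s.trace_id
                AND e.span_id  = s.span_id
                AND e.tags @> ANY($5::jsonb[]));  -- second tag condition (bar=bar)
```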
The problem with GetLargeSpans is the array_agg. The same span is sent 10008 times in the test, which means we end up with 10008 entries in _ps_trace.span that are identical:
promscale=# select * from _ps_trace.span;
─[ RECORD 1 ]───────────────┬─────────────────────────────────────
trace_id │ 00000000-0000-0000-0000-000000000011
span_id │ 3
parent_span_id │ ∅
operation_id │ 1
start_time │ 2022-09-12 19:03:11.639875+02
end_time │ 2022-09-12 19:03:11.639975+02
duration_ms │ 0.1
instrumentation_lib_id │ ∅
resource_schema_url_id │ ∅
event_time │ empty
dropped_tags_count │ 0
dropped_events_count │ 0
dropped_link_count │ 0
resource_dropped_tags_count │ 0
status_code │ unset
trace_state │ ∅
span_tags │ {}
status_message │
resource_tags │ {"1": 1}
═[ RECORD 2 ]═══════════════╪═════════════════════════════════════
trace_id │ 00000000-0000-0000-0000-000000000011
span_id │ 3
parent_span_id │ ∅
operation_id │ 1
start_time │ 2022-09-12 19:03:22.639875+02
end_time │ 2022-09-12 19:03:22.639975+02
duration_ms │ 0.1
instrumentation_lib_id │ ∅
resource_schema_url_id │ ∅
event_time │ empty
dropped_tags_count │ 0
dropped_events_count │ 0
dropped_link_count │ 0
resource_dropped_tags_count │ 0
status_code │ unset
trace_state │ ∅
span_tags │ {}
status_message │
resource_tags │ {"1": 1}
If you look at the span fixture, it has 2 log entries that are empty except for the timestamp; the translation between Jaeger and OTEL stores these log entries as events. Looking at the _ps_trace.event table, we get the same 2 events duplicated 10008 times each (the same number of spans that were sent).
promscale=# select count(*) from _ps_trace.event;
count
═══════
20016
promscale=# select * from _ps_trace.event limit 4;
time │ trace_id │ span_id │ event_nbr │ dropped_tags_count │ name │ tags
═══════════════════════════════╪══════════════════════════════════════╪═════════╪═══════════╪════════════════════╪══════╪══════
2022-09-12 18:46:31.639875+02 │ 00000000-0000-0000-0000-000000000011 │ 3 │ 0 │ 0 │ │ {}
2022-09-12 18:46:31.639875+02 │ 00000000-0000-0000-0000-000000000011 │ 3 │ 1 │ 0 │ │ {}
2022-09-12 18:46:31.639875+02 │ 00000000-0000-0000-0000-000000000011 │ 3 │ 0 │ 0 │ │ {}
2022-09-12 18:46:31.639875+02 │ 00000000-0000-0000-0000-000000000011 │ 3 │ 1 │ 0 │ │ {}
When we try to get the trace, we use array_agg to collect the events; since all the spans have the same values, the query aggregates all 20016 events onto every row.
─[ RECORD 1 ]────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
trace_id │ 00000000-0000-0000-0000-000000000011
span_id │ 3
parent_span_id │ ∅
start_times │ 2022-09-12 18:46:31.639875+02
end_times │ 2022-09-12 18:46:31.639975+02
span_kind │ unspecified
dropped_tags_counts │ 0
dropped_events_counts │ 0
dropped_link_counts │ 0
status_code_string │ unset
status_message │
trace_states │ ∅
schema_urls │ ∅
span_names │ example-operation-1
resource_tags │ {"service.name": "example-service-1"}
span_tags │ ∅
event_names │ {"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""
event_times │ {"2022-09-12 18:46:31.639875+02","2022-09-12 18:46:31.639875+02","2022-09-12 18:46:31.639875+02","2022-09-12 18:46:31.639875+02","2022-09-12 18:46:31.639875+02","2022-09-12 18:46:31.639875+02","2022-09-12 18:46:31
event_dropped_tags_count │ {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
event_tags │ {NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NU
library_name │ ∅
library_version │ ∅
library_schema_url │ ∅
links_linked_trace_ids │ ∅
links_linked_span_ids │ ∅
links_trace_states │ ∅
links_dropped_tags_count │ ∅
links_tags │ ∅
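To see the fan-out in isolation, here is a minimal, self-contained reproduction using throwaway demo tables (not the real schema or the real query):

```sql
-- Throwaway demo: the same span stored twice, so its 2 events are stored
-- twice as well (4 event rows in total).
CREATE TEMP TABLE demo_span  (trace_id int, span_id int);
CREATE TEMP TABLE demo_event (trace_id int, span_id int, name text);
INSERT INTO demo_span  VALUES (1, 3), (1, 3);
INSERT INTO demo_event VALUES (1, 3, 'e1'), (1, 3, 'e2'),
                              (1, 3, 'e1'), (1, 3, 'e2');

-- Aggregating events per span key: every duplicated span row collects every
-- duplicated event row, i.e. 2 result rows with 4 event names each. Scaled
-- up, this is how each returned span ends up carrying all 20016 events.
SELECT s.trace_id, s.span_id, ev.event_names
FROM demo_span s
LEFT JOIN LATERAL (
    SELECT array_agg(e.name) AS event_names
    FROM demo_event e
    WHERE e.trace_id = s.trace_id
      AND e.span_id  = s.span_id
) ev ON true;
```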
Related to GetLargeSpans
If I give each span in the test a unique ID, the test passes. Maybe we can add this change to the test. We can use the following arguments to argue that the current test is not very representative of a real scenario:
The OpenTelemetry spec defines the SpanID as:
SpanId is the identifier for a span. It is globally unique with practically sufficient probability by being made as 8 randomly generated bytes. When passed to a child Span this identifier becomes the parent span id for the child Span
The Jaeger Thrift model states:
3: required i64 spanId # unique span id (only unique within a given trace)
Also, Jaeger has a deduper for non-unique span IDs, because, as the documentation says:
Jaeger UI expects all spans to have unique IDs.
I guess the other option is handling duplicates ourselves.
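A rough sketch of what handling duplicates on our side could look like (this is not the actual Promscale query; it simply collapses exact key duplicates via DISTINCT ON, using the columns shown in the outputs above):

```sql
-- Sketch only: keep one row per (trace_id, span_id) and one row per
-- (trace_id, span_id, event_nbr) before aggregating the events.
SELECT s.trace_id,
       s.span_id,
       array_agg(e.name ORDER BY e.event_nbr) AS event_names
FROM (
    SELECT DISTINCT ON (trace_id, span_id) *
    FROM _ps_trace.span
    ORDER BY trace_id, span_id, start_time
) s
LEFT JOIN (
    SELECT DISTINCT ON (trace_id, span_id, event_nbr) *
    FROM _ps_trace.event
    ORDER BY trace_id, span_id, event_nbr, time
) e ON e.trace_id = s.trace_id
   AND e.span_id  = s.span_id
GROUP BY s.trace_id, s.span_id;
```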
GetLargeSpans fixed by https://github.com/jaegertracing/jaeger/issues/3912
The problems with TestExternalGRPCStorage/FindTraces/Tags_+_Operation_name are:
Spans in the fixture have different trace_ids:
This test sends a span (a) from a trace and has 3 references to 2 other spans (b and c):
- a CHILD_OF b
- a CHILD_OF c
- a FOLLOWS_FROM c
https://github.com/jaegertracing/jaeger/blob/cac21f82950545b613f45d3984d96e1498bfd15d/plugin/storage/integration/fixtures/traces/tags_opname_trace.json#L4-L23
The problem is with the child relationship and the fact that each span has a different traceId. This breaks the parent relationship: when the translator tries to get the parent span, it cannot find a match, because all of the spans have different trace IDs and the ParentSpanID method makes the following assertion: ref.TraceID == s.TraceID && ref.RefType == ChildOf:
https://github.com/jaegertracing/jaeger/blob/cac21f82950545b613f45d3984d96e1498bfd15d/model/span.go#L99-L109
I think for this point we can propose a fix to the test: like we did for GetLargeSpans, propose a more realistic scenario.
OTEL doesn't have a concept of refType
The OTEL model doesn't define a span reference type (in the case of OTEL it would be a link). When translating from OTEL to Jaeger, the parent span is assigned the CHILD_OF refType, and all other links are assigned FOLLOWS_FROM:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/7b5eb5280c15a5cc195d68757ba4c24c2c7f8eba/pkg/translator/jaeger/traces_to_jaegerproto.go#L260-L281
They explicitly say:
Since Jaeger RefType is not captured in internal data, use SpanRefType_FOLLOWS_FROM by default. SpanRefType_CHILD_OF supposed to be set only from parentSpanID.
This is a problem for the test because the translator sets only one CHILD_OF while the test expects 2 in the result.
The expected result based on the fixture is:
"references": [
{
"trace_id": "AAAAAAAAAAAAAAAAAAAAEg==",
"span_id": "AAAAAAAAAP8="
},
{
"trace_id": "AAAAAAAAAAAAAAAAAAAAEg==",
"span_id": "AAAAAAAAAAI="
},
{
"trace_id": "AAAAAAAAAAAAAAAAAAAAAQ==",
"span_id": "AAAAAAAAAAI=",
"ref_type": 1
}
],
The first 2 are ref_type=0 which is CHILD_OF.
The actual result we get, because the translator sets only one CHILD_OF, is:
"references": [
{
"trace_id": "AAAAAAAAAAAAAAAAAAAAEg==",
"span_id": "AAAAAAAAAP8="
},
{
"trace_id": "AAAAAAAAAAAAAAAAAAAAEg==",
"span_id": "AAAAAAAAAAI=",
"ref_type": 1
},
{
"trace_id": "AAAAAAAAAAAAAAAAAAAAAQ==",
"span_id": "AAAAAAAAAAI=",
"ref_type": 1
}
],
As you can see, since the translator takes the first CHILD_OF as the span's parent ID, all the other relationships become FOLLOWS_FROM.
Again, we can ask for a more realistic test which sets only one parent for the span.
Aggregates for links are not returning data
The third problem is that the query doesn't seem to be returning all the links properly. The aggregate function's FILTER clause doesn't seem to be working correctly.
We are doing FILTER(WHERE lk IS NOT NULL), but that seems to be filtering out everything. Maybe the WHERE clause requires an explicit column:
SELECT
s.trace_id,
s.span_id,
array_agg(lk.linked_trace_id ORDER BY lk.link_nbr) FILTER(WHERE lk.trace_id IS NOT NULL) links_lkTraceId_not_null,
array_agg(lk.linked_trace_id ORDER BY lk.link_nbr) FILTER(WHERE lk IS NOT NULL) links_lk_not_null
FROM
_ps_trace.span s
LEFT JOIN
_ps_trace.link lk ON lk.trace_id = s.trace_id AND lk.span_id = s.span_id
GROUP BY
s.trace_id,
s.span_id;
trace_id │ span_id │ links_lktraceid_not_null │ links_lk_not_null
══════════════════════════════════════╪═════════╪════════════════════════════════════════╪═══════════════════
00000000-0000-0000-0000-000000000001 │ 4 │ {00000000-0000-0000-0000-000000000001} │ ∅
(1 row)
Time: 101.343 ms
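A likely explanation, based on PostgreSQL's row-comparison semantics rather than anything Promscale-specific: applied to a whole row, IS NOT NULL is true only when every column of that row is non-null, so a matched link row that has any NULL column gets filtered out, while filtering on an explicit non-null column behaves as expected.

```sql
-- Row-wise NULL tests in PostgreSQL: a row mixing NULL and non-NULL fields
-- is neither IS NULL nor IS NOT NULL.
SELECT ROW(1, NULL) IS NOT NULL;  -- false
SELECT ROW(1, NULL) IS NULL;      -- false
```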
GetLargeSpans fixed by jaegertracing/jaeger#3912
The GetLargeSpans test still fails when it is run together with all the other tests, specifically GetTrace. Here is a test summary:
/plugin/storage/integration/integration_test.go:133
Error: Not equal:
expected: 10008
actual : 10012
Test: TestExternalGRPCStorage/GetLargeSpans
The actual problem is that both GetLargeSpans and GetTrace ingest test traces with the same trace ID and span ID, and we end up getting more than the expected number of results.
Hmm, the expectation is to run tests in isolation. See https://github.com/jaegertracing/jaeger/blob/4fc291568d8ac59a1c67cc47ee1d91ab20dd06c4/plugin/storage/integration/integration_test.go#L119
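For context on what such isolation implies on the storage side (a hedged sketch, not the actual change in the commit linked below): each test would need to start from an empty store, e.g. by truncating the trace tables named elsewhere in this issue between tests.

```sql
-- Sketch only: wipe the trace tables so each integration test starts from a
-- clean slate; CASCADE covers any foreign-key dependencies between them.
TRUNCATE _ps_trace.span, _ps_trace.event, _ps_trace.link CASCADE;
```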
Fixed in https://github.com/timescale/promscale/pull/1627/commits/f02d9a8c3ebc3efc3619d89931edee69bba74824
All tests are passing now.