Maturity for OTLP/JSON traces
I wanted to see what plans, if any, exist around classifying OTLP/JSON as stable for tracing. Is this something that would be possible to do with the tracing GA? If not, what would be the reservations or blockers?
from the issue triage mtg today, discussed this and triaging as P1 to decide whether the JSON portion of trace for OTLP is required for ga
I believe this was discussed in the past and decided that it is not a requirement for GA. I think we can stay with that decision unless there is any evidence that suggests OTLP/JSON is a necessary for GA.
Not knowing a use case but knowing that it severely limits our possibilities to change stuff around that would be wire-compatible for the binary serialization (e.g. attribute names, string vs bytes, submessage vs bytes, possiblity moving stuff into oneOf), I would even suggest to keep JSON permanently in alpha or beta. If one wants a stable protocol, one should use the binary protobuf format.
One complication with this approach, is that exporting from browser requires OTLP/JSON. There are also users that prefer JSON encoding over protobuf. See: https://github.com/open-telemetry/opentelemetry-specification/issues/678.
As another data point. To accommodate various user needs we have multiple versions of the collector exporter for JS:
- OTLP/JSON: https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-exporter-collector
- OTLP/Proto over HTTP: https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-exporter-collector-proto
- OTLP/gRPC: https://github.com/open-telemetry/opentelemetry-js/tree/master/packages/opentelemetry-exporter-collector-grpc
OTLP/JSON can be used from browser or node, but proto over HTTP and gRPC can only be used with node.
Thanks @mwear for the perspective. I think it is important to have OTLP/JSON declared stable. It may take a bit more time because it places more restrictions than Protobuf does, but I think we should do it eventually.
We are going to remove this as a GA requirement. This effectively bounces the JS browser SDK from GA as well, so we will work on this as our top tracing priority post-GA.
I don't have permissions to update the tags. @tedsuo do you?
Based on the maintainers call today, we have the following suggestions to make OTLP/JSON stable:
- Verify that current keys are JSON friendly.
- Ensure that we have a version header or another mechanism for receiving multiple versions of OTLP/JSON in the collector
- Agree to no longer use the deprecated delimiter in proto, as this breaks JSON.
- Verify that the current JSON protocol meets our needs.
- Implement JSON in two other languages to ensure we are satisfied.
We agreed to release v1.0 and GA tracing without this support finished, but we need to follow up on this as it effects how we approach protobuf development. To ensure our OTel Browser users have a goos experience in the meantime, I suggest that we avoid proto changes which would break JSON until this work is resolved.
Great suggestions, thanks.
Ensure that we have a version header or another mechanism for receiving multiple versions of OTLP/JSON in the collector
The thinking so far was that the major version is defined by the URL (/v1 in the OTLP/HTTP URLs). The minor versions should be backwards compatible in the same way Protobufs are compatible and should not require explicit version number to be in the payload. We can discuss if we need a different approach for versioning of OTLP/JSON.
This depends on https://github.com/open-telemetry/opentelemetry-proto/issues/268
Based on the maintainers call today, we have the following suggestions to make OTLP/JSON stable:
Verify that current keys are JSON friendly. Ensure that we have a version header or another mechanism for receiving multiple versions of OTLP/JSON in the collector Agree to no longer use the deprecated delimiter in proto, as this breaks JSON.
I just had a discussion with the team tech leads of our company. Our goal is to standardize the logs our microservices emit via stdout. We already agreed to use JSON and luckily came across OTEL. We also agreed, that OTEL and its semantic log data model are very promising. However, regarding the JSON format, which we know is in alpha, we see possible obstacles:
- The attributes names are rather long for example "severityNumber".
- Some of the attribute names contain dots. For example "http.url". This causes problems with system that use the dot as a separation for nested objects. For example for the navigation to a nested object. A possible solution would be to actually use nested objects or to use lower camel case "httpStatusCode".
- The timestamp attribute is defined as "uint64 nanoseconds since Unix epoch". This is not really readable if you encounter a raw log message for example in a intermediate file. RFC3339 would be a possible improvement and quite common for JSON. BTW uint64 values are troublesome in JSON. cf. https://github.com/open-telemetry/opentelemetry-proto/issues/268
- I would also vote for a version field in the logmessage because everything is in flow right now. And if raw JSON logstatements with different formats are mixed you don't know how to parse them. Perhaps you don't even know if they use otel at all. So something like
"schema":"otel0.0.1"might be helpful.
@mtestrot thanks for posting your requirements.
FYI, there is also a separate spec issue open to define a file format for OTLP data. The stdout format that you are looking for is more similar to the file format. The JSON format that this repo defines is for the OTLP network protocol (which includes extra things like request/response message, that do not necessarily make sense for files). The file and network formats are obviously related so most likely they will use the same format for the data.
Some of the attribute names contain dots. For example "http.url". This causes problems with system that use the dot as a separation for nested objects. For example for the navigation to a nested object. A possible solution would be to actually use nested objects or to use lower camel case "httpStatusCode".
This should not be a problem. in OTLP/JSON attribute names currently are not represented by object keys, they are string values, e.g. this is how a Resource with an attribute looks like:
"resource": {
"attributes": [
{
"key": "host.name",
"value": { "stringValue": "testHost" }
}
]
}
@tigrannajaryan thx for your quick answer. Could you share a link to the place where the file format for OLTP Data is discussed?
This should not be a problem. in OTLP/JSON attribute names currently are not represented by object keys, they are string values
I'm sure there is a good reason for this but I just don't see it. Why not just use JSON attributes?
"resource": {
"attributes":
{
"hostName": "testHost"
}
}
I really would like to understand the design decision.
@tigrannajaryan thx for your quick answer. Could you share a link to the place where the file format for OLTP Data is discussed?
Here https://github.com/open-telemetry/opentelemetry-specification/issues/1443
This should not be a problem. in OTLP/JSON attribute names currently are not represented by object keys, they are string values
I'm sure there is a good reason for this but I just don't see it. Why not just use JSON attributes? I really would like to understand the design decision.
The OTLP/JSON format is not manually designed (wiht minor exceptions which are not relevant to this discussions). What we get in OTLP/JSON is a consequence of 2 things:
- Attributes in OTLP Protobufs are defined as a list of Key/Value pairs, where Key and Value are just members of a KeyValue message. This is driven by general OpenTelemetry data model for attributes and by performance considerations for OTLP protocol.
- How Protobuf JSON encoding is defined for messages, which turns the Protobuf KeyValue message into a JSON object with 2 properties ("key" and "value"). This is not our decision, it is the behavior of Protofub JSON encoding.
Thanks a lot. This clarifies things 👍
Also somewhat related to https://github.com/open-telemetry/opentelemetry-specification/issues/1443 which also aims to define a JSON-based format to be stored in files. We don't want OTLP/JSON network protocol and OTLP/JSON format in file to be completely different, supposedly they should be mostly identical.
Fairly late reply, but re
How Protobuf JSON encoding is defined for messages, which turns the Protobuf KeyValue message into a JSON object with 2 properties ("key" and "value"). This is not our decision, it is the behavior of Protofub JSON encoding.
If a map<string, AnyValue> type was used instead of KeyValueList, the proto3 JSON mapping would render it as a normal JSON object:
| proto3 | JSON | JSON example | Notes |
|---|---|---|---|
| map<K,V> | object | {"k": v, …} |
All keys are converted to strings. |
I'm not certain, but I think using map<string, AnyValue> would be wire-compatible with a KeyValueList. The wire representation of a map<string, AnyValue> is equivalent to:
message MapFieldEntry {
string key = 1;
AnyValue value = 2;
}
repeated MapFieldEntry map_field = N;
where the MapFieldEntry tag values are the same as in KeyValue in otel proto.
This is now tracked in checklist issue https://github.com/open-telemetry/opentelemetry-specification/issues/1957 If anything is missing in the checklist please add there. Closing this issue to make sure we track it in one place.