Add event.summary as a recommendation for events
Changes
Events are identified by an event.name and attributes and fields in the body that carry specific meaning. However, since these events will be combined with other logs, event.summary allows a backend to display a human-readable representation of the event.
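To illustrate the intent, here is a minimal sketch of how a log viewer might choose the text to show for each record in a mixed stream of logs and events. The flat record shape, the `body` key, and the `display_line` helper are assumptions made for this example; they are not part of any OpenTelemetry API.

```python
from typing import Any, Mapping


def display_line(record: Mapping[str, Any]) -> str:
    """Return the text a log viewer might show for one record.

    Assumes a flat, hypothetical record shape: events carry
    "event.name" (and optionally "event.summary"); plain logs carry "body".
    """
    name = record.get("event.name")
    if name is None:
        # Not an event: show the log body as-is.
        return str(record.get("body", ""))

    summary = record.get("event.summary")
    if summary:
        # Event with a human-readable summary: show both.
        return f"[{name}] {summary}"

    # Event without a summary: fall back to the name alone.
    return f"[{name}]"
```

With this fallback order, events that set a summary read like ordinary log lines, while events without one still render as something shorter than a raw JSON body.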
Merge requirement checklist
- [X] CONTRIBUTING.md guidelines followed.
- [X] Change log entry added, according to the guidelines in When to add a changelog entry.
  - If your PR does not need a change log, start the PR title with [chore]
- [ ] schema-next.yaml updated with changes to existing conventions.
Proposed PR for issue https://github.com/open-telemetry/semantic-conventions/issues/1076
As discussed in the Eventing SIG on 2024-05-24 cc: @MSNev @trask @JamieDanielson
Why are we trying to allow a backend to display something human-readable? Backends don't typically display much anyway. It sounds like this use case is already covered by structured logging, and I don't think we should muddy Events with a message field whose human-readable content could be misleading or otherwise conflict with the semantics of the event itself.
Maybe the definition of backend is not clear and could be better specified. By backend I meant logging backends (like DataDog, Google Logging, Elastic, ... those are the systems I have experience with), because they display the logs to the user. For us (as an end user) it's important to mix our normal logs with our events in the same stream. We use the convention of a message field in our structured logs, so when our teams do exploratory work they don't see blurbs of JSON intermingled with normal log lines.
We successfully defined our internal business events on top of structured logs (as we mandate the message field in the body). Imagine the backlash we'll get when we upgrade to OTel events and the experience for the user degrades. That will not help adoption.
Maybe an extra note: I'm afraid that if we cannot leverage the fact that logs are also meant for human consumption, for events, my feeling on the fact that events are built on top of logs will flip to unsupportive (we will keep on using our events based on structured logs). Then, it would probably have been better to make logs their own signal. At least then, if that signal had a message/description in the proto, at least a log-backend could also consume those signals and do something with it.
@MSNev I've removed the workflow started example, and added examples parallel to the examples in event.name
@trask I've changed the PR to change event.message -> event.description (It very much aligns with how a metric is defined, it also has a name and description). Some systems may display the description, others not.
I think I could be talked into this if there was wording to say something like "If the event.name for the event is well known and the structure and semantics of the event are described by OpenTelemetry, then the description field SHOULD NOT be populated."
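As a rough illustration of that proposed wording, a lint-style check could flag well-known events that still populate the free-text field. Everything here is hypothetical: the `WELL_KNOWN_EVENT_NAMES` set, its single placeholder entry, and the `check_description_usage` helper are invented for this sketch, and the field is called `event.description` only because that was its name at this point in the thread.

```python
from typing import Any, List, Mapping

# Hypothetical registry of event names whose structure and semantics
# are described by OpenTelemetry. The entry below is a placeholder.
WELL_KNOWN_EVENT_NAMES = frozenset({"example.well_known_event"})


def check_description_usage(
    event: Mapping[str, Any],
    field: str = "event.description",
) -> List[str]:
    """Return warnings for events that violate the proposed SHOULD NOT rule."""
    warnings: List[str] = []
    name = event.get("event.name")
    if name in WELL_KNOWN_EVENT_NAMES and event.get(field):
        warnings.append(
            f"{name}: {field} SHOULD NOT be populated for a well-known event"
        )
    return warnings
```

Custom events pass the check untouched, so the rule would only constrain events whose meaning is already fixed by a convention.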
one potential concern with the name event.description is that it's not quite analogous to metric "description" since metric description is a constant value as opposed to this new proposal which varies per event instance
As asked in the Eventing SIG, I'll add a few redacted events with their message/summary/description:
Event: export:dgc_asset_export Summary: Exported 1057 nodes from 166 roots in 0.051s
{
  "message": "Exported 1057 nodes from 166 roots in 0.051s",
  "event_class": "generic",
  "event_name": "export:dgc_asset_export",
  "root_nodes_count": 166,
  "authenticated_id": "REDACTED",
  "duration": 51,
  "output_format": "JSON",
  "root_resource_type": "terms",
  "c4_container": "backend",
  "max_page_nodes_count": 1057,
  "send_notification": false,
  "query_limit": 200,
  "resource_types": [
    "vocabulary",
    "terms",
    "attributes",
    "source",
    "type",
    "status",
    "target"
  ],
  "timestamp": "2024-06-07T19:11:44.563Z",
  "data_size": 404475,
  "trace_id": "8853d755645a95f3809341f2c41ed92c",
  "output_type": "TEXT",
  "all_nodes_count": 1057,
  "span_id": "bcad170d7e71c0d0",
  "count_limit": 10001,
  "query_type": "TABLE_VIEW_CONFIG",
  "validation_enabled": false,
  "request_source": "JAVA_API",
  "valid_query": true,
  "c4_system": "knowledgegraph",
  "status": "INFO"
}
Event: import:dgc_asset_import Summary: Asset import completed with result: SUCCESS
{
  "message": "Asset import completed with result: SUCCESS",
  "operation_type": "IMPORT",
  "community_by_mapping_count": 0,
  "simulation": false,
  "resources_updated": {
    "MP": 5
  },
  "job_state": "COMPLETED",
  "authenticated_id": "REDACTED",
  "duration": 25,
  "community_by_name_count": 0,
  "domain_by_id_count": 0,
  "asset_by_name_count": 0,
  "domain_by_mapping_count": 0,
  "file_type": "JSON",
  "job_result": "SUCCESS",
  "send_notification": false,
  "continue_on_error": false,
  "timestamp": "2024-06-07T19:11:44.568Z",
  "commands_executed": 0,
  "trace_id": "bcdfe64836f084dac3edd79b33546fd4",
  "span_id": "6d3936b98a8181e4",
  "community_by_id_count": 0,
  "domain_by_name_count": 0,
  "asset_by_mapping_count": 5,
  "request_source": "REST_API",
  "commands_skipped": 0,
  "save_result": false,
  "asset_by_id_count": 0,
  "service": "dgc",
  "job_id": "REDACTED",
  "event_class": "generic",
  "event_name": "import:dgc_asset_import",
  "commands_count_in_request": 7,
  "errors_count": 0,
  "status": "INFO"
}
Event: catalogjdbc:edge_jdbc_connect Summary: connection to the datasource has been established
{
  "message": "connection to the datasource has been established",
  "database_name": "REDACTED",
  "timing": {
    "total_calls": 2,
    "time_unit": "MILLISECONDS",
    "max_time": 16994,
    "mean_time": 8606,
    "total_time": 17212
  },
  "schema_connection_id": "REDACTED",
  "schema_name": "REDACTED",
  "driver_class_name": "REDACTED",
  "duration": 17212,
  "tmp_msg": "",
  "database_id": "REDACTED",
  "engine": "JAVA_API",
  "service": "jdbc-ingestion",
  "schema_id": "REDACTED",
  "event_name": "catalogjdbc:edge_jdbc_connect",
  "event_class": "generic",
  "correlation_id": "REDACTED",
  "status": "INFO",
  "timestamp": "2024-06-07T19:27:10.826Z"
}
Thanks for the examples, @alexvanboxel. I think these do a good job of demonstrating what these are for, and I think they also demonstrate:
- The data in the message is redundant with the data contained in the event body. In the first example, node count, root count, and time are all fields within the event. In the second example, the message effectively restates the job_state (COMPLETED) and job_result (SUCCESS). In the third example, the message seems to be explaining the semantics behind the event name itself. I definitely understand that this can be useful to a human operator who might otherwise be unfamiliar with the event structure. However, I think in practice this is better solved by having schema-aware (event-specific) toString()-type implementations. Because...
- The free-form nature of the message field opens it up to inaccurate or misleading information! And because the field is just any text, it is impossible to constrain/limit the contents via conventions or specification. For example, there's nothing to prevent the message in the second example from reading "Asset import failed with result: MEGAFAILURE", even though this directly conflicts with the true event content.

I will remove my blocking request, because I think there isn't too much harm in this, but unfortunately I cannot offer a supportive approval. I ask that other reviewers please consider some phrasing that discourages the use of the message/description/summary for well-specified OTel events.
I also still think that an attribute is a reasonable place to put a summary of the event, and we don't need it to be a first-class Event field.
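For readers weighing the schema-aware alternative, here is a minimal sketch of what event-specific formatters could look like, using field names taken from the first two redacted examples. The `FORMATTERS` registry and the `summarize` helper are hypothetical, and treating `duration` as milliseconds is an assumption inferred from the first example (51 alongside "0.051s").

```python
from typing import Any, Callable, Dict, Mapping

Formatter = Callable[[Mapping[str, Any]], str]


def _export_summary(event: Mapping[str, Any]) -> str:
    # Assumption: "duration" is in milliseconds.
    seconds = event["duration"] / 1000
    return (
        f"Exported {event['all_nodes_count']} nodes "
        f"from {event['root_nodes_count']} roots in {seconds:.3f}s"
    )


def _import_summary(event: Mapping[str, Any]) -> str:
    return f"Asset import completed with result: {event['job_result']}"


# Hypothetical registry: one formatter per known event name.
FORMATTERS: Dict[str, Formatter] = {
    "export:dgc_asset_export": _export_summary,
    "import:dgc_asset_import": _import_summary,
}


def summarize(event: Mapping[str, Any]) -> str:
    """Derive display text from the event's own fields when a formatter exists."""
    name = event.get("event_name", "")
    formatter = FORMATTERS.get(name)
    if formatter is None:
        # Unknown event: fall back to the free-form message, then the name.
        return str(event.get("message", name))
    return formatter(event)
```

Because the text is computed from the structured fields, it cannot contradict them. The trade-off is that whoever renders the event needs a formatter for every event name, which is the gap a free-form summary fills for custom events.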
Thanks for the feedback. I've added a section that I hope captures your concerns. I've also renamed everything to summary, as this is so far the best name that has been suggested.
one potential concern with the name event.description is that it's not quite analogous to metric "description" since metric description is a constant value as opposed to this new proposal which varies per event instance
Yes, I think summary is the best name suggested.
event.summary allows a backend to display a human-readable representation of the event
does the same issue exist for how to display a human-readable representation of a structured log?
Had a discussion about the summary attribute today in the Event SIG. It's a very useful field, Alex's example above (https://github.com/open-telemetry/semantic-conventions/pull/1074#issuecomment-2155453445) really shows that.
One question is whether the content of the summary attribute should be strictly defined as part of the semantic convention for the event. We (Me, Nev, Trask) feel that the value should not be strictly defined. Like severity, the event summary is context specific: whatever message the person writing the event wants to communicate to the person reading the event is fine.
For example, a commonly defined database event in a particular library may want to include a library-specific summary. An end user tracking down a production problem may flag a particular failure or exception as suspicious and mention some context specific information in the summary.
Because the content of this attribute is not intended to be indexed and used as part of machine analysis, strictly defining its content is not necessary. This is different from the Span name field, which is used both as a primary index and a UI label and thus ends up creating a lot of fuss about its structure and content. I appreciate having the concept of a summary field that has no purpose other than to deliver a short message from one human to another.
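One way to picture that separation is a storage step that keeps the summary out of the index entirely. This is only a sketch under assumptions: the `DISPLAY_ONLY_KEYS` set and the `split_for_storage` helper are hypothetical, and the event is modeled as a flat attribute dictionary.

```python
from typing import Any, Dict, Mapping, Tuple

# Hypothetical: keys meant for human display only, never for indexing.
DISPLAY_ONLY_KEYS = frozenset({"event.summary"})


def split_for_storage(
    attributes: Mapping[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split attributes into (indexed, display_only) dictionaries."""
    indexed = {k: v for k, v in attributes.items() if k not in DISPLAY_ONLY_KEYS}
    display_only = {k: v for k, v in attributes.items() if k in DISPLAY_ONLY_KEYS}
    return indexed, display_only
```

Under this split, the wording of a summary can vary freely between instances without affecting queries or cardinality, which is why its content would not need to be strictly defined.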
Personally, I would prefer summary to be a proto field, not an attribute, in order to keep it completely separate from the attributes which are intended to be used as indexes. But I'm not going to die on that hill, I don't think that there is real harm in keeping it as an attribute. :)
I would also agree to not making it strongly typed and instead would propose an attribute for the outcome as suggested in #1089
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
@alexvanboxel Do you want this reopened?
Yes, please. I've been on holiday (well, it was a holiday) and couldn't pick this up. I can pick it up again.
This PR was marked stale due to lack of activity. It will be closed in 7 days.
In looking at this again, my understanding of the general consensus so far is that having event.summary as an optional property on events is fine, and in some cases especially useful for human operators to get a succinct summary of valuable details within the event. The open questions I can see are:
- Should this be a top-level field as it is here, or should it be an attribute?
- What is the expectation for "well-known" events in OTel? Should summary be optional, required, or "disallowed"?
Is there anything else I am missing that needs to be resolved for this?
The biggest question for me is whether the summary should be strongly typed (i.e., an enum) or free text.
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Closed as inactive. Feel free to reopen if this PR is still being worked on.