opentelemetry-collector-contrib
opentelemetry-collector-contrib copied to clipboard
journald - Consider parsing more known fields from logs
...
I'm looking at the journald input, which currently dumps everything into the body. The only thing it pulls out is the timestamp.
Ideally (IMHO), the body would be a string containing MESSAGE and all the other fields stored as attributes. Additionally, the priority would be converted to an OpenTelemetry severity. Given journald has a well defined list of fields [1], many of them could be converted to semantic conventions such as process.pid
.
So, the question is whether things like the journald input should convert messages to a more native opentelemetry format? If not, should there be an option, operator or processor that could do it all in one go?
(I can create a PR to change the journald input if there is agreement)
- https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html
Originally posted by @gregoryfranklin in https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/3129#issuecomment-1017782954
I would like to see journald logs to be parsed right OOTB. I don't know why it shouldn't be done right at the source without any additional operator or processor. Maybe we can provide an option to keep the raw message instead of parsing if user wants that?
I support that the operator ought to do further parsing as well. We already do this in syslog_input
to an extent. Some users may prefer the collector to do less parsing, but unless someone is asking for it, I think it's reasonable to add "don't parse" controls later as necessary.
Mapping from priority to severity, the body being only the MESSAGE string, and everything else going to attributes makes sense to me. I also agree with applying semantic conventions as they currently apply. In the long term, I assume we'll have semantic conventions for most/all the fields, and I would personally like to see this operator apply them wherever possible.
What's less clear to me is whether we should work ahead of the semantic conventions. Given that the library is not yet stable and that development can easily outpace semantic debate, I think it's probably reasonable to make a best effort at appropriately naming and then propose the attributes to the spec repo.
What thoughts do others have on all this?
I fully agree on the first two paragraphs.
What's less clear to me is whether we should work ahead of the semantic conventions. Given that the library is not yet stable and that development can easily outpace semantic debate, I think it's probably reasonable to make a best effort at appropriately naming and then propose the attributes to the spec repo.
I see your point, but from the other hand we can cause additional breaking changes to the attributes that the library emits if our best effort doesn't survive the semantic debate. I would go with the semantic conventions first, or in parallel, to make sure that once the change to the library is released at least attribute names are not going to change.
I'll start with a PR that moves all the fields to attributes instead of body and sets the severity.
So, it will change from this
"entry": {
"timestamp": "2020-04-16T11:05:49.516168-04:00",
"body": {
"CODE_FILE": "../src/core/unit.c",
"CODE_FUNC": "unit_log_success",
"CODE_LINE": "5487",
"MESSAGE": "var-lib-docker-overlay2-bff8130ef3f66eeb81ce2102f1ac34cfa7a10fcbd1b8ae27c6c5a1543f64ddb7-merged.mount: Succeeded.",
"MESSAGE_ID": "7ad2d189f7e94e70a38c781354912448",
"PRIORITY": "6",
to this
"entry": {
"timestamp": "2020-04-16T11:05:49.516168-04:00",
"body": "var-lib-docker-overlay2-bff8130ef3f66eeb81ce2102f1ac34cfa7a10fcbd1b8ae27c6c5a1543f64ddb7-merged.mount: Succeeded.",
"severity": 9,
"severity_text", "INFO",
"attributes": {
"CODE_FILE": "../src/core/unit.c",
"CODE_FUNC": "unit_log_success",
"CODE_LINE": "5487",
"MESSAGE_ID": "7ad2d189f7e94e70a38c781354912448",
Some of the fields are also clearly resource attributes rather than log attributes, but I think that is linked to the semantic conventions discussion so will leave for a later PR.
I agree it makes sense to wait for the discussion on semantic conventions to move further along before applying these to the library. In the spec there is an example mapping for syslog (and many other common logging formats) - it would be nice to see the same for journald.
Has there been any movement on this? If not, I would be happy to jump in. I have been working on integrating Journald into a few projects and it has been frustrating trying to move all fields over to attributes.
I don't see any reason why this shouldn't move forward. Putting this info in attributes is clearly an improvement in terms of respecting the data model.
I had a go at this a while back but the decision was to open an issue against the specification to get the missing semantic conventions defined. You'll see in the PR (https://github.com/open-telemetry/opentelemetry-log-collection/pull/353) that I identified all the well-defined fields from journald and which of them could be mapped to existing semantic conventions.
But, I've not had time to follow up on it in terms of opening an issue against the spec or anything.
There is an ongoing discussion about whether and how semantic conventions should be defined on a per component basis, so it is not even fully settled that these attributes will ever be codified. At this point, I don't think we should be waiting on semantic conventions. Many other components are moving forward with best effort attribute naming. This is an alpha components anyways, so names are not irreversible.
What we know for sure at this point is that this integration is not adhering to the data model's direction that first-party libraries should emit logs with the body as a string. We should correct this behavior.
Since adhering to the data model will require us to set attributes, we could either 1) follow the style and spirit of the semantic conventions, making a best effort to establish reasonable attribute names, or 2) use the raw values from journald (eg. _MACHINE_ID
). The semantic conventions provide detailed guidance on naming attributes, so I believe we should apply these as best we can to this domain.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers
. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
This is still relevant, but should wait until a final decision on the ECS OTEP.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers
. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
This is still relevant, but should wait until a final decision on the ECS OTEP.
Since ECS OTEP has been merged, does that change anything regarding this journald issue? I can't see anything about the ECS OTEP that would have an impact.
I am also interested in this. Should I revive the PR that @gregoryfranklin made?
Looks like I mistakenly believed ECS defined semantics for journald which we would use. I think we can work out our own semantics but should keep the component maturity level at alpha until we've defined semantic conventions for any attributes which we use.
Was there a Google Doc already to work on Journald semantics?
Was there a Google Doc already to work on Journald semantics?
I'm not aware of one.
Hi, I'm interested in this too :) @djaglowski @cwegener is there any work being done that it's not tracked? Anything that can be done to help out?
Hi, I'm interested in this too :) @djaglowski @cwegener is there any work being done that it's not tracked? Anything that can be done to help out?
I haven't done any work on this. I think the first step would be to review the fields in Journald and figure out how many of them should/can be mapped into OTel Log attributes. And if there is any impact on existing or not yet existing semantic conventions.
The available Journald fields are: https://www.freedesktop.org/software/systemd/man/latest/systemd.journal-fields.html
Sounds good. I'll try putting together a table unless somebody objects :) Once I've got a reasonable draft I'll open a documentation PR...
Sounds good. I'll try putting together a table unless somebody objects :) Once I've got a reasonable draft I'll open a documentation PR...
The old PR for this already had the first cut of mapping to semconv done, which might help. https://github.com/open-telemetry/opentelemetry-log-collection/pull/353/files#diff-913aee336b63830976ccc7a02800e0837017f628830bce49f88dc4aeabffdebe
I can't easily tell how much code of that old PR would actually still apply today though (e.g. I can't tell if the way how the severity mapping was done in the old PR would still how it should be done today)
(e.g. I can't tell if the way how the severity mapping was done in the old PR would still how it should be done today)
I've done some very early stages work, but I think it's clear already that the severity mapping from the earlier PR is valid (and quite stable, as those mappings are already in an appendix to the semantic conventions). Therefore, if anyone feels like it, I think it should be okay to split the old PR and merge the severity mappings already, as getting the fields in order will probably take a bit longer...
On this topic, is there an easy-ish way of doing this with transforms/existing tooling today?