fluentd
fluentd copied to clipboard
Error event logs should include plugin id
Describe the bug
The @id plugin parameter is used to give each plugin its unique id. This id should be included in logs generated by said plugin. This seems to work inconsistent between different plugin types. I tested the json filter plugin (for which the id is not shown) and the loki output plugin (which shows it)
working example:
<source>
@type sample
@id id_sample_too_long
sample {"data":"this log is way to long to fit into one chunk"}
tag sample.long
</source>
<match **>
@type loki
@id id_too_long_message
url ...
...
<buffer>
@type file
chunk_limit_size 5
...
</buffer>
</match>
results in logs:
2024-07-26 13:28:37 +0000 [warn]: #0 [id_too_long_message] chunk bytes limit exceeds for an emitted event stream: 64bytes
2024-07-26 13:28:37.073654307 +0000 fluent.warn: {"message":"[id_too_long_message] chunk bytes limit exceeds for an emitted event stream: 64bytes"}
Both contain the "id_too_long_message" id of the plugin the error occured in.
But for the filter plugin:
<source>
@type sample
@id id_sample_not_json
sample {"data":"this is not json}"}
tag sample.json
</source>
<filter sample.json>
@type parser
@id id_not_json
key_name data
<parse>
@type json
</parse>
</filter>
The error:
2024-07-26 13:21:57 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'this is not json}'" location=nil tag="sample.json" time=2024-07-26 13:21:57.079027108 +0000 record={"data"=>"this is not json}"}
2024-07-26 13:21:57.079191057 +0000 fluent.warn: {"error":"#<Fluent::Plugin::Parser::ParserError: pattern not matched with data 'this is not json}'>","location":null,"tag":"sample.json","time":1722000117,"record":{"data":"this is not json}"},"message":"dump an error event: error_class=Fluent::Plugin::Parser::ParserError error=\"pattern not matched with data 'this is not json}'\" location=nil tag=\"sample.json\" time=2024-07-26 13:21:57.079027108 +0000 record={\"data\"=>\"this is not json}\"}"}
does not contain the id "id_not_json" of the filter plugin that caused the error.
To Reproduce
The config:
<source>
@type sample
@id id_sample_not_json
sample {"data":"this is not json}"}
tag sample.json
</source>
<filter sample.json>
@type parser
@id id_not_json
key_name data
<parse>
@type json
</parse>
</filter>
results in:
2024-07-26 13:21:57 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'this is not json}'" location=nil tag="sample.json" time=2024-07-26 13:21:57.079027108 +0000 record={"data"=>"this is not json}"}
2024-07-26 13:21:57.079191057 +0000 fluent.warn: {"error":"#<Fluent::Plugin::Parser::ParserError: pattern not matched with data 'this is not json}'>","location":null,"tag":"sample.json","time":1722000117,"record":{"data":"this is not json}"},"message":"dump an error event: error_class=Fluent::Plugin::Parser::ParserError error=\"pattern not matched with data 'this is not json}'\" location=nil tag=\"sample.json\" time=2024-07-26 13:21:57.079027108 +0000 record={\"data\"=>\"this is not json}\"}"}
Expected behavior
The @id fiel, if set, shows up in logs prodcued by the plugin. In this example I would expect:
2024-07-26 13:21:57 +0000 [warn]: #0 [id_not_json] dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'this is not json}'" location=nil tag="sample.json" time=2024-07-26 13:21:57.079027108 +0000 record={"data"=>"this is not json}"}
2024-07-26 13:21:57.079191057 +0000 fluent.warn: {"error":"# [id_not_json] <Fluent::Plugin::Parser::ParserError: pattern not matched with data 'this is not json}'>","location":null,"tag":"sample.json","time":1722000117,"record":{"data":"this is not json}"},"message":"dump an error event: error_class=Fluent::Plugin::Parser::ParserError error=\"pattern not matched with data 'this is not json}'\" location=nil tag=\"sample.json\" time=2024-07-26 13:21:57.079027108 +0000 record={\"data\"=>\"this is not json}\"}"}
Your Environment
- Fluentd version: 1.17.0
- Package version:
- Operating system: Garden Linux 1443.9
- Kernel version: 6.6.35-cloud-arm64
Your Configuration
<source>
@type sample
@id id_sample_not_json
sample {"data":"this is not json}"}
tag sample.json
</source>
<filter sample.json>
@type parser
@id id_not_json
key_name data
<parse>
@type json
</parse>
</filter>
### Your Error Log
```shell
2024-07-26 13:21:57 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'this is not json}'" location=nil tag="sample.json" time=2024-07-26 13:21:57.079027108 +0000 record={"data"=>"this is not json}"}
2024-07-26 13:21:57.079191057 +0000 fluent.warn: {"error":"#<Fluent::Plugin::Parser::ParserError: pattern not matched with data 'this is not json}'>","location":null,"tag":"sample.json","time":1722000117,"record":{"data":"this is not json}"},"message":"dump an error event: error_class=Fluent::Plugin::Parser::ParserError error=\"pattern not matched with data 'this is not json}'\" location=nil tag=\"sample.json\" time=2024-07-26 13:21:57.079027108 +0000 record={\"data\"=>\"this is not json}\"}"}
### Additional context
_No response_
@DavidKnacker is the id missing even for logs generated by input plugin?
@Athishpranav2003 I haven't checked it for error logs thrown by an input plugin yet. But am I correct to assume that using @id to track errors should work for all kind of plugins? (or at least the ones that are shipped by default with fluentd?)
Ideally yes but since it's not behaving in that way I was checking what's the difference in underlying code Will check it meanwhile
Just ping me later with the case of input plugin
👍 will check the behavior for the input plugin tomorrow and report back
another scope of Idea: track plugin which emit error event in router side, then emit more user friendly error message.
Thanks for your report. This is Fluentd's specification.
This dump an error event log is generated by RootAgent (Fluentd's common logic).
The filter_parser plugin just emits the error event to the router.
So, this log does not have its id.
https://github.com/fluent/fluentd/blob/e763c0761c44d9734b6aa374371387a2e8406522/lib/fluent/root_agent.rb#L344-L354
Although this is the current specification, it looks like not so good. Maybe it could be improved somehow.
Although this is the current specification, it looks like not so good. Maybe it could be improved somehow.
It would be better to include the id of the plugin that emits the error event.
It may be easy to manually include the id of filter_parser in the error info that it emits.
https://github.com/fluent/fluentd/blob/e763c0761c44d9734b6aa374371387a2e8406522/lib/fluent/plugin/filter_parser.rb#L101-L103
However, it is not so good because every plugin that emits error events needs to consider this issue.
As @kenhys says in https://github.com/fluent/fluentd/issues/4567#issuecomment-2259468182, it would be great if we could create a common mechanism for this.
another scope of Idea: track plugin which emit error event in router side, then emit more user friendly error message.
@daipom so basically we need to avoid change in function signature but capture the caller plugin id?
so basically we need to avoid change in function signature but capture the caller plugin id?
Yes.
We can't change the signature of emit_error_event() because it is related to Event Emitter plugin helper API.
We need to keep compatibility for the plugins that use this function.
Why don't we use log.error method to generate the error log than emitting it as event(It's indirect but works the same way right)?
@Athishpranav2003 to come back to your question, I do see the @id field for logs generated by the input plugin.
Why don't we use log.error method to generate the error log than emitting it as event(It's indirect but works the same way right)?
@daipom what about this?
Are there any updates on this?
Sorry for my late response.
Why don't we use log.error method to generate the error log than emitting it as event(It's indirect but works the same way right)?
Filter plugin should use emit_error_event().
It outputs events as @ERROR label.
It is intended to rescue collected data that could not be processed.
On the other hand, the events from Fluentd logger mean logs of Fluentd, not collected data by input plugins.
Fluentd logger outputs events as @FLUENT_LOG label.
Using both would be a possible solution. However, it would be better if it could be resolved through a common mechanism.
I made a PR for this issue.
- #4632
However, I'm concerned about some race conditions of the related existing logic...
I cannot say anything about potential race conditions, but I just tested the PR in my cluster and it seems to exactly achieve what I hoped for. Thanks @daipom! :) Would be great to have this in.
@DavidKnacker Thanks! I will check the race condition!
Hey @daipom, are there any updates on the ticket?
Sorry, we need to consider the race condition issue more. I wish we could get this into 1.18.0, but I am not sure...
I confirmed the race condition issue.
- https://github.com/fluent/fluentd/pull/4632#issuecomment-2503344686
We need to fix this race condition first.