log: add caller plugin-id about emitting error events
Which issue(s) this PR fixes:
- Fixes #4567
What this PR does / why we need it: Add caller plugin-id to warning logs about emitting error events.
It would be helpful if we could know what plugin emitted the error event.
We need to care about the compatibility. This signature change would not break compatibility.
However, I'm concerned that caller_plugin_id has a race condition, although I don't confirm it.
It looks to me that the id can be another plugin-id running concurrently...
It is not the issue with this fix, it is the issue of the existing implementation.
Docs Changes: Not needed.
Release Note: Add plugin-id to warning logs about emitting error events.
Example: From #4567.
<source>
@type sample
@id id_sample_not_json
sample {"data":"this is not json}"}
tag sample.json
</source>
<filter sample.json>
@type parser
@id id_not_json
key_name data
<parse>
@type json
</parse>
</filter>
<match sample.**>
@type stdout
</match>
2024-09-11 15:26:26 +0900 [info]: #0 fluentd worker is now running worker=0
2024-09-11 15:26:27 +0900 [warn]: #0 dump an error event:
error_class=Fluent::Plugin::Parser::ParserError
error="pattern not matched with data 'this is not json}'"
location=nil tag="sample.json"
time=2024-09-11 15:26:27.040657722 +0900
plugin_id="id_not_json" ### <= This info is added by this PR ###
record={"data"=>"this is not json}"}
...
However, I'm concerned that caller_plugin_id has a race condition, although I don't confirm it. It looks to me that the id can be another plugin-id running concurrently...
https://github.com/fluent/fluentd/blob/51b860b1be2eb59076d706fda12d55657a614c8f/lib/fluent/plugin_helper/event_emitter.rb#L27-L36
As error.backtrace does not have enough information in this context, so current approach seems reasonable. (No need to force 3rdparty plugins to fix it)
This PR has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this PR will be closed in 7 days
A simple check seems ok. But may occur with more load.
<source>
@type sample
tag test.foo
</source>
<source>
@type sample
tag test.bar
</source>
<source>
@type sample
tag test.boo
</source>
<filter test.foo>
@type parser
@id foo
key_name not-exist
<parse>
@type none
</parse>
</filter>
<filter test.bar>
@type parser
@id bar
key_name not-exist
<parse>
@type none
</parse>
</filter>
<filter test.boo>
@type parser
@id boo
key_name not-exist
<parse>
@type none
</parse>
</filter>
<match test.**>
@type stdout
</match>
2024-11-27 13:55:58 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error="not-exist does not exist" location="/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'" tag="test.bar" time=2024-11-27 13:55:58.022326275 +0900 plugin_id="bar" record={"message"=>"sample"}
2024-11-27 13:55:58 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error="not-exist does not exist" location="/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'" tag="test.foo" time=2024-11-27 13:55:58.023002806 +0900 plugin_id="foo" record={"message"=>"sample"}
2024-11-27 13:55:58 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error="not-exist does not exist" location="/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'" tag="test.boo" time=2024-11-27 13:55:58.023570092 +0900 plugin_id="boo" record={"message"=>"sample"}
...
I've succeeded in reproducing the race condition.
- Run 3
in_sampleswithrate 100(100 emits/s) in concurrent. - Make each filter plugin raise an error.
- Check Fluentd log. Check for incorrect tag/id combinations.
Results: 0.005% mismatch confirmed.
<source>
@type sample
tag test.foo
rate 100
</source>
<source>
@type sample
tag test.bar
rate 100
</source>
<source>
@type sample
tag test.boo
rate 100
</source>
<filter test.foo>
@type parser
@id foo
key_name not-exist
<parse>
@type none
</parse>
</filter>
<filter test.bar>
@type parser
@id bar
key_name not-exist
<parse>
@type none
</parse>
</filter>
<filter test.boo>
@type parser
@id boo
key_name not-exist
<parse>
@type none
</parse>
</filter>
<match test.**>
@type stdout
</match>
Script for check (Ruby)
#!/usr/bin/env ruby
entire_count = 0
mismatched_count = 0
p "checking files ..."
ARGF.each_line do |line|
next unless /^.* dump an error event: .* tag="(.*)" .* plugin_id="(.*)" .*$/ =~ line
entire_count += 1
tag = $1
plugin_id = $2
unless tag.include? plugin_id
mismatched_count += 1
p "mismatched! tag: #{tag}, plugin_id: #{plugin_id}, raw_log: #{line}"
end
end
p "done"
p "entire_count: #{entire_count}"
p "mismatched_count: #{mismatched_count}"
p "mismatched_percentage: #{100.0 * mismatched_count / entire_count}%"
$ bundle exec fluentd -c test.conf --log log; ./check.rb < log; rm -f log
(Wait a moment and press Ctrl+C)
^C
"checking files ..."
"mismatched! tag: test.boo, plugin_id: bar, raw_log: 2024-11-27 18:14:22 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error=\"not-exist does not exist\" location=\"/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'\" tag=\"test.boo\" time=2024-11-27 18:14:22.793045711 +0900 plugin_id=\"bar\" record={\"message\"=>\"sample\"}\n"
"mismatched! tag: test.bar, plugin_id: foo, raw_log: 2024-11-27 18:15:56 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error=\"not-exist does not exist\" location=\"/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'\" tag=\"test.bar\" time=2024-11-27 18:15:56.735563552 +0900 plugin_id=\"foo\" record={\"message\"=>\"sample\"}\n"
"mismatched! tag: test.bar, plugin_id: foo, raw_log: 2024-11-27 18:17:02 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error=\"not-exist does not exist\" location=\"/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'\" tag=\"test.bar\" time=2024-11-27 18:17:02.429503164 +0900 plugin_id=\"foo\" record={\"message\"=>\"sample\"}\n"
"mismatched! tag: test.bar, plugin_id: foo, raw_log: 2024-11-27 18:17:24 +0900 [warn]: #0 dump an error event: error_class=ArgumentError error=\"not-exist does not exist\" location=\"/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_parser.rb:66:in `block in filter_stream'\" tag=\"test.bar\" time=2024-11-27 18:17:24.963402900 +0900 plugin_id=\"foo\" record={\"message\"=>\"sample\"}\n"
"done"
"entire_count: 73260"
"mismatched_count: 4"
"mismatched_percentage: 0.00546000546000546%"
However, I'm concerned that
caller_plugin_idhas a race condition, although I don't confirm it. It looks to me that the id can be another plugin-id running concurrently... It is not the issue with this fix, it is the issue of the existing implementation.
This issue was confirmed. We need to fix this race condition first. So, I close this.