beats
beats copied to clipboard
`filestream` input logs an error when an existing input is reloaded with the same ID
When loading inputs from external files is enabled and there is only one filestream input enabled the following error is returned after the first reload:
filestream input with ID 'my-unique-id' already exists, this will lead to data duplication, please use a different ID
The problem is that the input ID stays in the bookeeper of the input manager. On reload we should remove all IDs from the manager, so the check does not interfere with previous configurations.
Example configuration
filebeat.yml
filebeat.config.inputs:
enabled: true
path: inputs.d/*.yml
output.file:
enabled: true
inputs.d/my-input.yml
- type: filestream
id: my-unique-id
paths: [/var/log/messages]
There is no workaround. At the moment it is not possible to configure filestream input from external configuration files.
Reported on discuss: https://discuss.elastic.co/t/filebeat-raise-error-filestream-input-id-already-exist
CC @belimawr
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
That is fixed by: https://github.com/elastic/beats/pull/32309
I see that the fix has been backported to 7.17. Are there plans already for the next release date?
There is a planned 7.17.6 release in the near future.
According to the release notes, the fix is included in 7.17.6 but I still see the error happening with a single filestream input in a file in the inputs.d directory.
According to the release notes, the fix is included in 7.17.6 but I still see the error happening with a single filestream input in a file in the inputs.d directory.
Hi @abraxxa, could you share some more details about how you're experiencing this issue? A minimal config and steps to reproduce would be ideal.
I've just checked the v7.17.6 tag and the fix is there. If you can help me to reproduce it, I'll investigate it further.
Hi @belimawr, thanks for your response!
Exactly as reported above.
I've started yesterday changing type: log to filestream and it happened on the very first server where I only have a single file in the input.d directory which collects logs from three files (no globs involved).
Filebeat is version 7.17.6 and the VM runs on Debian 10.
Sorry if I'm asking something obvious, but are you sure you didn't have any filestream input configured on your filebeat.yml?
Did you touch/edit the file(s) on inputs.d/?
I'm just trying to understand what triggered it. Starting a Filebeat with a single filestream input on inputs.d without any other filestream configured on filebeat.yml should never trigger this message.
filebeat.yml is untouched from the deb package. It has one disabled input of type: filestream for /var/log/*.log.
But as the error message included the id I assigned to the filestream input in the input.d directory, I didn't check it before.
That's the exact error message:
Sep 05 17:40:16 tsa-tc-dot1x-3 filebeat[23435]: 2022-09-05T17:40:16.552+0200 ERROR [input] input-logfile/manager.go:183 filestream input with ID 'dot1x-services' already exists, this will lead to data duplication, please use a different ID
And here is the file in the input.d directory:
---
- type: filestream
id: dot1x-services
paths:
- /var/log/radiator-dot1x/central-logging-radiator.json
- /var/log/radiator-dot1x/central-logging-authentication.json
- /var/log/radiator-dot1x/central-logging-accounting.json
encoding: utf-8
parsers:
- ndjson:
keys_under_root: true
overwrite_keys: true
expand_keys: true
prospector.scanner.check_interval: 1s
processors:
- add_fields:
target: "data_stream"
fields:
dataset: "radiator"
type: "logs"
- add_fields:
target: "event"
fields:
module: "radiator-dot1x"
Thanks! I'll investigate it.
Ok, I can confirm that this is still happening on v7.16.6 as well as on `main.
Great, thanks! One more info: it happened when first restarting filebeat with the input converted from log to filestream as well as on reload and restart now.
yes, for some reason Filebeat is reading the config twice on startup, I believe it reads the inputs.d as the normal startup sequence, then starts watching inputs.d which trigger a re-read of the files, hence a loading again the same filestream input. But it does not stop the current one before starting the new one.
The problem
I believe I got to the root cause:
- The normal startup process loads all inputs, including the ones on
inputs.d - Because things happens concurrently, the input is loaded (it's ID is added to the inputManager's map), but the harvester is not started yet.
- The inputReloader is started in another goroutine
- The inputReloaders reads again
inputs.dand starts the inputs there. - Because the harvester has not started yet (and the other things that lead to the harvester running), the inputRelaoder does not know it needs to stop an input, so it only starts (again) the filestream input
Why does it not lead to two inputs running at the same time?
Well, the harvester is smart enough not to run two harvesters for the same file from the same input. So only one of the inputs actually run. This leads to no issues while running Filebeat.
Workaround
There are two possible workarounds, both with the downside of not having the reload of inputs happening automatically.
Only use filebeat.yml
One easy way is to add all your input configurations on filebeat.yml and completely ignore the inputs.d directory
~~## Disable the reload of inputs~~ After more testing we found out that disabling the reload of inputs does not solve the problem.
Some extra information
Here are the stacktraces of when the filestream input is first loaded by the input manager and the second time it's loaded by the input manager (via the reloader), which triggers the log message:
First time, it comes from the "normal" startup that started on main.go
(dlv) bt
0 0x00000000028c3bc0 in github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create
at ./input/filestream/internal/input-logfile/manager.go:162
1 0x00000000028aed86 in github.com/elastic/beats/v7/filebeat/input/v2.(*Loader).Configure
at ./input/v2/loader.go:110
2 0x00000000028afda5 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*factory).CheckConfig
at ./input/v2/compat/compat.go:73
3 0x00000000028b0927 in github.com/elastic/beats/v7/filebeat/input/v2/compat.composeFactory.CheckConfig
at ./input/v2/compat/composed.go:48
4 0x00000000028b0c25 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*composeFactory).CheckConfig
at <autogenerated>:1
5 0x000000000281fcc2 in github.com/elastic/beats/v7/filebeat/channel.(*onCreateFactory).CheckConfig
at ./channel/runner.go:64
6 0x00000000016dc967 in github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Check
at /home/tiago/devel/beats/libbeat/cfgfile/reload.go:155
7 0x00000000028b6655 in github.com/elastic/beats/v7/filebeat/beater.(*crawler).Start
at ./beater/crawler.go:83
8 0x00000000028b9431 in github.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Run
at ./beater/filebeat.go:338
9 0x00000000027fb7d3 in github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch
at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:475
10 0x00000000027f9c05 in github.com/elastic/beats/v7/libbeat/cmd/instance.Run.func1
at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:180
11 0x00000000027f9a85 in github.com/elastic/beats/v7/libbeat/cmd/instance.Run
at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:181
12 0x000000000280af58 in github.com/elastic/beats/v7/libbeat/cmd.genRunCmd.func1
at /home/tiago/devel/beats/libbeat/cmd/run.go:36
13 0x00000000011a52c3 in github.com/spf13/cobra.(*Command).execute
at /home/tiago/go/pkg/mod/github.com/spf13/[email protected]/command.go:860
14 0x00000000011a59dd in github.com/spf13/cobra.(*Command).ExecuteC
at /home/tiago/go/pkg/mod/github.com/spf13/[email protected]/command.go:974
Sending output to pager...
0 0x00000000028c3bc0 in github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create
at ./input/filestream/internal/input-logfile/manager.go:162
1 0x00000000028aed86 in github.com/elastic/beats/v7/filebeat/input/v2.(*Loader).Configure
at ./input/v2/loader.go:110
2 0x00000000028afda5 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*factory).CheckConfig
at ./input/v2/compat/compat.go:73
3 0x00000000028b0927 in github.com/elastic/beats/v7/filebeat/input/v2/compat.composeFactory.CheckConfig
at ./input/v2/compat/composed.go:48
4 0x00000000028b0c25 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*composeFactory).CheckConfig
at <autogenerated>:1
5 0x000000000281fcc2 in github.com/elastic/beats/v7/filebeat/channel.(*onCreateFactory).CheckConfig
at ./channel/runner.go:64
6 0x00000000016dc967 in github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Check
at /home/tiago/devel/beats/libbeat/cfgfile/reload.go:155
7 0x00000000028b6655 in github.com/elastic/beats/v7/filebeat/beater.(*crawler).Start
at ./beater/crawler.go:83
8 0x00000000028b9431 in github.com/elastic/beats/v7/filebeat/beater.(*Filebeat).Run
at ./beater/filebeat.go:338
9 0x00000000027fb7d3 in github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).launch
at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:475
10 0x00000000027f9c05 in github.com/elastic/beats/v7/libbeat/cmd/instance.Run.func1
at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:180
11 0x00000000027f9a85 in github.com/elastic/beats/v7/libbeat/cmd/instance.Run
at /home/tiago/devel/beats/libbeat/cmd/instance/beat.go:181
12 0x000000000280af58 in github.com/elastic/beats/v7/libbeat/cmd.genRunCmd.func1
at /home/tiago/devel/beats/libbeat/cmd/run.go:36
13 0x00000000011a52c3 in github.com/spf13/cobra.(*Command).execute
at /home/tiago/go/pkg/mod/github.com/spf13/[email protected]/command.go:860
14 0x00000000011a59dd in github.com/spf13/cobra.(*Command).ExecuteC
at /home/tiago/go/pkg/mod/github.com/spf13/[email protected]/command.go:974
15 0x00000000028e13c6 in github.com/spf13/cobra.(*Command).Execute
15 0x00000000028e13c6 in github.com/spf13/cobra.(*Command).Execute
at /home/tiago/go/pkg/mod/github.com/spf13/[email protected]/command.go:902
16 0x00000000028e13c6 in main.main
at ./main.go:36
17 0x0000000000ff5db2 in runtime.main
at /usr/local/go/src/runtime/proc.go:250
18 0x0000000001028d81 in runtime.goexit
at /usr/local/go/src/runtime/asm_amd64.s:1594
Second time, the harvester is not running, but the input manager already has got the ID registered. It runs on a different goroutine
(dlv) bt
0 0x00000000028c3bc0 in github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*InputManager).Create
at ./input/filestream/internal/input-logfile/manager.go:162
1 0x00000000028aed86 in github.com/elastic/beats/v7/filebeat/input/v2.(*Loader).Configure
at ./input/v2/loader.go:110
2 0x00000000028afe5a in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*factory).Create
at ./input/v2/compat/compat.go:84
3 0x00000000028b0a22 in github.com/elastic/beats/v7/filebeat/input/v2/compat.composeFactory.Create
at ./input/v2/compat/composed.go:62
4 0x00000000028b0cd5 in github.com/elastic/beats/v7/filebeat/input/v2/compat.(*composeFactory).Create
at <autogenerated>:1
5 0x000000000281ffa9 in github.com/elastic/beats/v7/filebeat/channel.RunnerFactoryWithCommonInputSettings.func1
at ./channel/runner.go:99
6 0x000000000281fd3e in github.com/elastic/beats/v7/filebeat/channel.(*onCreateFactory).Create
at ./channel/runner.go:68
7 0x00000000016dc418 in github.com/elastic/beats/v7/libbeat/cfgfile.createRunner
at /home/tiago/devel/beats/libbeat/cfgfile/list.go:188
8 0x00000000016db15f in github.com/elastic/beats/v7/libbeat/cfgfile.(*RunnerList).Reload
at /home/tiago/devel/beats/libbeat/cfgfile/list.go:103
9 0x00000000016dcd35 in github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Run
at /home/tiago/devel/beats/libbeat/cfgfile/reload.go:215
10 0x00000000028b6949 in github.com/elastic/beats/v7/filebeat/beater.(*crawler).Start.func1
at ./beater/crawler.go:97
11 0x0000000001028d81 in runtime.goexit
at /usr/local/go/src/runtime/asm_amd64.s:1594
(dlv)
@pierrehilbert @cmacknz we should look into the priority of this, when using the inputs.d it emits misleading errors about duplicated filestream IDs. Aside that, it seems to have no effect onto Filebeat's execution.
Thanks for debugging this so quickly! 👍🏻
Can inputReloader be disabled?
Can
inputReloaderbe disabled?
Yes, you can add:
filebeat.config:
inputs:
enabled: false
path: inputs.d/*.yml
on your filebeat.yml
This option is enabled by default, hence it's not explicitly listed on filebeat.yml (/etc/filebeat/filebeat.yml on the deb package)
Or just not add anything into inputs.d folder and keep all your inputs into filebeat.yml
But then the file in the inputs.d directory won't be loaded. My goal was to load it but not discover any changes to it as I handle deployment and restarting/reloading with Ansible.
yes, that's the downside of the workaround, you'd have to add all input configuration to filebeat.yml.
You can add your input configurations on any part of filebeat.yml, so you could add it at the very end, e.g:
filebeat.config.inputs:
enabled: false
path: inputs.d/*.yml
reload.period: 60s
output.file:
enabled: true
path: "/tmp/filebeat-out"
filebeat.inputs:
- type: filestream
id: my-unique-id
paths: [/tmp/flog.log]
That might make it a bit easier to manipulatethis file with Ansible (e.g: concatenate a default config file with the input configuration) . If you don't need to watch the inputs.d directory for changes, that might be an option for you.
I just read up on this feature in the docs. What if I disable the input.d file reloading using reload.enabled: false?
Sadly the docs don't include its default value.
I just read up on this feature in the docs. What if I disable the input.d file reloading using
reload.enabled: false? Sadly the docs don't include its default value.
I did a quick test here and it woks :D. I'll update my other comment with the workaround.
I totally missed that option :man_facepalming: . Thanks for finding it!
For me it doesn't fix the issue, even when starting filebeat the error is logged.
Indeed it's still happening :/
I've just tested it again and it still happening. Maybe I got lucky with the order the goroutines were running the first time I tested it.
Well, the best solution seem to be to just avoid using the inputs.d directory.
Solution or workaround?
Solution or workaround?
workaround.
Can you please give me an update on this issue? Thanks!
Can you please give me an update on this issue? Thanks!
At the moment I have no updates to give :/
Can I do something to raise the priority? We have a support contract.
Can I do something to raise the priority? We have a support contract.
Yes, if you have a support contract, please reach out to support and mention this GitHub issue.