Duplicate events get ingested in winevtlog input plugin for fluent-bit 3.0.2.
Bug Report
Describe the bug
- Duplicate events get ingested in the following case:
- When multiple channels are configured in fluent-bit, and
- An event_query that references more than one of those channels is provided.
To Reproduce
- Configure a winevtlog input plugin.
- Make sure to have multiple channels configured in the config.
- Also, provide an event_query value in which multiple channels are used to filter the events.
- For example:
- name: winevtlog
  tag: some-tag
  alias: WIndows alias
  storage.type: filesystem
  channels: application,security,system
  interval_sec: 5
  read_existing_events: true
  db: C:\Program Files\checkpoint.db
  render_event_as_xml: true
  read_limit_per_cycle: 2m
  event_query: <QueryList><Query Id="0" Path="Application"><Select Path="Application">*[System[TimeCreated[@SystemTime>='2024-04-22T07:30:22.000Z' and @SystemTime<='2024-04-22T09:30:22.999Z']]]</Select><Select Path="Security">*[System[TimeCreated[@SystemTime>='2024-04-22T07:30:22.000Z' and @SystemTime<='2024-04-22T09:30:22.999Z']]]</Select><Select Path="System">*[System[TimeCreated[@SystemTime>='2024-04-22T07:30:22.000Z' and @SystemTime<='2024-04-22T09:30:22.999Z']]]</Select></Query></QueryList>
Expected behavior
- There should be no duplication of events.
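One way to confirm the duplication on collected output is to count events by channel and record ID. The sketch below assumes the events are available as a list of XML strings (as rendered with render_event_as_xml) and that they follow the standard Windows event schema with System/Channel and System/EventRecordID elements. The function names are hypothetical and not part of Fluent Bit.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (assumed to be present on the <Event> root).
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_key(event_xml):
    """Return (channel, record_id) for one rendered event."""
    root = ET.fromstring(event_xml)
    channel = root.findtext("e:System/e:Channel", namespaces=NS)
    record_id = root.findtext("e:System/e:EventRecordID", namespaces=NS)
    return channel, record_id


def find_duplicates(event_xmls):
    """Return {(channel, record_id): count} for every event seen more than once."""
    counts = Counter(event_key(x) for x in event_xmls)
    return {key: n for key, n in counts.items() if n > 1}
```

An empty result from find_duplicates means every (channel, record ID) pair was ingested once, which is the expected behavior.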
Screenshots
- Upon querying the SQLite DB, it was observed that, in the row for each channel configured in the input plugin, bookmark_xml also maintains the offsets of all the channels provided in the event_query.
- Screenshot: 3 channels in both channels and event_query
- Screenshot: 2 channels in both channels and event_query
- Screenshot: 2 channels in channels and an empty event_query
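The observation about the checkpoint DB can be checked with a short script. This is a minimal sketch, not the plugin's own tooling: it does not assume a table name and instead scans every table that has a bookmark_xml column. It does assume that the first column of such a table identifies the channel, and that the bookmark XML contains Bookmark elements with a Channel attribute, as Windows bookmark XML normally does. Both function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def bookmark_channels(bookmark_xml):
    """Return the Channel attribute of every Bookmark element in a bookmark XML string."""
    if not bookmark_xml:
        return []
    root = ET.fromstring(bookmark_xml)
    # Compare local tag names so a namespace prefix, if any, does not matter.
    return [el.get("Channel") for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "Bookmark"]


def inspect_checkpoint_db(path):
    """Map (table, row label) to the channels found in that row's bookmark_xml."""
    conn = sqlite3.connect(path)
    result = {}
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            # PRAGMA table_info returns (cid, name, type, ...); index 1 is the column name.
            columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
            if "bookmark_xml" not in columns:
                continue
            idx = columns.index("bookmark_xml")
            for row in conn.execute(f'SELECT * FROM "{table}"'):
                # Assumption: the first column identifies the configured channel.
                result[(table, row[0])] = bookmark_channels(row[idx])
    finally:
        conn.close()
    return result
```

Calling inspect_checkpoint_db on the configured db file and finding more than one channel listed for a single row would match the behavior described above.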
Your Environment
- Version used: 3.0.2
- Operating System and version: Windows Server 2022
- Filters and plugins: lua, modify
I reproduced your issue and found a workaround for this case:
pipeline:
  inputs:
    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: application
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="Application">*[System[TimeCreated[@SystemTime>='2024-04-22T07:30:22.000Z' and @SystemTime<='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>
    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: security
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="Security">*[System[TimeCreated[@SystemTime>='2024-04-22T07:30:22.000Z' and @SystemTime<='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>
    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: system
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="System">*[System[TimeCreated[@SystemTime>='2024-04-22T07:30:22.000Z' and @SystemTime<='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>
For the time being, event_query needs to be defined per channel. This is because restoring the bookmark forcibly restores the information about which channels to subscribe to. This is not the expected behavior. Defining the inputs one by one, as above, keeps the conditions that filter and collect Windows EventLogs from getting mixed up.
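Writing one stanza per channel by hand is repetitive, so the per-channel configuration can also be generated. The following is only a sketch under a few assumptions: the helper names are hypothetical, only a subset of the keys from the workaround is emitted, and the Query element's Path is set to the same channel as the Select element, whereas the workaround above keeps Path="Application" on every Query.

```python
def build_event_query(channel, start, end):
    """Build a QueryList that selects from exactly one channel within a time window."""
    condition = (
        f"*[System[TimeCreated[@SystemTime>='{start}' "
        f"and @SystemTime<='{end}']]]"
    )
    return "\n".join([
        "<QueryList>",
        f'<Query Id="0" Path="{channel}">',
        f'<Select Path="{channel}">{condition}</Select>',
        "</Query>",
        "</QueryList>",
    ])


def render_input(channel, start, end, tag="some-tag", db=r".\checkpoint.db"):
    """Render one winevtlog input stanza as YAML text for a single channel."""
    lines = [
        "- name: winevtlog",
        f"  tag: {tag}",
        f"  channels: {channel.lower()}",
        f"  db: {db}",
        "  event_query: |",
    ]
    # Indent the query so it sits inside the YAML block scalar.
    lines += ["    " + q for q in build_event_query(channel, start, end).splitlines()]
    return "\n".join(lines)


def render_pipeline(channels, start, end):
    """Render a pipeline section with one input per channel."""
    stanzas = [render_input(c, start, end) for c in channels]
    body = "\n".join("    " + line for s in stanzas for line in s.splitlines())
    return "pipeline:\n  inputs:\n" + body


if __name__ == "__main__":
    print(render_pipeline(
        ["Application", "Security", "System"],
        "2024-04-22T07:30:22.000Z",
        "2024-04-22T09:30:22.999Z",
    ))
```

Each generated stanza names a single channel in both channels and event_query, which is the property the workaround relies on.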
If channels accepts multiple values, Fluent Bit should ideally have a separate query stanza per channel in the configuration file. Is that understanding correct?
The above workaround might work, but users who already use this kind of configuration would already be facing this issue. Should the workaround be documented until the bug is fixed?
If channels accepts multiple values, Fluent Bit should ideally have a separate query stanza per channel in the configuration file. Is that understanding correct?
Ideally, yes, that is correct. However, Fluent Bit does not have this capability for now.
The above workaround might work, but users who already use this kind of configuration would already be facing this issue. Should the workaround be documented until the bug is fixed?
TBH, I have never heard of anyone struggling with this, because a QueryList in its XML representation is difficult to put into Fluent Bit configurations in the first place. Most users probably use simpler configurations than yours.