
Duplicate events get ingested in winevtlog input plugin for fluent-bit 3.0.2.

Open Hardik-Parikh opened this issue 1 year ago • 3 comments

Bug Report

Describe the bug

  • Duplicate events are ingested when both of the following hold:
    • Multiple channels are configured in the winevtlog input (channels), and
    • An event_query that references more than one of those channels is provided.

To Reproduce

  1. Configure a winevtlog input plugin.
  2. Configure multiple channels in it.
  3. Also set event_query to a query that filters events from multiple channels.
  4. For example:
  - name: winevtlog
    tag: some-tag
    alias: WIndows alias
    storage.type: filesystem
    channels: application,security,system
    interval_sec: 5
    read_existing_events: true
    db: C:\Program Files\checkpoint.db
    render_event_as_xml: true
    read_limit_per_cycle: 2m
    event_query: <QueryList><Query Id="0" Path="Application"><Select Path="Application">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select><Select Path="Security">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select><Select Path="System">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select></Query></QueryList>

Expected behavior

  • There should be no duplication of events.

Screenshots

  • Querying the SQLite DB showed that, for each row of a channel configured in the input plugin, bookmark_xml also holds the offsets of all channels given in the event_query.

  • 3 channels in both channels and event_query: [screenshot: 3 channels in config and event query]

  • 2 channels in both channels and event_query: [screenshot: 2 channels in config and event_query]

  • 2 channels in channels and an empty event_query: [screenshot: 2 channels in config only]
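
The check described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the plugin's own tooling: the table name `in_winevtlog_channels` and the columns `name` and `bookmark_xml` are assumed here and should be verified against the actual checkpoint DB, and the bookmark is assumed to be a Windows-style `<BookmarkList>` with `<Bookmark Channel="...">` elements. The helper names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def bookmark_channels(bookmark_xml: str) -> list[str]:
    """Return the Channel attribute of every <Bookmark> element in the XML."""
    root = ET.fromstring(bookmark_xml)
    return [b.get("Channel") for b in root.iter("Bookmark") if b.get("Channel")]


def foreign_bookmarks(conn: sqlite3.Connection,
                      table: str = "in_winevtlog_channels") -> dict[str, list[str]]:
    """Map each row's channel name to bookmarked channels that are not its own.

    The table and column names are assumptions; adjust them to the real schema.
    """
    rows = conn.execute(f"SELECT name, bookmark_xml FROM {table}").fetchall()
    report = {}
    for name, xml_text in rows:
        if not xml_text:
            continue  # no bookmark stored yet for this channel
        extra = [c for c in bookmark_channels(xml_text)
                 if c.lower() != name.lower()]
        if extra:
            report[name] = extra
    return report
```

Calling `foreign_bookmarks(sqlite3.connect(path_to_db))` on a copy of the checkpoint DB would list, per configured channel, any other channels whose offsets ended up in that row's bookmark.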

Your Environment

  • Version used: 3.0.2
  • Operating System and version: Windows Server 2022
  • Filters and plugins: lua, modify

Hardik-Parikh avatar Apr 22 '24 13:04 Hardik-Parikh

I reproduced your issue and found a workaround for this case:

pipeline:
  inputs:
    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: application
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="Application">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>

    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: security
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="Security">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>

    - name: winevtlog
      tag: some-tag
      alias: WIndows alias
      channels: system
      interval_sec: 5
      read_existing_events: true
      db: .\checkpoint.db
      render_event_as_xml: true
      read_limit_per_cycle: 2m
      event_query: |
        <QueryList>
          <Query Id="0" Path="Application">
            <Select Path="System">*[System[TimeCreated[@SystemTime&gt;='2024-04-22T07:30:22.000Z' and @SystemTime&lt;='2024-04-22T09:30:22.999Z']]]</Select>
          </Query>
        </QueryList>

For now, event_query needs to be defined per channel. This is because the bookmark forcibly restores the information used to subscribe to channels, which is not the expected behavior. Defining the inputs one by one keeps the conditions that filter and collect Windows EventLogs from getting mixed up.
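
For setups with many channels, the one-input-per-channel layout can be generated rather than written by hand. The following Python sketch is only an illustration: the function names are hypothetical, the settings are copied from the workaround config, and it sets each Query's Path to the same channel as its Select, which is a choice made here and differs from the workaround, where every Query keeps Path="Application".

```python
from textwrap import indent


def select_xpath(start: str, end: str) -> str:
    """Time-window filter, XML-escaped the same way as in the issue's query."""
    return ("*[System[TimeCreated[@SystemTime&gt;='" + start
            + "' and @SystemTime&lt;='" + end + "']]]")


def input_stanza(channel: str, path: str, start: str, end: str,
                 db: str = r".\checkpoint.db") -> str:
    """Render one winevtlog input whose event_query selects a single channel."""
    query = "\n".join([
        "<QueryList>",
        f'  <Query Id="0" Path="{path}">',
        f'    <Select Path="{path}">{select_xpath(start, end)}</Select>',
        "  </Query>",
        "</QueryList>",
    ])
    lines = [
        "- name: winevtlog",
        "  tag: some-tag",
        f"  channels: {channel}",
        "  interval_sec: 5",
        "  read_existing_events: true",
        f"  db: {db}",
        "  render_event_as_xml: true",
        "  read_limit_per_cycle: 2m",
        "  event_query: |",
    ]
    # The block scalar body must be indented deeper than the event_query key.
    return "\n".join(lines) + "\n" + indent(query, "    ")


def render_inputs(channels: list[tuple[str, str]], start: str, end: str) -> str:
    """Render a pipeline.inputs section with one stanza per (channel, path)."""
    stanzas = "\n\n".join(input_stanza(c, p, start, end) for c, p in channels)
    return "pipeline:\n  inputs:\n" + indent(stanzas, "    ") + "\n"
```

For example, `render_inputs([("application", "Application"), ("security", "Security"), ("system", "System")], "2024-04-22T07:30:22.000Z", "2024-04-22T09:30:22.999Z")` returns a YAML document with three inputs, each with a single Select.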

cosmo0920 avatar Apr 23 '24 06:04 cosmo0920

If channels accepts multiple values, Fluent Bit should ideally have a separate query stanza per channel in the configuration file. Is that the correct understanding?

The above workaround might work, but users who already use this configuration are probably already hitting this issue. Should the workaround be documented until the bug is fixed?

harshnasitcrest avatar Apr 23 '24 07:04 harshnasitcrest

> If channels accepts multiple values, Fluent Bit should ideally have a separate query stanza per channel in the configuration file. Is that the correct understanding?

Ideally, that's correct. However, Fluent Bit does not have that capability for now.

> The above workaround might work, but users who already use this configuration are probably already hitting this issue. Should the workaround be documented until the bug is fixed?

TBH, I have never heard of anyone struggling with this, because a QueryList in its XML representation is difficult to put into Fluent Bit configurations. Most users probably use simpler configurations than yours.

cosmo0920 avatar Apr 23 '24 08:04 cosmo0920