[Filebeat] AWS S3 input split JSON arrays into multiple events

Open legoguy1000 opened this issue 3 years ago • 10 comments

What does this PR do?

Mirror the behavior of the Azure Event Hub input: if the message is a JSON array of objects, split it into multiple events, one per array element.
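As a rough illustration of the splitting described above, here is a minimal Go sketch. It is not the code from this PR: `splitJSONArray` is a hypothetical helper, and it splits any top-level JSON array rather than checking that each element is an object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// splitJSONArray is a hypothetical helper, not the Beats implementation.
// If raw is a top-level JSON array, it returns one raw message per element.
// Anything else is passed through unchanged as a single message.
func splitJSONArray(raw []byte) []json.RawMessage {
	var items []json.RawMessage
	if err := json.Unmarshal(raw, &items); err != nil {
		// Not an array (or not valid JSON): keep it as one event.
		return []json.RawMessage{raw}
	}
	return items
}

func main() {
	// Each array element becomes its own event.
	for _, event := range splitJSONArray([]byte(`[{"id":1},{"id":2}]`)) {
		fmt.Println(string(event))
	}
}
```

Using `json.RawMessage` keeps each element's bytes as they appeared in the input, so later decoding stages see the same JSON they would have received for a standalone message.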

Why is it important?

Checklist

  • [X] My code follows the style guidelines of this project
  • [X] I have commented my code, particularly in hard-to-understand areas
  • [X] I have made corresponding changes to the documentation
  • [X] I have made corresponding change to the default configuration files
  • [X] I have added tests that prove my fix is effective or that my feature works
  • [X] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

  • Relates #29951

Use cases

Screenshots

Logs

legoguy1000 avatar Jun 09 '22 23:06 legoguy1000

:grey_exclamation: Build Aborted

The PR is not allowed to run in the CI yet

Build stats

  • Start Time: 2022-06-14T20:40:32.141+0000

  • Duration: 4 min 29 sec

Steps errors 1

Error signal
  • Took 0 min 0 sec
  • Description: githubApiCall: The REST API call https://api.github.com/orgs/elastic/members/legoguy1000 return the message : java.lang.Exception: httpRequest: Failure connecting to the service https://api.github.com/orgs/elastic/members/legoguy1000 : httpRequest: Failure connecting to the service https://api.github.com/orgs/elastic/members/legoguy1000 : Code: 404Error: {"message":"User does not exist or is not a member of the organization","documentation_url":"https://docs.github.com/rest/reference/orgs#check-organization-membership-for-a-user"}

:robot: GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

elasticmachine avatar Jun 09 '22 23:06 elasticmachine

@legoguy1000, this would be a breaking change as it is :)

My personal opinion is that it should be an opt-in feature, enabled through a config flag.

@kaiyan-sheng, @andrewkroh what do you think?

paolafrancesca avatar Jun 10 '22 02:06 paolafrancesca

As I was adding the changelog entry I had an inkling that it could be a breaking change. I don't really care how it's implemented; I care about parity between inputs. Event Hub splits arrays and expands a sub key, AWS only expands a sub key, and filestream can do neither. That is what drove my other related PR to create a split module, now a parser.

legoguy1000 avatar Jun 10 '22 02:06 legoguy1000

I do think an explicit configuration is needed to avoid a breaking change. How about requiring this to split an array of objects (inspired by jq):

expand_event_list_from_field: .[]

andrewkroh avatar Jun 10 '22 02:06 andrewkroh
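The opt-in proposed in the comment above can be sketched in Go as follows. This is only an illustration under stated assumptions, not code from Beats: `expandEvents` is a hypothetical helper, and it assumes an empty setting means "do not expand", `.[]` means "split a top-level array", and any other value names a key whose array should be expanded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// expandEvents is a hypothetical helper, not code from Beats.
//   - field == "":    keep the message as one event (no expansion).
//   - field == ".[]": the message must be a top-level JSON array;
//     emit one event per element.
//   - otherwise:      the message must be a JSON object; emit one event
//     per element of the array stored under that key.
func expandEvents(raw []byte, field string) ([]json.RawMessage, error) {
	switch field {
	case "":
		return []json.RawMessage{raw}, nil
	case ".[]":
		var items []json.RawMessage
		if err := json.Unmarshal(raw, &items); err != nil {
			return nil, fmt.Errorf("expected a top-level JSON array: %w", err)
		}
		return items, nil
	default:
		var obj map[string]json.RawMessage
		if err := json.Unmarshal(raw, &obj); err != nil {
			return nil, fmt.Errorf("expected a JSON object: %w", err)
		}
		var items []json.RawMessage
		// A missing key yields an empty RawMessage, which fails to decode.
		if err := json.Unmarshal(obj[field], &items); err != nil {
			return nil, fmt.Errorf("field %q is not a JSON array: %w", field, err)
		}
		return items, nil
	}
}

func main() {
	msg := []byte(`[{"a":1},{"a":2}]`)
	off, _ := expandEvents(msg, "")   // setting absent: one event
	on, _ := expandEvents(msg, ".[]") // setting present: one event per element
	fmt.Println(len(off), len(on))
}
```

Because the array split only happens when the setting is explicitly `.[]`, existing configurations keep their current behaviour, which is the point of gating the feature.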

> I do think an explicit configuration is needed to avoid a breaking change

Yes, as long as the behaviour does not apply automatically, it's not needed. Your proposal with expand_event_list_from_field: .[] is fine for me.

paolafrancesca avatar Jun 13 '22 01:06 paolafrancesca

Agree on the expand_event_list_from_field parameter! @legoguy1000 Thank you so much for adding this!

kaiyan-sheng avatar Jun 14 '22 18:06 kaiyan-sheng

Pinging @elastic/integrations (Team:Integrations)

elasticmachine avatar Jun 16 '22 14:06 elasticmachine

This pull request now has conflicts. Could you fix it? 🙏 To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b awss3-json-array upstream/awss3-json-array
git merge upstream/main
git push upstream awss3-json-array

mergify[bot] avatar Jun 16 '22 21:06 mergify[bot]

This pull request does not have a backport label. If this is a bug or security fix, could you label this PR @legoguy1000? 🙏. For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

mergify[bot] avatar Aug 16 '22 15:08 mergify[bot]

Closing because https://github.com/elastic/beats/pull/35475 implemented this.

andrewkroh avatar Jun 27 '23 16:06 andrewkroh

Closing according to @andrewkroh's comment.

jlind23 avatar Nov 29 '23 15:11 jlind23