Backport PR #16482 to 8.16: Bugfix for BufferedTokenizer to completely consume lines in case of lines bigger than sizeLimit

Open github-actions[bot] opened this issue 4 months ago • 2 comments

Backport PR #16482 to 8.16 branch, original message:

Release notes

[rn:skip]

What does this PR do?

Updates BufferedTokenizerExt so that it can accumulate token fragments coming from different data segments. When a "buffer full" condition is matched, it records this state in a local field so that, on the next data segment, it can consume all the token fragments up to the next token delimiter. Changes the accumulation variable from a RubyArray containing strings to a StringBuilder that holds the head token, while the remaining token fragments are stored in the input array. Ports the tests present at https://github.com/elastic/logstash/blob/f35e10d79251b4ce3a5a0aa0fbb43c2e96205ba1/logstash-core/spec/logstash/util/buftok_spec.rb#L20 to Java.
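
The behaviour described above can be illustrated with a minimal, standalone sketch. This is not the BufferedTokenizerExt code: the class name `OversizeSkippingTokenizer`, its fields, and its `extract` signature are hypothetical, and it silently drops oversized tokens instead of signalling an error to the caller. It also assumes the delimiter never straddles two data segments. It only shows the idea of keeping the head token in a `StringBuilder` and remembering the "buffer full" state in a field, so that the next segment is consumed up to the next delimiter.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch only (hypothetical class, not the Logstash implementation).
 * Accumulates token fragments across data segments and skips any token
 * whose length exceeds sizeLimit, resuming at the next delimiter.
 */
final class OversizeSkippingTokenizer {
    private final String delimiter;
    private final int sizeLimit;
    // Head token accumulated so far (fragments received without a delimiter yet).
    private final StringBuilder head = new StringBuilder();
    // True while we are discarding the rest of an oversized token.
    private boolean bufferFull = false;

    OversizeSkippingTokenizer(String delimiter, int sizeLimit) {
        if (delimiter == null || delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        this.delimiter = delimiter;
        this.sizeLimit = sizeLimit;
    }

    /** Feeds one data segment and returns the complete tokens found in it. */
    List<String> extract(String segment) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = segment.indexOf(delimiter, start)) >= 0) {
            String fragment = segment.substring(start, idx);
            if (bufferFull) {
                // This fragment is the tail of an oversized token: drop it
                // and resume normal processing after the delimiter.
                bufferFull = false;
            } else if (head.length() + fragment.length() <= sizeLimit) {
                head.append(fragment);
                tokens.add(head.toString());
            }
            // Otherwise the completed token is too large and is dropped.
            head.setLength(0);
            start = idx + delimiter.length();
        }
        // Remainder of the segment: a fragment whose delimiter has not arrived yet.
        String rest = segment.substring(start);
        if (!bufferFull) {
            if (head.length() + rest.length() > sizeLimit) {
                // Record the "buffer full" state; following fragments are
                // discarded until the next delimiter is seen.
                bufferFull = true;
                head.setLength(0);
            } else {
                head.append(rest);
            }
        }
        return tokens;
    }
}
```

With `"\n"` as the delimiter and a limit of 5, feeding the segments `"ab"`, `"c\nde"`, `"fghijk"` and `"lmn\nxy\n"` in that order returns no tokens, then `abc`, then no tokens, then `xy`: the oversized line starting with `de` is consumed completely, and tokenization resumes cleanly at `xy`.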

Why is it important/What is the impact to the user?

Fixes the behaviour of the tokenizer so that it works properly when "buffer full" conditions are met.

Checklist

[x] My code follows the style guidelines of this project
[x] I have commented my code, particularly in hard-to-understand areas
~~[ ] I have made corresponding changes to the documentation~~
~~[ ] I have made corresponding change to the default configuration files (and/or docker env variables)~~
[x] I have added tests that prove my fix is effective or that my feature works

Author's Checklist

[x] test as described in #16483

How to test this PR locally

Follow the instructions in #16483

Related issues

Closes #16483

Use cases

Screenshots

Logs

Oct 17 '24 11:10 github-actions[bot]
