logstash
logstash copied to clipboard
Fix/avoid oom accumulation in bufftok
TODO to be applied on top of #17229
Release notes
What does this PR do?
Why is it important/What is the impact to the user?
Checklist
- [ ] My code follows the style guidelines of this project
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
- [ ] I have added tests that prove my fix is effective or that my feature works
Author's Checklist
- [ ]
How to test this PR locally
1 Test on stdin
Test running the same configuration as in https://github.com/logstash-plugins/logstash-codec-json_lines/pull/43 against this branch, it shouldn't go in OOM.
2 Test on TCP
Use the 1Gb big one line file generated at step 1, with same heap limitation. Run Logstash with following pipeline:
input {
tcp {
port => 1234
codec => json_lines {
decode_size_limit_bytes => 32768
}
}
}
output {
stdout {
codec => rubydebug
}
}
and feed data in with netcat, like in:
cat /path/to/big_single_line.json | netcat localhost 1234
3 Test with file input
Use the 1Gb big one line file generated at step 1, with same heap limitation. Run Logstash with following pipeline:
input {
file {
path => "/path/to/big_single_line.json"
sincedb_path => "/tmp/sincedb"
mode => "read"
file_completed_action => "log"
file_completed_log_path => "/tmp/processed.log"
codec => json_lines {
decode_size_limit_bytes => 32768
}
}
}
output {
stdout {
codec => rubydebug
}
}
Related issues
- Closes #17275
Logs
Before this PR, but with branch in #17229
[2025-03-11T14:57:28,792][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid18156.hprof ...
Heap dump file created [450806605 bytes in 0.253 secs]
[2025-03-11T14:59:11,919][FATAL][org.logstash.Logstash ][main][59381166257a19a15b5377e6f323f1a71da7efdffecf2017efc6813660a5656e] uncaught error (in thread [main]<stdin)
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3541) ~[?:?]
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:242) ~[?:?]
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:587) ~[?:?]
at java.lang.StringBuilder.append(StringBuilder.java:179) ~[?:?]
at org.logstash.common.BufferedTokenizer$DataSplitter.append(BufferedTokenizer.java:102) ~[logstash-core.jar:?]
at org.logstash.common.BufferedTokenizer.extract(BufferedTokenizer.java:135) ~[logstash-core.jar:?]
at org.logstash.common.BufferedTokenizerExt.extract(BufferedTokenizerExt.java:83) ~[logstash-core.jar:?]
at java.lang.invoke.LambdaForm$DMH/0x00000008007dd400.invokeVirtual(LambdaForm$DMH) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x00000008007f5000.invoke(LambdaForm$MH) ~[?:?]
at java.lang.invoke.DelegatingMethodHandle$Holder.delegate(DelegatingMethodHandle$Holder) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x00000008007e4800.guard(LambdaForm$MH) ~[?:?]
at java.lang.invoke.DelegatingMethodHandle$Holder.delegate(DelegatingMethodHandle$Holder) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x00000008007e4800.guard(LambdaForm$MH) ~[?:?]
at java.lang.invoke.Invokers$Holder.linkToCallSite(Invokers$Holder) ~[?:?]
at Users.andrea.workspace.logstash_plugins.logstash_minus_codec_minus_json_lines.lib.logstash.codecs.json_lines.RUBY$method$decode$0(/Users/andrea/workspace/logstash_plugins/logstash-codec-json_lines/lib/logstash/codecs/json_lines.rb:69) ~[?:?]
at java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(DirectMethodHandle$Holder) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x0000000800bda800.invoke(LambdaForm$MH) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x000000080084b400.invoke(LambdaForm$MH) ~[?:?]
at java.lang.invoke.DelegatingMethodHandle$Holder.delegate(DelegatingMethodHandle$Holder) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x00000008007e8000.guard(LambdaForm$MH) ~[?:?]
at java.lang.invoke.DelegatingMethodHandle$Holder.delegate(DelegatingMethodHandle$Holder) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x00000008007e8000.guard(LambdaForm$MH) ~[?:?]
at java.lang.invoke.Invokers$Holder.linkToCallSite(Invokers$Holder) ~[?:?]
at Users.andrea.workspace.logstash_andsel.vendor.bundle.jruby.$3_dot_1_dot_0.gems.logstash_minus_input_minus_stdin_minus_3_dot_4_dot_0.lib.logstash.inputs.stdin.RUBY$method$process$0(/Users/andrea/workspace/logstash_andsel/vendor/bundle/jruby/3.1.0/gems/logstash-input-stdin-3.4.0/lib/logstash/inputs/stdin.rb:61) ~[?:?]
at java.lang.invoke.LambdaForm$DMH/0x0000000800fc1000.invokeStatic(LambdaForm$DMH) ~[?:?]
at java.lang.invoke.LambdaForm$MH/0x00000008008dd000.invokeExact_MT(LambdaForm$MH) ~[?:?]
at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:178) ~[jruby.jar:?]
at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:222) ~[jruby.jar:?]
at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:228) ~[jruby.jar:?]
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:291) ~[jruby.jar:?]
at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:324) ~[jruby.jar:?]
at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:66) ~[jruby.jar:?]
With this fix:
[2025-03-11T15:05:22,081][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
{
"message" => "Payload bigger than 32768 bytes",
"@timestamp" => 2025-03-11T14:05:22.944113Z,
"host" => {
"hostname" => "andreas-MBP-2.station"
},
"event" => {
"original" => "orn\",\"surname\":\"Borg\"},{\"name\":\"John\",\"surname\":\"McEnroe\"},{\"name\":\"Jimmy\",\"surname\":\"Connors\"}],\"field_2621593\": [{\"name\":\"Jannik\",\"surname\":\"Sinner\"},{\"name\":\"Novak\",\"surname\":\"Djokovic\"},{\"name\":\"Rafa\",\"surname\":\"Nadal\"},{\"name\":\"Roger\",\"surname\":\"Federer\"},{\"name\":\"Pete\",\"surname\":\"Sampras\"},{\"name\":\"Andr\xC3\xA9\",\"surname\":\"Agassi\"},{\"name\":\"Rod\",\"surname\":\"Laver\"},{\"name\":\"Ivan\",\"surname\":\"Lendl\"},{\"name\":\"Bjorn\",\"surname\":\"Borg\"},{\"name\":\"John\",\"surname\":\"McEnroe\"},{\"name\":\"Jimmy\",\"surname\":\"Connors\"}]}\r\n"
},
"tags" => [
[0] "_jsonparsetoobigfailure"
],
"@version" => "1"
}
[2025-03-11T15:05:23,115][INFO ][logstash.javapipeline ][main] Pipeline terminated {"pipeline.id"=>"main"}
[2025-03-11T15:05:23,599][INFO ][logstash.pipelinesregistry] Removed pipeline from registry successfully {:pipeline_id=>:main}
[2025-03-11T15:05:23,609][INFO ][logstash.runner ] Logstash shut down.
This pull request does not have a backport label. Could you fix it @andsel? 🙏 To fixup this pull request, you need to add the backport labels for the needed branches, such as:
backport-8./dis the label to automatically backport to the8./dbranch./dis the digit.backport-8.xis the label to automatically backport to the8.xbranch.- If no backport is necessary, please add the
backport-skiplabel
Quality Gate passed
Issues
0 New issues
0 Fixed issues
0 Accepted issues
Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code
:yellow_heart: Build succeeded, but was flaky
- Buildkite Build
- Commit: 71daf70a05e1e1ddaf13a90d4ef7898f75c923ec
Failed CI Steps
History
- :green_heart: Build #2559 succeeded e23c26327db7d42795faed6d72c7456c1532b2ed
- :broken_heart: Build #2558 failed fc3f3e46cb71037c83943e4a1a9c574e04c5241d
- :broken_heart: Build #2557 failed eea9f37ee25a76e2cc89634cc48a7bae6a04ab7d
cc @andsel
Closed because squashed into the parent PR #17229