enhancement(file source): Add optional byte offset to file source events

Open sproberts92 opened this issue 2 years ago • 12 comments

Motivation: If an application produces log lines faster than the resolution of the timestamp used, additional information is needed to maintain ordering and to form part of a unique document ID used in downstream services such as Elasticsearch.

Off by default, turned on by setting a value in config for offset_key.

This addresses issue #6633.

sproberts92 avatar Jul 03 '22 09:07 sproberts92

CLA assistant check
All committers have signed the CLA.

bits-bot avatar Jul 03 '22 09:07 bits-bot

Deploy Preview for vector-project canceled.

Name Link
Latest commit 1b4387018d63cf994ae6c5cedf21053a7e7bcaff
Latest deploy log https://app.netlify.com/sites/vector-project/deploys/62fba9b2b8c0bd0009017426

netlify[bot] avatar Jul 03 '22 09:07 netlify[bot]

Soak Test Results

Baseline: be0a3571b2437b6bc23133633b8e1c630b583358 Comparison: 1ca60b8412015bf013f1c42eec62440e3e533b10 Total Vector CPUs: 4

Explanation

A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests is to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate, units are scaled per-core.

The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparison SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that the baseline is faster; positive values mean that the comparison is faster. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
syslog_humio_logs 376.95KiB 2.32 100.00% 15.89MiB 883.6KiB 18.04KiB 0 0.0542789 16.26MiB 781.33KiB 15.98KiB 0 0.0469102 False False
syslog_log2metric_splunk_hec_metrics 141.68KiB 0.82 100.00% 16.84MiB 793.95KiB 16.18KiB 0 0.0460273 16.98MiB 991.02KiB 20.18KiB 0 0.0569835 False False
http_text_to_http_json 332.58KiB 0.79 100.00% 40.98MiB 1.17MiB 24.41KiB 0 0.0284887 41.31MiB 1.27MiB 26.65KiB 0 0.030856 False False
http_pipelines_blackhole_acks 5.49KiB 0.45 94.30% 1.2MiB 113.75KiB 2.31KiB 0 0.0924853 1.21MiB 84.58KiB 1.72KiB 0 0.0684638 False False
datadog_agent_remap_blackhole_acks 246.49KiB 0.39 89.99% 62.51MiB 5.75MiB 119.7KiB 0 0.0919724 62.75MiB 4.31MiB 90.2KiB 0 0.0687136 False False
syslog_regex_logs2metric_ddmetrics 22.92KiB 0.18 85.12% 12.16MiB 534.78KiB 10.9KiB 0 0.0429306 12.18MiB 566.09KiB 11.54KiB 0 0.0453601 False False
splunk_hec_indexer_ack_blackhole 21.96KiB 0.09 60.93% 23.74MiB 925.17KiB 18.81KiB 0 0.0380525 23.76MiB 851.99KiB 17.34KiB 0 0.0350112 False False
splunk_hec_to_splunk_hec_logs_noack 16.29KiB 0.07 84.68% 23.82MiB 452.51KiB 9.23KiB 0 0.0185467 23.84MiB 327.85KiB 6.69KiB 0 0.0134283 False False
splunk_hec_to_splunk_hec_logs_acks 1.65KiB 0.01 5.08% 23.75MiB 884.88KiB 18.0KiB 0 0.0363714 23.76MiB 914.82KiB 18.6KiB 0 0.0375994 False False
syslog_loki -868.15B -0.01 3.74% 14.97MiB 418.91KiB 8.57KiB 0 0.0273227 14.97MiB 783.38KiB 15.92KiB 0 0.0510974 False False
file_to_blackhole -33.32KiB -0.03 28.15% 95.34MiB 3.06MiB 63.51KiB 0 0.0321296 95.31MiB 3.23MiB 67.16KiB 0 0.0338585 False False
datadog_agent_remap_blackhole -26.52KiB -0.04 19.23% 63.7MiB 3.68MiB 76.72KiB 0 0.0576973 63.68MiB 3.71MiB 77.35KiB 0 0.0582463 False False
http_pipelines_no_grok_blackhole -6.59KiB -0.06 23.04% 11.49MiB 44.29KiB 925.88B 0 0.00376326 11.48MiB 1.08MiB 22.48KiB 0 0.0939516 False False
http_to_http_json -29.19KiB -0.12 98.09% 23.84MiB 350.14KiB 7.15KiB 0 0.014337 23.82MiB 498.48KiB 10.19KiB 0 0.0204353 False False
http_pipelines_blackhole -2.45KiB -0.14 60.81% 1.66MiB 69.19KiB 1.41KiB 0 0.0405836 1.66MiB 122.09KiB 2.49KiB 0 0.0717184 False False
fluent_elasticsearch -144.34KiB -0.18 100.00% 79.47MiB 54.12KiB 1.09KiB 0 0.000664894 79.33MiB 1.34MiB 27.55KiB 0 0.0168616 False False
splunk_hec_route_s3 -80.06KiB -0.41 76.62% 19.21MiB 2.32MiB 48.29KiB 0 0.12068 19.13MiB 2.24MiB 46.78KiB 0 0.116807 False False
datadog_agent_remap_datadog_logs_acks -331.27KiB -0.5 99.67% 64.44MiB 2.87MiB 60.14KiB 0 0.0445835 64.12MiB 4.57MiB 95.19KiB 0 0.0713073 False False
syslog_splunk_hec_logs -95.71KiB -0.59 99.22% 15.9MiB 1.19MiB 24.86KiB 0 0.0750257 15.81MiB 1.24MiB 25.94KiB 0 0.0786375 False False
http_to_http_noack -144.34KiB -0.59 100.00% 23.84MiB 250.7KiB 5.13KiB 0 0.0102656 23.7MiB 1.28MiB 26.61KiB 0 0.0538799 False False
http_to_http_acks -126.42KiB -0.69 41.39% 17.93MiB 7.74MiB 161.8KiB 0 0.431473 17.81MiB 7.97MiB 166.49KiB 0 0.447633 True True
datadog_agent_remap_datadog_logs -515.14KiB -0.76 100.00% 66.47MiB 328.63KiB 6.73KiB 0 0.00482736 65.96MiB 4.07MiB 84.79KiB 0 0.0616735 False False
socket_to_socket_blackhole -222.51KiB -1.55 100.00% 13.98MiB 115.52KiB 2.36KiB 0 0.00806584 13.77MiB 114.47KiB 2.34KiB 0 0.00811874 False False
syslog_log2metric_humio_metrics -226.14KiB -1.67 100.00% 13.24MiB 535.4KiB 10.92KiB 0 0.0394754 13.02MiB 640.48KiB 13.04KiB 0 0.0480241 False False
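The erratic check described in the explanation can be reproduced by hand from the mean and stdev columns. The following is only a sketch: the function names are invented for illustration and are not part of the soak tooling, and it assumes the coefficient of variation is the usual standard deviation divided by the mean.

```rust
/// Coefficient of variation: standard deviation divided by the mean.
/// Both arguments must use the same unit (for example MiB).
fn coefficient_of_variation(mean: f64, stdev: f64) -> f64 {
    stdev / mean
}

/// An experiment counts as erratic when its CoV is greater than 0.3,
/// the threshold quoted in the explanation text.
fn is_erratic(mean: f64, stdev: f64) -> bool {
    coefficient_of_variation(mean, stdev) > 0.3
}
```

With the rounded baseline figures for http_to_http_acks (mean 17.93MiB, stdev 7.74MiB) this gives a CoV of about 0.43, so the row is flagged erratic, while file_to_blackhole (mean 95.34MiB, stdev 3.06MiB) comes out at about 0.03 and is not.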

github-actions[bot] avatar Jul 05 '22 17:07 github-actions[bot]

Had an off-by-one issue with the offset passed in the test. The confusion came from the line variable having the newline stripped while the offset, of course, still counts it. That makes perfect sense in a real file, but it is a little less obvious in a test where the line is created synthetically. I have expanded the comment to make it clearer why the value is one higher.

To demonstrate how it looks when reading from a real file, I added a println to create_event:

$ cat test_log
0123
0123
0123
0123
0123

$ ./target/debug/vector -c config/vector.toml
line_start_offset=0 line_end_offset=5 line_len=4
line_start_offset=5 line_end_offset=10 line_len=4
line_start_offset=10 line_end_offset=15 line_len=4
line_start_offset=15 line_end_offset=20 line_len=4
line_start_offset=20 line_end_offset=25 line_len=4
{"file":"test_log","host":"foo","message":"0123","offset":"0","source_type":"file","timestamp":"2022-07-06T10:50:06.440885177Z"}
{"file":"test_log","host":"foo","message":"0123","offset":"5","source_type":"file","timestamp":"2022-07-06T10:50:06.441392827Z"}
{"file":"test_log","host":"foo","message":"0123","offset":"10","source_type":"file","timestamp":"2022-07-06T10:50:06.441807939Z"}
{"file":"test_log","host":"foo","message":"0123","offset":"15","source_type":"file","timestamp":"2022-07-06T10:50:06.442170874Z"}
{"file":"test_log","host":"foo","message":"0123","offset":"20","source_type":"file","timestamp":"2022-07-06T10:50:06.442540067Z"}

Each line has 4 characters plus a newline. We see that line_len is 4 (the newline is not counted), and the end-of-line offset of the first line is 5, as expected.
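The relationship between the three printed values can be sketched as follows. This is an illustration only, not the code in the PR: the helper name is made up, and it assumes each line ends with a single-byte `\n`.

```rust
/// Start offset of a line, given the offset just past its newline
/// (`line_end_offset`) and the length of the line with the newline
/// stripped (`line_len`).
/// Hypothetical helper; assumes a single-byte '\n' terminator.
fn line_start_offset(line_end_offset: u64, line_len: u64) -> u64 {
    // The extra 1 accounts for the newline that was stripped from the
    // line but is still counted in the end offset.
    line_end_offset - line_len - 1
}
```

For the first line above, `line_start_offset(5, 4)` yields 0, and for the last line `line_start_offset(25, 4)` yields 20, matching the offset fields in the emitted events.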

sproberts92 avatar Jul 06 '22 14:07 sproberts92

It's also worth calling out the fact that I've gone for the offset at the beginning of the line, rather than accepting the end-of-line offset that's already in scope. I feel that's more intuitive for users (myself included), but it's not strictly necessary to fulfill the original motivation. Curious to hear your thoughts on that choice.

sproberts92 avatar Jul 06 '22 14:07 sproberts92

Soak Test Results

Baseline: 5232a05032b709c6951c901c1fcab4f6abedc4b7 Comparison: 67ca5cd19d181f457d8c156876777fa896ffcaa9 Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
http_text_to_http_json 1.31MiB 3.27 100.00% 39.92MiB 1.01MiB 21.03KiB 0 0.025197 41.22MiB 1.03MiB 21.56KiB 0 0.0249967 False False
http_pipelines_blackhole_acks 23.14KiB 1.91 100.00% 1.19MiB 103.87KiB 2.11KiB 0 0.085528 1.21MiB 77.95KiB 1.59KiB 0 0.0629873 False False
splunk_hec_route_s3 252.58KiB 1.28 99.99% 19.23MiB 2.21MiB 46.11KiB 0 0.115092 19.47MiB 2.12MiB 44.25KiB 0 0.108643 False False
syslog_loki 125.01KiB 0.82 100.00% 14.84MiB 349.08KiB 7.15KiB 0 0.0229598 14.97MiB 811.98KiB 16.51KiB 0 0.0529695 False False
syslog_log2metric_splunk_hec_metrics 134.8KiB 0.76 100.00% 17.31MiB 817.75KiB 16.65KiB 0 0.0461258 17.44MiB 781.8KiB 15.91KiB 0 0.0437649 False False
http_pipelines_blackhole 10.15KiB 0.63 99.82% 1.58MiB 77.72KiB 1.59KiB 0 0.0480787 1.59MiB 138.83KiB 2.83KiB 0 0.0853448 False False
syslog_splunk_hec_logs 80.47KiB 0.49 99.94% 16.02MiB 853.17KiB 17.37KiB 0 0.0519988 16.1MiB 774.75KiB 15.81KiB 0 0.0469888 False False
syslog_regex_logs2metric_ddmetrics 60.31KiB 0.48 99.99% 12.35MiB 594.54KiB 12.11KiB 0 0.0470145 12.41MiB 493.32KiB 10.06KiB 0 0.0388255 False False
syslog_humio_logs 63.09KiB 0.38 100.00% 16.12MiB 543.43KiB 11.09KiB 0 0.0329158 16.18MiB 521.98KiB 10.7KiB 0 0.0314966 False False
socket_to_socket_blackhole 41.77KiB 0.3 100.00% 13.69MiB 94.67KiB 1.93KiB 0 0.00675094 13.73MiB 119.53KiB 2.44KiB 0 0.00849809 False False
splunk_hec_to_splunk_hec_logs_noack 27.49KiB 0.11 96.47% 23.81MiB 547.06KiB 11.16KiB 0 0.0224332 23.84MiB 331.33KiB 6.76KiB 0 0.0135717 False False
splunk_hec_indexer_ack_blackhole -7.51KiB -0.03 23.60% 23.76MiB 852.63KiB 17.35KiB 0 0.0350374 23.75MiB 886.55KiB 18.04KiB 0 0.0364425 False False
file_to_blackhole -57.85KiB -0.06 48.88% 95.37MiB 2.64MiB 54.8KiB 0 0.0277114 95.31MiB 3.31MiB 68.92KiB 0 0.0347526 False False
splunk_hec_to_splunk_hec_logs_acks -25.49KiB -0.1 70.66% 23.76MiB 787.41KiB 16.02KiB 0 0.0323519 23.74MiB 895.69KiB 18.21KiB 0 0.0368389 False False
http_to_http_json -31.62KiB -0.13 98.68% 23.84MiB 357.01KiB 7.29KiB 0 0.0146197 23.81MiB 512.21KiB 10.46KiB 0 0.021002 False False
fluent_elasticsearch -131.93KiB -0.16 100.00% 79.47MiB 54.12KiB 1.09KiB 0 0.000664833 79.34MiB 1.25MiB 25.67KiB 0 0.0157115 False False
datadog_agent_remap_datadog_logs_acks -174.56KiB -0.27 86.96% 63.19MiB 2.95MiB 61.68KiB 0 0.0466157 63.02MiB 4.68MiB 97.51KiB 0 0.0743236 False False
syslog_log2metric_humio_metrics -57.36KiB -0.42 100.00% 13.45MiB 368.96KiB 7.53KiB 0 0.0267803 13.4MiB 430.07KiB 8.76KiB 0 0.0313467 False False
http_to_http_noack -113.36KiB -0.46 100.00% 23.83MiB 509.47KiB 10.4KiB 0 0.020875 23.72MiB 1.23MiB 25.54KiB 0 0.0516768 False False
http_to_http_acks -84.5KiB -0.46 28.80% 17.89MiB 8.13MiB 169.92KiB 0 0.454214 17.8MiB 7.35MiB 153.37KiB 0 0.412716 True True
datadog_agent_remap_datadog_logs -610.68KiB -0.92 100.00% 64.92MiB 634.22KiB 12.98KiB 0 0.00953773 64.33MiB 4.53MiB 94.33KiB 0 0.070432 False False
http_pipelines_no_grok_blackhole -111.22KiB -0.94 100.00% 11.6MiB 39.29KiB 821.67B 0 0.00330781 11.49MiB 1.08MiB 22.44KiB 0 0.0937333 False False
datadog_agent_remap_blackhole_acks -1.56MiB -2.37 100.00% 65.86MiB 5.77MiB 120.06KiB 0 0.0875565 64.3MiB 4.33MiB 90.59KiB 0 0.0673278 False False
datadog_agent_remap_blackhole -2.03MiB -3.11 100.00% 65.27MiB 4.33MiB 90.17KiB 0 0.0662842 63.24MiB 3.62MiB 75.56KiB 0 0.0572862 False False

github-actions[bot] avatar Jul 06 '22 17:07 github-actions[bot]

@jszwedko Ping !

atibdialpad avatar Jul 08 '22 10:07 atibdialpad

> @jszwedko Ping !

Apologies for the delay! We'll get this reviewed next week. The changes look straightforward enough.

jszwedko avatar Jul 08 '22 21:07 jszwedko

Thanks for the review @bruceg, I should have some time to make these adjustments on the weekend.

sproberts92 avatar Jul 13 '22 14:07 sproberts92

Soak Test Results

Baseline: 788078497177b949d8983bee43e782e43ed4d34c Comparison: 71ca83e0b795886998a4df7ef8ee5dc3b17c4a55 Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
http_text_to_http_json 3.03MiB 7.77 100.00% 38.93MiB 896.87KiB 18.3KiB 0 0.0224906 41.96MiB 767.57KiB 15.67KiB 0 0.0178602 False False
datadog_agent_remap_blackhole 2.96MiB 5.24 100.00% 56.6MiB 4.74MiB 98.62KiB 0 0.0836591 59.56MiB 4.55MiB 94.96KiB 0 0.0763825 False False
datadog_agent_remap_datadog_logs_acks 2.45MiB 3.87 100.00% 63.24MiB 3.65MiB 76.29KiB 0 0.0577729 65.69MiB 4.7MiB 97.77KiB 0 0.0714804 False False
datadog_agent_remap_datadog_logs 1.91MiB 3.02 100.00% 63.37MiB 1.05MiB 22.01KiB 0 0.016558 65.29MiB 4.66MiB 97.04KiB 0 0.071375 False False
splunk_hec_route_s3 510.13KiB 2.64 100.00% 18.9MiB 2.24MiB 46.73KiB 0 0.118594 19.4MiB 2.13MiB 44.46KiB 0 0.109568 False False
http_pipelines_blackhole_acks 28.78KiB 2.4 100.00% 1.17MiB 85.77KiB 1.75KiB 0 0.0713743 1.2MiB 62.47KiB 1.27KiB 0 0.0507729 False False
syslog_loki 292.8KiB 1.94 100.00% 14.77MiB 269.07KiB 5.51KiB 0 0.0177885 15.05MiB 767.76KiB 15.61KiB 0 0.0497934 False False
socket_to_socket_blackhole 257.29KiB 1.83 100.00% 13.74MiB 126.38KiB 2.58KiB 0 0.00898072 13.99MiB 124.26KiB 2.54KiB 0 0.00867193 False False
http_pipelines_blackhole 25.08KiB 1.49 100.00% 1.64MiB 16.82KiB 351.94B 0 0.0100224 1.66MiB 91.77KiB 1.87KiB 0 0.0538675 False False
datadog_agent_remap_blackhole_acks 873.44KiB 1.4 100.00% 60.91MiB 5.16MiB 107.46KiB 0 0.0847273 61.77MiB 3.84MiB 80.18KiB 0 0.0620987 False False
http_pipelines_no_grok_blackhole 14.2KiB 0.12 42.10% 11.19MiB 463.6KiB 9.46KiB 0 0.0404373 11.21MiB 1.14MiB 23.79KiB 0 0.101919 False False
splunk_hec_to_splunk_hec_logs_noack 11.12KiB 0.05 67.94% 23.83MiB 433.48KiB 8.85KiB 0 0.017763 23.84MiB 335.78KiB 6.85KiB 0 0.0137534 False False
syslog_log2metric_humio_metrics 6.97KiB 0.05 34.84% 13.0MiB 434.35KiB 8.87KiB 0 0.0326291 13.0MiB 620.1KiB 12.62KiB 0 0.046559 False False
splunk_hec_indexer_ack_blackhole 4.14KiB 0.02 12.17% 23.74MiB 940.28KiB 19.12KiB 0 0.0386692 23.75MiB 938.97KiB 19.1KiB 0 0.0386091 False False
enterprise_http_to_http -1.81KiB -0.01 20.02% 23.85MiB 246.31KiB 5.03KiB 0 0.0100845 23.85MiB 248.98KiB 5.09KiB 0 0.0101946 False False
splunk_hec_to_splunk_hec_logs_acks -6.34KiB -0.03 21.74% 23.76MiB 785.63KiB 15.99KiB 0 0.0322794 23.76MiB 811.43KiB 16.51KiB 0 0.0333483 False False
file_to_blackhole -63.47KiB -0.07 50.11% 95.35MiB 2.91MiB 60.38KiB 0 0.0305418 95.29MiB 3.46MiB 71.86KiB 0 0.0362599 False False
http_to_http_json -35.48KiB -0.15 99.52% 23.85MiB 342.59KiB 6.99KiB 0 0.0140257 23.81MiB 510.94KiB 10.44KiB 0 0.0209485 False False
fluent_elasticsearch -231.24KiB -0.28 100.00% 79.47MiB 55.11KiB 1.11KiB 0 0.000677047 79.25MiB 1.87MiB 38.41KiB 0 0.0235697 False False
http_to_http_noack -131.18KiB -0.54 100.00% 23.84MiB 406.84KiB 8.31KiB 0 0.0166642 23.71MiB 1.26MiB 26.32KiB 0 0.0532825 False False
http_to_http_acks -632.15KiB -3.37 99.56% 18.33MiB 7.11MiB 148.77KiB 0 0.388104 17.71MiB 7.9MiB 164.96KiB 0 0.44625 True True
syslog_regex_logs2metric_ddmetrics -528.02KiB -3.98 100.00% 12.94MiB 640.32KiB 13.04KiB 0 0.0483077 12.43MiB 578.16KiB 11.78KiB 0 0.0454281 False False
syslog_splunk_hec_logs -759.7KiB -4.37 100.00% 16.97MiB 735.64KiB 14.98KiB 0 0.042321 16.23MiB 726.65KiB 14.81KiB 0 0.0437148 False False
syslog_humio_logs -989.49KiB -5.65 100.00% 17.1MiB 310.37KiB 6.33KiB 0 0.0177241 16.13MiB 231.44KiB 4.74KiB 0 0.0140082 False False
syslog_log2metric_splunk_hec_metrics -1.35MiB -7.11 100.00% 19.02MiB 644.9KiB 13.14KiB 0 0.0331069 17.67MiB 836.24KiB 17.01KiB 0 0.0462136 False False

github-actions[bot] avatar Jul 19 '22 16:07 github-actions[bot]

Hi @sproberts92 ! Is this ready for re-review? Feel free to tag when it is.

jszwedko avatar Jul 26 '22 22:07 jszwedko

Hi @jszwedko, @bruceg, thanks for your patience - it can be hard to find the time to work on this in the evenings. This is ready for you to take another look now.

I've made a change to record the line's byte offset at the time it is read instead of trying to calculate it later. Hopefully this is along the lines of what you had in mind @bruceg.
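To illustrate the general idea of capturing the offset at read time (a simplified sketch using only the standard library, not the actual reader in Vector; the function name is hypothetical), each line can be paired with the byte position where it started before the position is advanced:

```rust
use std::io::{self, BufRead};

/// Reads all lines from `reader`, pairing each line (newline stripped)
/// with the byte offset at which it started.
/// Hypothetical sketch; assumes '\n' line endings.
fn lines_with_offsets<R: BufRead>(mut reader: R) -> io::Result<Vec<(u64, String)>> {
    let mut out = Vec::new();
    let mut offset: u64 = 0; // byte position of the next unread byte
    let mut buf = String::new();
    loop {
        buf.clear();
        // `read_line` returns the number of bytes read, newline included.
        let n = reader.read_line(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        let line = buf.trim_end_matches('\n').to_string();
        out.push((offset, line)); // record the start offset before advancing
        offset += n as u64;
    }
    Ok(out)
}
```

Because the offset is recorded before it is advanced, no later correction for the stripped newline is needed.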

sproberts92 avatar Aug 02 '22 13:08 sproberts92

Apologies @bruceg, I did not realise that cargo clippy does not include the tests by default. Resolved now.

sproberts92 avatar Aug 12 '22 10:08 sproberts92

At first glance the failures in the soak test don't appear to be related to my changes. Is this something you've encountered before @bruceg @jszwedko ?

sproberts92 avatar Aug 12 '22 16:08 sproberts92

Ah, yes, there was a change to master that caused a configuration breakage in the soaks. Merging with current master will fix that.

bruceg avatar Aug 12 '22 17:08 bruceg

Have rebased onto current master.

sproberts92 avatar Aug 12 '22 17:08 sproberts92

Soak Test Results

Baseline: 64f6724155e34fde78f5c4fdcfef3300b6d02e2a Comparison: 2ba34b7ea00b2b74ae5224e6037905ee05e38131 Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
syslog_log2metric_splunk_hec_metrics 700.09KiB 4.04 100.00% 16.94MiB 979.33KiB 19.97KiB 0 0.0564339 17.63MiB 948.55KiB 19.3KiB 0 0.0525404 False False
syslog_splunk_hec_logs 576.34KiB 3.55 100.00% 15.85MiB 944.32KiB 19.22KiB 0 0.0581565 16.42MiB 678.85KiB 13.84KiB 0 0.0403739 False False
syslog_regex_logs2metric_ddmetrics 349.1KiB 2.91 100.00% 11.72MiB 642.36KiB 13.08KiB 0 0.0535239 12.06MiB 671.7KiB 13.7KiB 0 0.0543864 False False
splunk_hec_route_s3 486.32KiB 2.63 100.00% 18.03MiB 2.45MiB 51.05KiB 0 0.135892 18.5MiB 2.35MiB 49.14KiB 0 0.126904 False False
syslog_humio_logs 201.94KiB 1.23 100.00% 16.09MiB 380.56KiB 7.77KiB 0 0.0230887 16.29MiB 369.8KiB 7.57KiB 0 0.0221639 False False
datadog_agent_remap_blackhole 656.47KiB 1.1 100.00% 58.08MiB 3.79MiB 78.9KiB 0 0.0651706 58.72MiB 2.86MiB 59.73KiB 0 0.0487449 False False
datadog_agent_remap_datadog_logs_acks 552.19KiB 0.87 100.00% 61.8MiB 3.19MiB 66.62KiB 0 0.0515616 62.34MiB 4.46MiB 92.85KiB 0 0.0715407 False False
http_pipelines_blackhole_acks 9.14KiB 0.77 99.78% 1.15MiB 109.95KiB 2.24KiB 0 0.093099 1.16MiB 96.31KiB 1.96KiB 0 0.0809196 False False
datadog_agent_remap_blackhole_acks 229.93KiB 0.37 97.83% 60.46MiB 4.13MiB 86.06KiB 0 0.0683387 60.68MiB 2.44MiB 51.13KiB 0 0.0402336 False False
datadog_agent_remap_datadog_logs 179.52KiB 0.29 95.84% 60.73MiB 572.83KiB 11.73KiB 0 0.00920896 60.91MiB 4.19MiB 87.26KiB 0 0.0687963 False False
http_to_http_acks 44.42KiB 0.25 14.70% 17.44MiB 8.21MiB 171.56KiB 0 0.470629 17.48MiB 8.02MiB 167.48KiB 0 0.458796 True True
syslog_loki 9.92KiB 0.07 36.61% 14.05MiB 569.68KiB 11.66KiB 0 0.0395951 14.06MiB 849.27KiB 17.26KiB 0 0.0589876 False False
splunk_hec_to_splunk_hec_logs_acks 9.98KiB 0.04 32.28% 23.75MiB 852.05KiB 17.33KiB 0 0.0350261 23.76MiB 813.41KiB 16.55KiB 0 0.0334241 False False
splunk_hec_to_splunk_hec_logs_noack 9.89KiB 0.04 62.58% 23.83MiB 428.91KiB 8.75KiB 0 0.0175746 23.84MiB 336.69KiB 6.87KiB 0 0.0137901 False False
enterprise_http_to_http -3.05KiB -0.01 31.52% 23.85MiB 259.4KiB 5.29KiB 0 0.0106206 23.84MiB 260.47KiB 5.33KiB 0 0.0106659 False False
splunk_hec_indexer_ack_blackhole -1.79KiB -0.01 5.64% 23.75MiB 864.88KiB 17.6KiB 0 0.035549 23.75MiB 889.95KiB 18.11KiB 0 0.0365821 False False
file_to_blackhole -60.3KiB -0.06 44.36% 95.34MiB 3.13MiB 64.95KiB 0 0.032858 95.28MiB 3.81MiB 79.3KiB 0 0.0399997 False False
http_to_http_json -22.93KiB -0.09 93.32% 23.84MiB 372.13KiB 7.6KiB 0 0.0152407 23.82MiB 486.03KiB 9.93KiB 0 0.0199243 False False
http_pipelines_blackhole -4.58KiB -0.28 92.52% 1.63MiB 53.6KiB 1.1KiB 0 0.0321997 1.62MiB 114.16KiB 2.33KiB 0 0.0687673 False False
http_to_http_noack -121.4KiB -0.5 100.00% 23.84MiB 258.92KiB 5.3KiB 0 0.010602 23.73MiB 1.18MiB 24.58KiB 0 0.0496972 False False
fluent_elasticsearch -442.89KiB -0.54 100.00% 79.47MiB 54.41KiB 1.1KiB 0 0.000668492 79.04MiB 4.48MiB 91.96KiB 0 0.0566508 False False
http_text_to_http_json -342.68KiB -0.87 100.00% 38.56MiB 1.03MiB 21.51KiB 0 0.0266796 38.22MiB 877.63KiB 17.92KiB 0 0.0224174 False False
http_pipelines_no_grok_blackhole -100.12KiB -0.9 100.00% 10.9MiB 324.24KiB 6.62KiB 0 0.029046 10.8MiB 1.09MiB 22.72KiB 0 0.100991 False False
syslog_log2metric_humio_metrics -131.36KiB -0.98 100.00% 13.11MiB 314.23KiB 6.41KiB 0 0.0233955 12.99MiB 443.37KiB 9.03KiB 0 0.0333365 False False
socket_to_socket_blackhole -630.72KiB -2.77 100.00% 22.21MiB 696.72KiB 14.22KiB 0 0.0306322 21.59MiB 715.36KiB 14.61KiB 0 0.0323492 False False

github-actions[bot] avatar Aug 12 '22 22:08 github-actions[bot]

Soak Test Results

Baseline: fee0950f2c0b8fd53c816954d3ca7bb97ddda357 Comparison: 1b4387018d63cf994ae6c5cedf21053a7e7bcaff Total Vector CPUs: 4

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
experiment Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
syslog_splunk_hec_logs 124.31KiB 0.73 100.00% 16.55MiB 710.15KiB 14.46KiB 0 0.0418832 16.68MiB 594.89KiB 12.14KiB 0 0.0348302 False False
syslog_log2metric_splunk_hec_metrics 128.75KiB 0.72 100.00% 17.52MiB 962.93KiB 19.62KiB 0 0.0536734 17.64MiB 950.5KiB 19.37KiB 0 0.0526034 False False
datadog_agent_remap_blackhole 348.18KiB 0.54 99.80% 62.86MiB 4.28MiB 89.21KiB 0 0.0681087 63.2MiB 3.27MiB 68.31KiB 0 0.0518012 False False
http_pipelines_blackhole_acks 5.86KiB 0.47 98.21% 1.23MiB 109.06KiB 2.22KiB 0 0.086724 1.23MiB 53.3KiB 1.09KiB 0 0.0421876 False False
http_pipelines_no_grok_blackhole 46.86KiB 0.43 95.50% 10.65MiB 461.2KiB 9.41KiB 0 0.04229 10.69MiB 1.03MiB 21.38KiB 0 0.0959591 False False
syslog_regex_logs2metric_ddmetrics 36.25KiB 0.29 98.87% 12.36MiB 521.78KiB 10.64KiB 0 0.0412311 12.39MiB 469.2KiB 9.57KiB 0 0.0369696 False False
syslog_humio_logs 42.67KiB 0.25 100.00% 16.61MiB 101.87KiB 2.08KiB 0 0.00598948 16.65MiB 119.49KiB 2.45KiB 0 0.00700795 False False
splunk_hec_route_s3 9.28KiB 0.05 11.33% 18.96MiB 2.25MiB 46.84KiB 0 0.118488 18.97MiB 2.17MiB 45.31KiB 0 0.114169 False False
splunk_hec_to_splunk_hec_logs_noack 9.5KiB 0.04 61.63% 23.83MiB 420.19KiB 8.58KiB 0 0.0172185 23.84MiB 329.75KiB 6.73KiB 0 0.013507 False False
enterprise_http_to_http -820.53B -0 8.43% 23.85MiB 259.67KiB 5.3KiB 0 0.0106319 23.85MiB 264.09KiB 5.4KiB 0 0.0108131 False False
splunk_hec_to_splunk_hec_logs_acks -10.77KiB -0.04 34.96% 23.76MiB 802.86KiB 16.34KiB 0 0.032991 23.75MiB 847.76KiB 17.24KiB 0 0.0348512 False False
splunk_hec_indexer_ack_blackhole -12.79KiB -0.05 37.55% 23.76MiB 873.94KiB 17.78KiB 0 0.0359158 23.75MiB 941.05KiB 19.14KiB 0 0.0386938 False False
file_to_blackhole -69.63KiB -0.07 49.69% 95.34MiB 3.24MiB 67.18KiB 0 0.0339845 95.27MiB 3.82MiB 79.37KiB 0 0.0400442 False False
http_to_http_json -21.04KiB -0.09 91.57% 23.84MiB 350.68KiB 7.16KiB 0 0.0143616 23.82MiB 482.31KiB 9.86KiB 0 0.0197689 False False
datadog_agent_remap_blackhole_acks -148.26KiB -0.25 64.36% 58.43MiB 5.94MiB 123.76KiB 0 0.101712 58.29MiB 4.91MiB 102.58KiB 0 0.0841599 False False
http_to_http_noack -86.25KiB -0.35 99.99% 23.84MiB 256.18KiB 5.24KiB 0 0.0104896 23.76MiB 1.01MiB 21.13KiB 0 0.0426329 False False
fluent_elasticsearch -387.75KiB -0.48 100.00% 79.47MiB 53.18KiB 1.08KiB 0 0.000653343 79.09MiB 4.23MiB 86.94KiB 0 0.05348 False False
datadog_agent_remap_datadog_logs_acks -627.83KiB -0.98 100.00% 62.5MiB 3.54MiB 73.93KiB 0 0.0566237 61.89MiB 4.47MiB 92.97KiB 0 0.0721517 False False
http_to_http_acks -185.38KiB -1.04 58.02% 17.49MiB 7.98MiB 166.88KiB 0 0.456485 17.3MiB 7.56MiB 157.89KiB 0 0.436979 True True
syslog_log2metric_humio_metrics -208.06KiB -1.57 100.00% 12.96MiB 240.14KiB 4.9KiB 0 0.0180846 12.76MiB 537.13KiB 10.93KiB 0 0.0410946 False False
http_pipelines_blackhole -27.89KiB -1.63 100.00% 1.67MiB 86.01KiB 1.76KiB 0 0.0503533 1.64MiB 135.56KiB 2.76KiB 0 0.0806777 False False
http_text_to_http_json -1.2MiB -3.06 100.00% 39.32MiB 1.12MiB 23.48KiB 0 0.0285645 38.12MiB 1.16MiB 24.23KiB 0 0.0304035 False False
datadog_agent_remap_datadog_logs -1.95MiB -3.16 100.00% 61.67MiB 2.21MiB 46.26KiB 0 0.0357793 59.72MiB 4.51MiB 94.03KiB 0 0.0755585 False False
syslog_loki -723.91KiB -4.72 100.00% 14.97MiB 404.33KiB 8.27KiB 0 0.0263639 14.27MiB 722.99KiB 14.7KiB 0 0.0494773 False False
socket_to_socket_blackhole -1.49MiB -6.28 100.00% 23.77MiB 1.02MiB 21.35KiB 0 0.0429771 22.27MiB 939.29KiB 19.18KiB 0 0.0411723 False False

github-actions[bot] avatar Aug 16 '22 16:08 github-actions[bot]

Thanks @sproberts92 ! I agree with your late assessment that the offset should be an integer. Thanks for making that change.

jszwedko avatar Aug 16 '22 20:08 jszwedko