vector
vector copied to clipboard
enhancement(throttle transform): Allow throttling by bytes
Implements #11854.
This is my first time programming in rust so some things may be strange. Feedback is greatly appreciated.
I started looking into updating the .cue files for documentation. I saw there are reusable components here, such as the encoding
configuration I would like to use. However, this currently is only scoped to sinks. The fact that I am using it here in a transform is totally new. I'm not sure whether this is a sign that I am doing something in a very strange way, or if I should just update the necessary pieces to support the encoding
feature within transforms. If someone can help me answer this, it would be a big help for me.
Deploy Preview for vector-project ready!
Name | Link |
---|---|
Latest commit | 4399f352c343517ae82e0c02ec6209f6e025552f |
Latest deploy log | https://app.netlify.com/sites/vector-project/deploys/63e3d80922e98d000897211f |
Deploy Preview | https://deploy-preview-14280--vector-project.netlify.app |
Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site settings.
Soak Test Results
Baseline: 2586b5277795962a75330bb2828947706e09e2ad Comparison: cb691eba4a22f44ab62c56290b76cc6b1a91425f Total Vector CPUs: 4
Explanation
A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core.
The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a Β±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.
No interesting changes in throughput with confidence β₯ 90.00% and absolute Ξ mean >= Β±8.87%:
Fine details of change detection per experiment.
experiment | Ξ mean | Ξ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
syslog_splunk_hec_logs | 608.32KiB | 3.77 | 100.00% | 15.75MiB | 1.03MiB | 21.38KiB | 0 | 0.0651153 | 16.34MiB | 930.16KiB | 18.96KiB | 0 | 0.0555667 | False | False |
datadog_agent_remap_blackhole | 2.15MiB | 3.51 | 100.00% | 61.35MiB | 4.05MiB | 84.38KiB | 0 | 0.0659798 | 63.5MiB | 3.23MiB | 67.36KiB | 0 | 0.0508445 | False | False |
datadog_agent_remap_blackhole_acks | 1.83MiB | 2.98 | 100.00% | 61.19MiB | 3.95MiB | 82.27KiB | 0 | 0.064524 | 63.02MiB | 3.83MiB | 79.82KiB | 0 | 0.0607075 | False | False |
syslog_loki | 321.13KiB | 2.17 | 100.00% | 14.47MiB | 613.04KiB | 12.56KiB | 0 | 0.0413662 | 14.78MiB | 918.5KiB | 18.67KiB | 0 | 0.0606632 | False | False |
syslog_log2metric_humio_metrics | 248.02KiB | 1.97 | 100.00% | 12.3MiB | 757.88KiB | 15.46KiB | 0 | 0.0601492 | 12.54MiB | 759.87KiB | 15.49KiB | 0 | 0.059143 | False | False |
http_pipelines_blackhole_acks | 22.79KiB | 1.88 | 100.00% | 1.18MiB | 131.4KiB | 2.67KiB | 0 | 0.108566 | 1.2MiB | 103.97KiB | 2.12KiB | 0 | 0.0843155 | False | False |
syslog_regex_logs2metric_ddmetrics | 229.05KiB | 1.8 | 100.00% | 12.42MiB | 728.46KiB | 14.84KiB | 0 | 0.0572878 | 12.64MiB | 643.74KiB | 13.12KiB | 0 | 0.0497294 | False | False |
syslog_humio_logs | 186.8KiB | 1.14 | 100.00% | 16.05MiB | 833.03KiB | 17.01KiB | 0 | 0.0506664 | 16.24MiB | 767.09KiB | 15.71KiB | 0 | 0.0461313 | False | False |
splunk_hec_route_s3 | 150.87KiB | 0.78 | 98.20% | 18.81MiB | 2.17MiB | 45.26KiB | 0 | 0.115517 | 18.96MiB | 2.15MiB | 44.91KiB | 0 | 0.113142 | False | False |
datadog_agent_remap_datadog_logs_acks | 395.18KiB | 0.63 | 99.95% | 61.0MiB | 3.34MiB | 69.66KiB | 0 | 0.0546646 | 61.39MiB | 4.32MiB | 89.87KiB | 0 | 0.0703187 | False | False |
http_to_http_acks | 92.46KiB | 0.52 | 30.65% | 17.32MiB | 8.26MiB | 172.63KiB | 0 | 0.476605 | 17.41MiB | 7.61MiB | 158.9KiB | 0 | 0.437065 | True | True |
http_pipelines_no_grok_blackhole | 46.35KiB | 0.41 | 96.34% | 10.93MiB | 55.81KiB | 1.14KiB | 0 | 0.00498346 | 10.98MiB | 1.06MiB | 22.13KiB | 0 | 0.0967773 | False | False |
datadog_agent_remap_datadog_logs | 78.02KiB | 0.13 | 60.47% | 59.95MiB | 1.31MiB | 27.45KiB | 0 | 0.0218366 | 60.03MiB | 4.21MiB | 87.57KiB | 0 | 0.0700408 | False | False |
splunk_hec_to_splunk_hec_logs_noack | 6.65KiB | 0.03 | 47.83% | 23.83MiB | 380.68KiB | 7.78KiB | 0 | 0.015596 | 23.84MiB | 336.72KiB | 6.87KiB | 0 | 0.0137913 | False | False |
enterprise_http_to_http | -1.06KiB | -0 | 11.55% | 23.85MiB | 251.57KiB | 5.14KiB | 0 | 0.0103004 | 23.84MiB | 254.99KiB | 5.22KiB | 0 | 0.010441 | False | False |
splunk_hec_indexer_ack_blackhole | 517.05B | 0 | 1.47% | 23.74MiB | 956.97KiB | 19.46KiB | 0 | 0.0393575 | 23.74MiB | 951.56KiB | 19.35KiB | 0 | 0.0391342 | False | False |
splunk_hec_to_splunk_hec_logs_acks | -6.48KiB | -0.03 | 20.32% | 23.75MiB | 863.01KiB | 17.55KiB | 0 | 0.0354779 | 23.74MiB | 886.87KiB | 18.04KiB | 0 | 0.0364683 | False | False |
file_to_blackhole | -53.15KiB | -0.05 | 49.64% | 95.37MiB | 2.59MiB | 53.59KiB | 0 | 0.0271021 | 95.31MiB | 2.82MiB | 58.67KiB | 0 | 0.0295701 | False | False |
http_to_http_json | -49.9KiB | -0.2 | 99.97% | 23.85MiB | 328.73KiB | 6.71KiB | 0 | 0.013459 | 23.8MiB | 585.38KiB | 11.94KiB | 0 | 0.0240158 | False | False |
fluent_elasticsearch | -209.01KiB | -0.26 | 100.00% | 79.47MiB | 53.75KiB | 1.09KiB | 0 | 0.00066033 | 79.27MiB | 1.74MiB | 35.84KiB | 0 | 0.0219808 | False | False |
http_to_http_noack | -135.69KiB | -0.56 | 100.00% | 23.84MiB | 265.4KiB | 5.43KiB | 0 | 0.0108673 | 23.71MiB | 1.24MiB | 25.87KiB | 0 | 0.052353 | False | False |
http_text_to_http_json | -317.18KiB | -0.79 | 100.00% | 39.11MiB | 1.4MiB | 29.31KiB | 0 | 0.0358426 | 38.8MiB | 1.01MiB | 21.12KiB | 0 | 0.0260347 | False | False |
http_pipelines_blackhole | -14.65KiB | -0.87 | 100.00% | 1.65MiB | 10.51KiB | 219.88B | 0 | 0.00621472 | 1.64MiB | 113.0KiB | 2.3KiB | 0 | 0.0674235 | False | False |
syslog_log2metric_splunk_hec_metrics | -257.92KiB | -1.42 | 100.00% | 17.69MiB | 884.08KiB | 18.01KiB | 0 | 0.0487961 | 17.44MiB | 1.16MiB | 24.17KiB | 0 | 0.0664929 | False | False |
socket_to_socket_blackhole | -445.29KiB | -1.91 | 100.00% | 22.82MiB | 701.42KiB | 14.32KiB | 0 | 0.0300077 | 22.39MiB | 642.29KiB | 13.11KiB | 0 | 0.0280121 | False | False |
Soak Test Results
Baseline: c745d2501e679fbc5d505a2f54c42a4e72f09a23 Comparison: f32ee88727fb0fccba3b4d680ee5c8d1fd8da084 Total Vector CPUs: 4
Explanation
A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core.
The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a Β±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.
No interesting changes in throughput with confidence β₯ 90.00% and absolute Ξ mean >= Β±8.87%:
Fine details of change detection per experiment.
experiment | Ξ mean | Ξ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
datadog_agent_remap_blackhole | 3.04MiB | 5.1 | 100.00% | 59.6MiB | 4.05MiB | 84.44KiB | 0 | 0.0679976 | 62.64MiB | 3.2MiB | 66.73KiB | 0 | 0.0510647 | False | False |
datadog_agent_remap_blackhole_acks | 2.58MiB | 4.22 | 100.00% | 61.18MiB | 4.25MiB | 88.55KiB | 0 | 0.0695017 | 63.76MiB | 3.67MiB | 76.57KiB | 0 | 0.0575279 | False | False |
syslog_loki | 435.66KiB | 2.91 | 100.00% | 14.61MiB | 279.27KiB | 5.72KiB | 0 | 0.0186607 | 15.04MiB | 757.43KiB | 15.4KiB | 0 | 0.0491792 | False | False |
syslog_log2metric_humio_metrics | 317.2KiB | 2.45 | 100.00% | 12.66MiB | 189.34KiB | 3.86KiB | 0 | 0.0146012 | 12.97MiB | 514.69KiB | 10.48KiB | 0 | 0.038742 | False | False |
datadog_agent_remap_datadog_logs_acks | 1.18MiB | 2 | 100.00% | 58.84MiB | 3.54MiB | 73.82KiB | 0 | 0.0600825 | 60.02MiB | 4.36MiB | 90.7KiB | 0 | 0.0725841 | False | False |
http_pipelines_no_grok_blackhole | 155.56KiB | 1.42 | 100.00% | 10.72MiB | 113.04KiB | 2.31KiB | 0 | 0.0102972 | 10.87MiB | 1.06MiB | 22.11KiB | 0 | 0.0976729 | False | False |
datadog_agent_remap_datadog_logs | 872.47KiB | 1.4 | 100.00% | 60.85MiB | 935.09KiB | 19.15KiB | 0 | 0.0150034 | 61.7MiB | 4.19MiB | 87.31KiB | 0 | 0.0679362 | False | False |
syslog_splunk_hec_logs | 185.8KiB | 1.11 | 100.00% | 16.33MiB | 750.24KiB | 15.27KiB | 0 | 0.0448626 | 16.51MiB | 506.66KiB | 10.35KiB | 0 | 0.0299641 | False | False |
http_text_to_http_json | 433.98KiB | 1.1 | 100.00% | 38.45MiB | 852.36KiB | 17.4KiB | 0 | 0.0216424 | 38.88MiB | 798.22KiB | 16.29KiB | 0 | 0.0200466 | False | False |
syslog_humio_logs | 83.28KiB | 0.5 | 100.00% | 16.42MiB | 424.24KiB | 8.67KiB | 0 | 0.0252312 | 16.5MiB | 434.34KiB | 8.9KiB | 0 | 0.0257045 | False | False |
syslog_log2metric_splunk_hec_metrics | 87.73KiB | 0.48 | 100.00% | 17.91MiB | 615.53KiB | 12.54KiB | 0 | 0.0335531 | 18.0MiB | 860.02KiB | 17.49KiB | 0 | 0.0466573 | False | False |
splunk_hec_route_s3 | 65.97KiB | 0.34 | 67.42% | 18.72MiB | 2.29MiB | 47.62KiB | 0 | 0.1222 | 18.78MiB | 2.26MiB | 47.32KiB | 0 | 0.120461 | False | False |
http_to_http_acks | 53.46KiB | 0.3 | 18.37% | 17.39MiB | 7.92MiB | 165.53KiB | 0 | 0.455213 | 17.45MiB | 7.67MiB | 159.94KiB | 0 | 0.439708 | True | True |
http_pipelines_blackhole_acks | 1.07KiB | 0.09 | 23.84% | 1.18MiB | 133.71KiB | 2.72KiB | 0 | 0.110355 | 1.18MiB | 109.26KiB | 2.23KiB | 0 | 0.0900974 | False | False |
splunk_hec_indexer_ack_blackhole | 17.78KiB | 0.07 | 49.42% | 23.74MiB | 959.33KiB | 19.51KiB | 0 | 0.039455 | 23.76MiB | 896.93KiB | 18.25KiB | 0 | 0.0368617 | False | False |
splunk_hec_to_splunk_hec_logs_noack | 4.12KiB | 0.02 | 30.70% | 23.83MiB | 388.99KiB | 7.94KiB | 0 | 0.0159355 | 23.84MiB | 331.58KiB | 6.77KiB | 0 | 0.0135813 | False | False |
enterprise_http_to_http | -1.72KiB | -0.01 | 18.38% | 23.85MiB | 253.93KiB | 5.18KiB | 0 | 0.0103973 | 23.84MiB | 257.76KiB | 5.27KiB | 0 | 0.0105547 | False | False |
file_to_blackhole | -42.48KiB | -0.04 | 34.82% | 95.34MiB | 2.94MiB | 61.01KiB | 0 | 0.0308625 | 95.3MiB | 3.44MiB | 71.69KiB | 0 | 0.0361286 | False | False |
splunk_hec_to_splunk_hec_logs_acks | -18.74KiB | -0.08 | 55.95% | 23.76MiB | 810.07KiB | 16.48KiB | 0 | 0.0332828 | 23.75MiB | 877.27KiB | 17.84KiB | 0 | 0.0360716 | False | False |
http_to_http_json | -54.01KiB | -0.22 | 99.99% | 23.85MiB | 333.64KiB | 6.81KiB | 0 | 0.0136599 | 23.79MiB | 605.15KiB | 12.34KiB | 0 | 0.024831 | False | False |
fluent_elasticsearch | -209.61KiB | -0.26 | 100.00% | 79.47MiB | 53.82KiB | 1.09KiB | 0 | 0.000661234 | 79.27MiB | 1.89MiB | 38.99KiB | 0 | 0.0238994 | False | False |
http_to_http_noack | -69.04KiB | -0.28 | 99.67% | 23.83MiB | 522.25KiB | 10.68KiB | 0 | 0.0213981 | 23.76MiB | 1.0MiB | 20.92KiB | 0 | 0.04219 | False | False |
http_pipelines_blackhole | -5.98KiB | -0.35 | 99.28% | 1.68MiB | 36.26KiB | 758.45B | 0 | 0.0210701 | 1.67MiB | 102.86KiB | 2.1KiB | 0 | 0.0599896 | False | False |
syslog_regex_logs2metric_ddmetrics | -133.19KiB | -1.03 | 100.00% | 12.59MiB | 639.01KiB | 13.02KiB | 0 | 0.0495621 | 12.46MiB | 727.28KiB | 14.82KiB | 0 | 0.0569971 | False | False |
socket_to_socket_blackhole | -739.41KiB | -3.18 | 100.00% | 22.7MiB | 895.33KiB | 18.28KiB | 0 | 0.038516 | 21.97MiB | 1.08MiB | 22.54KiB | 0 | 0.0490571 | False | False |
Soak Test Results
Baseline: 2c585892f243ff7eb74456e3ffbb96d292f16cb4 Comparison: b65250d408b14efe21e22b48d67ea6e9662f976a Total Vector CPUs: 4
Explanation
A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across SHAs given above. The goal of these tests are to determine, quickly, if vector performance is changed and to what degree by a pull request. Where appropriate units are scaled per-core.
The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparision SHAs, with 90.0% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a Β±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.
No interesting changes in throughput with confidence β₯ 90.00% and absolute Ξ mean >= Β±8.87%:
Fine details of change detection per experiment.
experiment | Ξ mean | Ξ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
datadog_agent_remap_blackhole_acks | 2.98MiB | 4.84 | 100.00% | 61.57MiB | 4.13MiB | 86.04KiB | 0 | 0.067073 | 64.56MiB | 2.13MiB | 44.52KiB | 0 | 0.0329122 | False | False |
datadog_agent_remap_blackhole | 2.52MiB | 4.14 | 100.00% | 60.74MiB | 4.28MiB | 89.05KiB | 0 | 0.0703917 | 63.25MiB | 2.91MiB | 60.8KiB | 0 | 0.0460502 | False | False |
syslog_loki | 529.09KiB | 3.77 | 100.00% | 13.72MiB | 558.21KiB | 11.43KiB | 0 | 0.0397193 | 14.24MiB | 882.18KiB | 17.93KiB | 0 | 0.0604933 | False | False |
http_text_to_http_json | 1.35MiB | 3.58 | 100.00% | 37.62MiB | 848.68KiB | 17.32KiB | 0 | 0.0220248 | 38.97MiB | 835.33KiB | 17.05KiB | 0 | 0.0209285 | False | False |
datadog_agent_remap_datadog_logs | 2.11MiB | 3.46 | 100.00% | 60.85MiB | 3.22MiB | 67.49KiB | 0 | 0.0529424 | 62.95MiB | 4.23MiB | 88.18KiB | 0 | 0.0672508 | False | False |
http_to_http_acks | 559.44KiB | 3.24 | 98.06% | 16.84MiB | 8.04MiB | 168.12KiB | 0 | 0.477456 | 17.38MiB | 8.15MiB | 170.15KiB | 0 | 0.468428 | True | True |
syslog_splunk_hec_logs | 458.66KiB | 2.82 | 100.00% | 15.88MiB | 891.7KiB | 18.16KiB | 0 | 0.0548111 | 16.33MiB | 708.77KiB | 14.46KiB | 0 | 0.042372 | False | False |
syslog_log2metric_splunk_hec_metrics | 310.98KiB | 1.75 | 100.00% | 17.33MiB | 722.75KiB | 14.73KiB | 0 | 0.0407113 | 17.64MiB | 870.69KiB | 17.73KiB | 0 | 0.0482003 | False | False |
syslog_regex_logs2metric_ddmetrics | 197.01KiB | 1.55 | 100.00% | 12.45MiB | 581.01KiB | 11.84KiB | 0 | 0.0455601 | 12.64MiB | 500.1KiB | 10.2KiB | 0 | 0.038619 | False | False |
syslog_humio_logs | 254.0KiB | 1.52 | 100.00% | 16.27MiB | 239.44KiB | 4.89KiB | 0 | 0.0143686 | 16.52MiB | 153.4KiB | 3.14KiB | 0 | 0.00906698 | False | False |
splunk_hec_route_s3 | 183.93KiB | 0.96 | 99.49% | 18.64MiB | 2.27MiB | 47.19KiB | 0 | 0.121524 | 18.82MiB | 2.18MiB | 45.55KiB | 0 | 0.115646 | False | False |
http_pipelines_blackhole_acks | 9.2KiB | 0.74 | 99.97% | 1.21MiB | 103.33KiB | 2.1KiB | 0 | 0.0833401 | 1.22MiB | 68.97KiB | 1.41KiB | 0 | 0.0552168 | False | False |
http_pipelines_no_grok_blackhole | 46.47KiB | 0.42 | 95.84% | 10.77MiB | 116.23KiB | 2.37KiB | 0 | 0.0105348 | 10.82MiB | 1.09MiB | 22.67KiB | 0 | 0.10065 | False | False |
syslog_log2metric_humio_metrics | 44.64KiB | 0.35 | 98.76% | 12.52MiB | 565.53KiB | 11.54KiB | 0 | 0.0440999 | 12.56MiB | 668.34KiB | 13.62KiB | 0 | 0.0519357 | False | False |
splunk_hec_indexer_ack_blackhole | 23.33KiB | 0.1 | 64.39% | 23.74MiB | 925.39KiB | 18.82KiB | 0 | 0.0380603 | 23.76MiB | 829.82KiB | 16.89KiB | 0 | 0.0340967 | False | False |
splunk_hec_to_splunk_hec_logs_noack | 10.4KiB | 0.04 | 64.41% | 23.83MiB | 436.09KiB | 8.9KiB | 0 | 0.0178696 | 23.84MiB | 337.74KiB | 6.89KiB | 0 | 0.0138336 | False | False |
http_pipelines_blackhole | -2.01B | -0 | 0.08% | 1.66MiB | 22.85KiB | 478.44B | 0 | 0.0134438 | 1.66MiB | 87.96KiB | 1.79KiB | 0 | 0.0517506 | False | False |
splunk_hec_to_splunk_hec_logs_acks | -5.2KiB | -0.02 | 17.21% | 23.76MiB | 829.22KiB | 16.87KiB | 0 | 0.0340686 | 23.76MiB | 834.22KiB | 16.97KiB | 0 | 0.0342815 | False | False |
enterprise_http_to_http | -3.71KiB | -0.02 | 38.83% | 23.85MiB | 251.4KiB | 5.13KiB | 0 | 0.010293 | 23.84MiB | 253.88KiB | 5.19KiB | 0 | 0.0103961 | False | False |
file_to_blackhole | -35.65KiB | -0.04 | 31.40% | 95.34MiB | 3.18MiB | 66.0KiB | 0 | 0.0333852 | 95.3MiB | 2.81MiB | 58.5KiB | 0 | 0.0295081 | False | False |
http_to_http_json | -41.32KiB | -0.17 | 99.80% | 23.84MiB | 345.71KiB | 7.06KiB | 0 | 0.0141567 | 23.8MiB | 556.12KiB | 11.34KiB | 0 | 0.0228111 | False | False |
fluent_elasticsearch | -220.99KiB | -0.27 | 99.77% | 79.47MiB | 52.45KiB | 1.06KiB | 0 | 0.000644356 | 79.26MiB | 3.52MiB | 72.48KiB | 0 | 0.0444091 | False | False |
http_to_http_noack | -86.2KiB | -0.35 | 99.94% | 23.83MiB | 515.08KiB | 10.52KiB | 0 | 0.0211056 | 23.74MiB | 1.1MiB | 22.88KiB | 0 | 0.0462045 | False | False |
socket_to_socket_blackhole | -609.89KiB | -2.54 | 100.00% | 23.42MiB | 381.89KiB | 7.8KiB | 0 | 0.015919 | 22.83MiB | 350.6KiB | 7.16KiB | 0 | 0.0149959 | False | False |
datadog_agent_remap_datadog_logs_acks | -4.59MiB | -7.44 | 100.00% | 61.72MiB | 3.55MiB | 74.18KiB | 0 | 0.0575641 | 57.13MiB | 6.92MiB | 144.08KiB | 0 | 0.121131 | False | False |
I believe this PR is now ready for a more proper review. I have updated the cue docs and swapped out the naming from unit
to mode
.
Hi @jutley,
We discussed this contribution during our team check-in. We are excited for this transform to receive this feature. One concern we have with this PR, is that the requirement to serialize the data to count the bytes can introduce a significant, and unexpected overhead to this transform.
While our concern is based on assumptions, those assumptions are based on our experiences dealing withβ and past benchmarking of event serialization in the hot path.
One potential solution we could see is using work that is currently in-flight (and should land this week) to estimate the JSON encoded size of an event, with minimal overhead.
There are two downsides to this:
- As mentioned, it is an estimation (for performance reasons, this implementation does not count added escape characters, or replaced invalid UTF-8 byte sequences during serialization).
- It only works for estimating the JSON encoded size of an event, no other format is supported.
We were wondering if any of these two downsides would be an issue for your use-case? If not, we would probably prefer that direction for this PR, while keeping the possibility open (thus, baking this into the config API) to add exact byte size counting in the future, if enough users have a need for it, and accept the trade-off of the reduced throughput.
Additionally, we could introduce a new soak test in this PR, that takes an existing test, and adds the throttle transform in the middle, using byte-size throttling, to measure the impact on the overall throughput.
@JeanMertz These ideas sound totally reasonable. Using a true encoder has always felt simultaneously elegant and incredibly clumsy to me. I'll wait for this encoding estimation work to land and then continue from there.
I think the direction you are suggesting will work. That said, there is a bit of flexibility that the encoding provides that I was planning on exploring. In particular, I think it will be useful to be able to choose the fields to use in the encoding, or to choose the text
encoding to get just the message
field. I'm not sure what options the JSON encoding estimation will provide.
If this feature doesn't offer much configuration, I may look to implement different methods for getting a byte count for an event. If nothing else, I think it'll be useful (at least for me) to support the full JSON and just the message.
Deploy Preview for vrl-playground ready!
Name | Link |
---|---|
Latest commit | 4399f352c343517ae82e0c02ec6209f6e025552f |
Latest deploy log | https://app.netlify.com/sites/vrl-playground/deploys/63e3d80909541f0008eabe98 |
Deploy Preview | https://deploy-preview-14280--vrl-playground.netlify.app |
Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site settings.
@JeanMertz I managed to find a little time to get back to this PR. I rewrote it to use the estimated json size, and am also supporting the bytes of the message
field. I haven't written a soak test, but I can figure it out if you think it is necessary.
@JeanMertz anything I can do to help move this along?
Regression Test Results
Run ID: 9c2dc1af-ed9f-4eff-b0cf-ef77391eb92f
Baseline: 50933e93a86a9e580bc02206c90e4c977d8233a0
Comparison: f0d84f007f2994f08161becf4cece5bf9c312970
Total vector
CPUs: 7
Explanation
A regression test is an integrated performance test for vector
in
a repeatable rig, with varying configuration for vector
. What
follows is a statistical summary of a brief vector
run for each
configuration across SHAs given above. The goal of these tests are to determine,
quickly, if vector
performance is changed and to what degree by a
pull request. Where appropriate units are scaled per-core.
The table below, if present, lists those experiments that have experienced a
statistically significant change in their bytes_written_per_cpu_second
performance
between baseline and comparison SHAs, with 90.0%
confidence OR have been detected as newly erratic. Negative values mean that
baseline is faster, positive comparison. Results that do not exhibit more than a
Β±5% change in mean bytes_written_per_cpu_second
are
discarded. An experiment is erratic if its coefficient of variation is greater
than 0.1. The abbreviated table will be
omitted if no interesting changes are observed.
No interesting changes in bytes_written_per_cpu_second
with confidence β₯ 90.00% and absolute Ξ mean >= Β±5%.
Fine details of change detection per experiment.
experiment | Ξ mean | Ξ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
syslog_regex_logs2metric_ddmetrics | 108.83KiB/CPU-s | 2.96 | 100.00% | 3.59MiB/CPU-s | 487.13KiB/CPU-s | 6.29KiB/CPU-s | 0.0 | 0.132447 | 3.7MiB/CPU-s | 461.85KiB/CPU-s | 5.96KiB/CPU-s | 0.0 | 0.121964 | True | False |
syslog_log2metric_humio_metrics | 175.77KiB/CPU-s | 2.8 | 100.00% | 6.13MiB/CPU-s | 191.95KiB/CPU-s | 2.48KiB/CPU-s | 0.0 | 0.030564 | 6.3MiB/CPU-s | 101.11KiB/CPU-s | 1.31KiB/CPU-s | 0.0 | 0.015661 | False | False |
socket_to_socket_blackhole | 298.14KiB/CPU-s | 2.25 | 100.00% | 12.91MiB/CPU-s | 426.64KiB/CPU-s | 5.51KiB/CPU-s | 0.0 | 0.032264 | 13.2MiB/CPU-s | 354.21KiB/CPU-s | 4.58KiB/CPU-s | 0.0 | 0.026196 | False | False |
http_text_to_http_json | 366.06KiB/CPU-s | 1.54 | 100.00% | 23.18MiB/CPU-s | 946.07KiB/CPU-s | 12.21KiB/CPU-s | 0.0 | 0.039855 | 23.54MiB/CPU-s | 772.67KiB/CPU-s | 9.97KiB/CPU-s | 0.0 | 0.032056 | False | False |
http_to_http_acks | 72.62KiB/CPU-s | 1.36 | 84.23% | 5.23MiB/CPU-s | 2.73MiB/CPU-s | 36.07KiB/CPU-s | 0.0 | 0.52148 | 5.3MiB/CPU-s | 2.77MiB/CPU-s | 36.62KiB/CPU-s | 0.0 | 0.522565 | True | False |
splunk_hec_route_s3 | 158.25KiB/CPU-s | 1.36 | 100.00% | 11.4MiB/CPU-s | 619.44KiB/CPU-s | 7.99KiB/CPU-s | 0.0 | 0.053067 | 11.55MiB/CPU-s | 671.22KiB/CPU-s | 8.66KiB/CPU-s | 0.0 | 0.056734 | False | False |
otlp_grpc_to_blackhole | 13.41KiB/CPU-s | 1.29 | 100.00% | 1.02MiB/CPU-s | 45.41KiB/CPU-s | 600.2B/CPU-s | 0.0 | 0.043545 | 1.03MiB/CPU-s | 49.67KiB/CPU-s | 656.36B/CPU-s | 0.0 | 0.047027 | False | False |
syslog_humio_logs | 118.22KiB/CPU-s | 1.28 | 100.00% | 9.03MiB/CPU-s | 303.06KiB/CPU-s | 3.91KiB/CPU-s | 0.0 | 0.032762 | 9.15MiB/CPU-s | 160.11KiB/CPU-s | 2.07KiB/CPU-s | 0.0 | 0.01709 | False | False |
syslog_splunk_hec_logs | 73.06KiB/CPU-s | 0.78 | 100.00% | 9.09MiB/CPU-s | 116.54KiB/CPU-s | 1.5KiB/CPU-s | 0.0 | 0.012518 | 9.16MiB/CPU-s | 229.71KiB/CPU-s | 2.96KiB/CPU-s | 0.0 | 0.02448 | False | False |
datadog_agent_remap_datadog_logs | 183.01KiB/CPU-s | 0.53 | 100.00% | 33.91MiB/CPU-s | 1.35MiB/CPU-s | 17.83KiB/CPU-s | 0.0 | 0.039805 | 34.09MiB/CPU-s | 1.22MiB/CPU-s | 16.12KiB/CPU-s | 0.0 | 0.035792 | False | False |
syslog_log2metric_splunk_hec_metrics | 45.71KiB/CPU-s | 0.47 | 100.00% | 9.41MiB/CPU-s | 137.63KiB/CPU-s | 1.78KiB/CPU-s | 0.0 | 0.01428 | 9.46MiB/CPU-s | 275.34KiB/CPU-s | 3.56KiB/CPU-s | 0.0 | 0.028434 | False | False |
http_to_http_json | 6.98KiB/CPU-s | 0.05 | 90.39% | 13.61MiB/CPU-s | 239.51KiB/CPU-s | 3.09KiB/CPU-s | 0.0 | 0.017182 | 13.62MiB/CPU-s | 219.95KiB/CPU-s | 2.84KiB/CPU-s | 0.0 | 0.015771 | False | False |
splunk_hec_indexer_ack_blackhole | 3.9KiB/CPU-s | 0.03 | 60.87% | 13.62MiB/CPU-s | 262.23KiB/CPU-s | 3.38KiB/CPU-s | 0.0 | 0.018806 | 13.62MiB/CPU-s | 235.2KiB/CPU-s | 3.03KiB/CPU-s | 0.0 | 0.016863 | False | False |
splunk_hec_to_splunk_hec_logs_noack | 3.58KiB/CPU-s | 0.03 | 65.68% | 13.62MiB/CPU-s | 220.87KiB/CPU-s | 2.85KiB/CPU-s | 0.0 | 0.015833 | 13.63MiB/CPU-s | 191.71KiB/CPU-s | 2.48KiB/CPU-s | 0.0 | 0.01374 | False | False |
file_to_blackhole | 8.6KiB/CPU-s | 0.02 | 31.77% | 54.49MiB/CPU-s | 1.12MiB/CPU-s | 14.85KiB/CPU-s | 0.0 | 0.020623 | 54.5MiB/CPU-s | 1.13MiB/CPU-s | 14.87KiB/CPU-s | 0.0 | 0.020662 | False | False |
enterprise_http_to_http | 2.02KiB/CPU-s | 0.01 | 48.03% | 13.62MiB/CPU-s | 190.24KiB/CPU-s | 2.46KiB/CPU-s | 0.0 | 0.013637 | 13.62MiB/CPU-s | 151.78KiB/CPU-s | 1.96KiB/CPU-s | 0.0 | 0.010879 | False | False |
fluent_elasticsearch | -100.71B/CPU-s | -0.0 | 14.31% | 45.41MiB/CPU-s | 30.03KiB/CPU-s | 392.35B/CPU-s | 0.0 | 0.000646 | 45.41MiB/CPU-s | 30.42KiB/CPU-s | 397.54B/CPU-s | 0.0 | 0.000654 | False | False |
http_to_http_noack | 118.85B/CPU-s | 0.0 | 1.65% | 13.61MiB/CPU-s | 309.13KiB/CPU-s | 3.99KiB/CPU-s | 0.0 | 0.022173 | 13.61MiB/CPU-s | 306.43KiB/CPU-s | 3.95KiB/CPU-s | 0.0 | 0.021979 | False | False |
splunk_hec_to_splunk_hec_logs_acks | -1.02KiB/CPU-s | -0.01 | 12.85% | 13.62MiB/CPU-s | 343.33KiB/CPU-s | 4.43KiB/CPU-s | 0.0 | 0.024621 | 13.62MiB/CPU-s | 348.41KiB/CPU-s | 4.5KiB/CPU-s | 0.0 | 0.024987 | False | False |
datadog_agent_remap_blackhole | -78.45KiB/CPU-s | -0.25 | 99.85% | 30.55MiB/CPU-s | 1.12MiB/CPU-s | 14.78KiB/CPU-s | 0.0 | 0.036576 | 30.47MiB/CPU-s | 1.5MiB/CPU-s | 19.82KiB/CPU-s | 0.0 | 0.049214 | False | False |
datadog_agent_remap_datadog_logs_acks | -127.75KiB/CPU-s | -0.37 | 100.00% | 33.58MiB/CPU-s | 1.28MiB/CPU-s | 16.97KiB/CPU-s | 0.0 | 0.038229 | 33.46MiB/CPU-s | 1019.24KiB/CPU-s | 13.16KiB/CPU-s | 0.0 | 0.029748 | False | False |
datadog_agent_remap_blackhole_acks | -428.67KiB/CPU-s | -1.35 | 100.00% | 30.95MiB/CPU-s | 1.18MiB/CPU-s | 15.58KiB/CPU-s | 0.0 | 0.03807 | 30.53MiB/CPU-s | 1.26MiB/CPU-s | 16.59KiB/CPU-s | 0.0 | 0.041103 | False | False |
syslog_loki | -127.98KiB/CPU-s | -1.41 | 100.00% | 8.87MiB/CPU-s | 236.6KiB/CPU-s | 3.05KiB/CPU-s | 0.0 | 0.026059 | 8.74MiB/CPU-s | 324.56KiB/CPU-s | 4.19KiB/CPU-s | 0.0 | 0.036258 | False | False |
otlp_http_to_blackhole | -23.69KiB/CPU-s | -1.51 | 100.00% | 1.53MiB/CPU-s | 115.02KiB/CPU-s | 1.48KiB/CPU-s | 0.0 | 0.073357 | 1.51MiB/CPU-s | 120.06KiB/CPU-s | 1.55KiB/CPU-s | 0.0 | 0.077744 | False | False |
@jszwedko That's a great idea. Seems like an obvious choice now that its been said! I think I can figure out the implementation by referencing what already exists. I'll reach out if I need help!
@jszwedko I started working on this (slowly), and I am running into a limitation with the RateLimiters. If one RateLimiter allows the message but the next RateLimiter does not allow the message, then the message should get dropped. However, in checking this, the message still counted against the first RateLimiter.
Looking at the RateLimiter docs, I'm not seeing any kind of dry-run functionality. Any suggestions?
@jszwedko I started working on this (slowly), and I am running into a limitation with the RateLimiters. If one RateLimiter allows the message but the next RateLimiter does not allow the message, then the message should get dropped. However, in checking this, the message still counted against the first RateLimiter.
Looking at the RateLimiter docs, I'm not seeing any kind of dry-run functionality. Any suggestions?
Aha, yes, you appear to be right. There isn't a way to check the quota without consuming it or to replace tokens outside of the refresh interval π€ Given that, I'm not seeing an easy way to maintain multiple quotas for the same input stream. I opened https://github.com/antifuchs/governor/issues/167 on the upstream crate to see if there is something we are missing.
Absent that support, I think the best we can do is something like the current implementation where you can only choose one of the characteristics to throttle on.
However! I was also thinking of a potentially more flexible interface for this though which would be to allow providing a VRL expression to express how many tokens are used by a given incoming event. The interface for this would be something like:
[transforms.foo]
type = "throttle"
key_field = "service"
window_secs = 60
threshold = 1000
tokens = "1" # the default, take one token for each event
# or
tokens = "len(.message)" # take N tokens based on the length of the message
# or
tokens = "len(encode_json(.))" # take N tokens based on the JSON-encoded length of the message
# or
tokens = """
if .status == "500" {
0
} else {
1
}
""" # for 500s take a different amount of tokens
The JSON one would be rather expensive since it involves encoding and I know that was the motivation for using the estimated JSON encoding in this PR. I think we could resolve that by adding a new estimated_json_size()
method to VRL that returns the estimated size.
This also again has the advantage of having threshold
make sense as just a "number of tokens" rather than having it represent different units depending on the mode.
That approach makes sense. I suspect that if someone wants to use the "multi-mode" approach, they can chain throttle transforms together. That should allow us to guarantee something like "no more then x events and y bytes per second per key". There are probably some subtle mathematical differences between the chained approach and the multi-mode approach, but I suspect this isn't going to be particularly important for most real world scenarios. We're dropping data intentionally either way.
It'll take me a little longer to get around to exploring this approach. I think the estimated_json_size
method is completely orthogonal to this throttle tokens effort, so I'll not worry about that piece.
It'll take me a little longer to get around to exploring this approach. I think the
estimated_json_size
method is completely orthogonal to this throttle tokens effort, so I'll not worry about that piece.
Yeah, agreed, we can split that bit off into its own PR since it'll involve an addition to VRL.
Looks like there was a governor feature request that I missed that discusses our exact issue: https://github.com/antifuchs/governor/issues/156
Just leaving a note to say that I took a look at the recommendation approach, and while I do believe it should be possible, I am just not proficient enough in Rust to handle this in a timely manner. Small things take me a long time, and I just don't have the time for this.
If anyone else is interested in picking this up, feel free!