feat(vrl): Add `parse_cef` function
Closes #6451.
Follows CEF version 26.
Open questions
I'm unsure whether the following two extensions to this parser should be included now, later, or not at all:
- [x] Translate CEF Key Names to Full Name. Example: "act" to "deviceAction", (EDIT: Not at all.)
- [x] Construct key-value from key label fields. Example: (EDIT: A separate issue. )
{
"c6a1": "value1",
"c6a1Label": "key1"
}
to
{
"key1": "value1"
}
Deploy Preview for vector-project canceled.
| Name | Link |
|---|---|
| Latest commit | 4b4e2086a2d74969faed12b0e70cd11361963eee |
| Latest deploy log | https://app.netlify.com/sites/vector-project/deploys/63348648f2b2c10009800c9b |
Soak Test Results
Baseline: 28113af2bf357c71957bf377225dc9df2415cf8f Comparison: 4886c19003a72e7af86fe30dbca89a4f59918dc6 Total Vector CPUs: 4
Explanation
A soak test is an integrated performance test for vector in a repeatable rig, with varying configuration for vector. What follows is a statistical summary of a brief vector run for each configuration across the SHAs given above. The goal of these tests is to determine, quickly, whether vector performance is changed by a pull request, and to what degree. Where appropriate, units are scaled per-core.
The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between baseline and comparison SHAs, with 90.0% confidence, OR have been detected as newly erratic. Negative values mean that the baseline is faster; positive values mean that the comparison is faster. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.
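As a rough illustration of the arithmetic described above, here is a minimal Python sketch. It assumes the coefficient of variation is the sample standard deviation divided by the mean, and that "Δ mean %" is taken relative to the baseline mean (which is consistent with the table rows). The function names are hypothetical, and the confidence computation used by the soak rig is not reproduced here.

```python
from statistics import mean, stdev

ERRATIC_COV = 0.3          # threshold quoted in the explanation
MIN_DELTA_PERCENT = 8.87   # threshold quoted in the explanation


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(samples) / mean(samples)


def is_erratic(samples):
    """An experiment is flagged erratic when its CoV exceeds 0.3."""
    return coefficient_of_variation(samples) > ERRATIC_COV


def summarize(baseline, comparison):
    """Compare two lists of throughput samples (same unit in both)."""
    b_mean, c_mean = mean(baseline), mean(comparison)
    delta_percent = (c_mean - b_mean) / b_mean * 100.0
    return {
        "delta_mean": c_mean - b_mean,
        "delta_mean_percent": delta_percent,
        # Only the size threshold is checked; confidence is out of scope.
        "exceeds_delta_threshold": abs(delta_percent) > MIN_DELTA_PERCENT,
    }
```

With the reported values for `http_to_http_acks` (baseline stdev 8.06MiB, mean 17.37MiB), the ratio is roughly 0.46, which is why that experiment is marked erratic.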
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| datadog_agent_remap_blackhole_acks | 2.13MiB | 3.71 | 100.00% | 57.38MiB | 4.41MiB | 91.72KiB | 0 | 0.076762 | 59.5MiB | 3.46MiB | 72.52KiB | 0 | 0.0581851 | False | False |
| datadog_agent_remap_blackhole | 1.25MiB | 2.22 | 100.00% | 56.25MiB | 3.76MiB | 78.25KiB | 0 | 0.0667581 | 57.5MiB | 3.94MiB | 82.06KiB | 0 | 0.0684304 | False | False |
| http_text_to_http_json | 699.3KiB | 1.82 | 100.00% | 37.53MiB | 900.93KiB | 18.39KiB | 0 | 0.0234363 | 38.22MiB | 852.95KiB | 17.41KiB | 0 | 0.0217917 | False | False |
| splunk_hec_route_s3 | 260.84KiB | 1.38 | 99.99% | 18.46MiB | 2.37MiB | 49.27KiB | 0 | 0.128177 | 18.72MiB | 2.17MiB | 45.32KiB | 0 | 0.115719 | False | False |
| http_pipelines_blackhole_acks | 14.44KiB | 1.14 | 100.00% | 1.23MiB | 112.01KiB | 2.28KiB | 0 | 0.0886288 | 1.25MiB | 80.21KiB | 1.63KiB | 0 | 0.0627486 | False | False |
| syslog_humio_logs | 123.45KiB | 0.74 | 100.00% | 16.3MiB | 179.6KiB | 3.67KiB | 0 | 0.0107565 | 16.42MiB | 155.79KiB | 3.19KiB | 0 | 0.00926216 | False | False |
| datadog_agent_remap_datadog_logs_acks | 444.2KiB | 0.74 | 99.99% | 58.97MiB | 3.24MiB | 67.66KiB | 0 | 0.0548727 | 59.41MiB | 4.44MiB | 92.42KiB | 0 | 0.074717 | False | False |
| http_to_http_acks | 119.39KiB | 0.67 | 37.96% | 17.37MiB | 8.06MiB | 168.48KiB | 0 | 0.463991 | 17.49MiB | 8.26MiB | 172.41KiB | 0 | 0.472311 | True | True |
| socket_to_socket_blackhole | 154.01KiB | 0.66 | 100.00% | 22.67MiB | 409.17KiB | 8.35KiB | 0 | 0.0176232 | 22.82MiB | 178.5KiB | 3.64KiB | 0 | 0.00763762 | False | False |
| syslog_regex_logs2metric_ddmetrics | 51.0KiB | 0.41 | 99.65% | 12.28MiB | 651.84KiB | 13.28KiB | 0 | 0.0518225 | 12.33MiB | 556.77KiB | 11.35KiB | 0 | 0.0440856 | False | False |
| syslog_log2metric_humio_metrics | 51.53KiB | 0.4 | 99.98% | 12.5MiB | 281.42KiB | 5.74KiB | 0 | 0.0219787 | 12.55MiB | 606.7KiB | 12.35KiB | 0 | 0.0471939 | False | False |
| syslog_splunk_hec_logs | 16.58KiB | 0.1 | 63.62% | 16.34MiB | 739.47KiB | 15.05KiB | 0 | 0.044195 | 16.35MiB | 506.32KiB | 10.34KiB | 0 | 0.030231 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 9.65KiB | 0.04 | 61.28% | 23.83MiB | 435.79KiB | 8.9KiB | 0 | 0.0178572 | 23.84MiB | 330.24KiB | 6.74KiB | 0 | 0.0135269 | False | False |
| splunk_hec_indexer_ack_blackhole | 4.48KiB | 0.02 | 13.32% | 23.74MiB | 930.31KiB | 18.92KiB | 0 | 0.0382638 | 23.74MiB | 924.64KiB | 18.8KiB | 0 | 0.0380238 | False | False |
| enterprise_http_to_http | 717.05B | 0 | 7.61% | 23.84MiB | 253.34KiB | 5.17KiB | 0 | 0.0103733 | 23.85MiB | 253.94KiB | 5.2KiB | 0 | 0.0103977 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | -2.46KiB | -0.01 | 7.65% | 23.75MiB | 889.03KiB | 18.08KiB | 0 | 0.0365475 | 23.75MiB | 892.55KiB | 18.15KiB | 0 | 0.0366962 | False | False |
| file_to_blackhole | -17.27KiB | -0.02 | 11.71% | 95.33MiB | 3.91MiB | 80.97KiB | 0 | 0.0409638 | 95.31MiB | 4.08MiB | 84.82KiB | 0 | 0.0427583 | False | False |
| http_to_http_json | -30.03KiB | -0.12 | 98.76% | 23.85MiB | 333.03KiB | 6.8KiB | 0 | 0.0136332 | 23.82MiB | 484.08KiB | 9.89KiB | 0 | 0.0198411 | False | False |
| datadog_agent_remap_datadog_logs | -107.25KiB | -0.17 | 71.68% | 60.38MiB | 1.92MiB | 40.16KiB | 0 | 0.0317461 | 60.28MiB | 4.39MiB | 91.5KiB | 0 | 0.0728925 | False | False |
| http_pipelines_no_grok_blackhole | -22.35KiB | -0.21 | 68.81% | 10.62MiB | 139.1KiB | 2.84KiB | 0 | 0.0127897 | 10.6MiB | 1.05MiB | 21.92KiB | 0 | 0.0993108 | False | False |
| fluent_elasticsearch | -170.99KiB | -0.21 | 100.00% | 79.47MiB | 53.0KiB | 1.07KiB | 0 | 0.000651135 | 79.31MiB | 1.56MiB | 32.08KiB | 0 | 0.019653 | False | False |
| http_to_http_noack | -77.3KiB | -0.32 | 99.98% | 23.85MiB | 254.72KiB | 5.21KiB | 0 | 0.0104294 | 23.77MiB | 982.23KiB | 20.01KiB | 0 | 0.0403448 | False | False |
| http_pipelines_blackhole | -11.65KiB | -0.66 | 100.00% | 1.73MiB | 10.96KiB | 229.29B | 0 | 0.0061985 | 1.72MiB | 122.81KiB | 2.5KiB | 0 | 0.0699126 | False | False |
| syslog_loki | -114.94KiB | -0.76 | 100.00% | 14.71MiB | 379.75KiB | 7.77KiB | 0 | 0.0252129 | 14.59MiB | 738.83KiB | 15.02KiB | 0 | 0.0494313 | False | False |
| syslog_log2metric_splunk_hec_metrics | -252.8KiB | -1.47 | 100.00% | 16.82MiB | 890.69KiB | 18.16KiB | 0 | 0.0517027 | 16.57MiB | 1.04MiB | 21.68KiB | 0 | 0.0628038 | False | False |
Translate CEF Key Names to Full Name. Example: "act" to "deviceAction"
I would not do that. "act", in this example, is the short name as defined in the CEF docs, which is fine to work with. Changing all short names to full names (as per the docs) has little value IMHO. Chances are that you will rename the keys anyway.
Construct key-value from key label fields.
This could be interesting and help working with CEF logs. I would make it optional though as one might not need it.
I'm looking at the CEF v26 specification from here: https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/pdfdoc/cef-implementation-standard/cef-implementation-standard.pdf
I'm not very familiar with this format, but the documentation seems to imply the most common format is with a syslog prefix, which the current implementation doesn't support. It would be good to support that, or explain why it's not needed.
For example, I expected the following example to parse correctly
Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232
@fuchsnj shouldn't one use the syslog source (or parse_syslog) in that case to parse the syslog part of the message and use parse_cef for the CEF part ?
It's just a syslog prefix as part of the CEF header. It's not a full syslog message (it won't parse correctly with parse_syslog, and even if it did, you wouldn't get the remainder to pass to parse_cef). I think it would be acceptable to skip everything before the CEF:Version header.
It's not a full syslog message (it won't parse correctly with parse_syslog, and even if it did, you wouldn't get the remainder to pass to parse_cef).
That is an issue. In that case, we could first try to parse the syslog prefix and, if that fails, discard everything up to the CEF header.
@fuchsnj shouldn't one use the syslog source (or parse_syslog) in that case to parse the syslog part of the message and use parse_cef for the CEF part ?
It's just a syslog prefix as part of the CEF header. It's not a full syslog message (it won't parse correctly with parse_syslog, and even if it did, you wouldn't get the remainder to pass to parse_cef). I think it would be acceptable to skip everything before the CEF:Version header.
I am working with CEF log files... what I am seeing is two variants:
- no syslog parts, basically only CEF over
- a normal syslog line with the CEF part as the message
parse_syslog, and even if it did, you wouldn't get the remainder to pass to parse_cef
according to https://vector.dev/docs/reference/vrl/functions/#parse_syslog, you would get a message, which is the original line minus the syslog part... so you would pass the message to parse_cef, at least that's what I had in mind ... CMIIAW
I think it would be acceptable to skip everything before the CEF:Version header.
I would opt for that ... should I care for the "syslog prefix" as well, I could parse it somehow else (e.g. grok etc) and use the CEF part to pass to parse_cef ... or just have parse_cef ignore everything before CEF:Version as you say.
my 2c
First of all, thank you very much for implementing/working on this function. 🎉 A number of our pipelines need to parse CEF data and so far we are just using VRL for this, creating boilerplate. We had it in our notes to actually request this feature, so this PR and initiative is highly appreciated.
Similar to @sim0nx, we also have to deal with different formats for "CEF-over-Syslog" data. Examples:
May 31 10:01:02 hostname CEF:0|Zscaler|NSSWeblog|5.0|Allowed|...
<13>Jun 4 10:29:58 hostname CEF:0|Palo Alto Networks|PAN-OS|...
In our case, they differ in the inclusion of the syslog PRIO header.
For both cases, in VRL, we first apply parse_syslog() to obtain the message part and then the following parse-cef transform that we made ourselves for the purpose:
host = string!(.host)
message = string!(.message)
# Split into the seven header fields plus the extension.
fields = split(message, "|", limit: 8)
if length(fields) < 8 {
  log("invalid CEF message from <" + host + ">: " + message, level: "warn")
  abort
}
# Replace the event with the parsed header fields.
. = {
  "host": host,
  "version": to_int!(fields[0]),
  "device_vendor": fields[1],
  "device_product": fields[2],
  "device_version": fields[3],
  "device_event_class_id": fields[4],
  "name": fields[5],
  "severity": fields[6],
  "extension": fields[7],
}
The above is just parsing the CEF format itself strictly.
To parse the rest, we made a second-stage parse-XYZ transform for the extension field itself.
As you can imagine, there are huge regexes, boilerplate and duplicated code across a lot of our pipelines.
For example:
host = string!(.host)
extension = string!(.extension)
., err = parse_regex(extension, r'^act=(?P<eventAction>[\w\d\s./_-]+) ') # can't share the full regex :(
if err != null {
  log("failed to parse zscaler CEF extension from <" + host + ">: " + extension, level: "warn")
  abort
}
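For comparison, a generic parser for the extension field can avoid per-vendor regexes. The following is a minimal Python sketch, not the implementation proposed in this PR: it assumes keys are runs of word characters followed by an unescaped `=`, that values run until the next key, and it only unescapes `\=`. The function name is hypothetical.

```python
import re

# A key is a run of word characters followed by "=", located at the start
# of the string or right after whitespace.
_KEY = re.compile(r'(?:^|(?<=\s))(\w+)=')


def parse_cef_extension(extension):
    """Split a CEF extension string into a dict of key -> value."""
    matches = list(_KEY.finditer(extension))
    result = {}
    for i, m in enumerate(matches):
        # The value runs until the next key starts (or to the end).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(extension)
        raw = extension[m.end():end].strip()
        # Only the escaped equals sign is handled in this sketch.
        result[m.group(1)] = raw.replace('\\=', '=')
    return result
```

Values containing spaces, such as `msg=worm successfully stopped`, are kept whole because a new key only starts where a word is directly followed by `=`.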
The example you show in "Construct key-value from key label fields" looks VERY valuable and useful to us :)
Hope the above sheds some light on real-world use-cases for your proposed parse_cef() function.
Thanks @sim0nx @hhromic.
Then we can make the following two modifications:
- When parsing, discard everything up to CEF:Version. -- As @sim0nx said, with this the parsing will just work regardless of whether the CEF is embedded in syslog or not. And if that prefix is useful, it can be parsed out with the syslog parser. The parser would then work as @fuchsnj expected.
- Construct key-value from key label fields. -- This seems useful and is in the domain of this parser. Also, I wouldn't make it optional, since this mechanism is defined in the specification as a way to have custom keys in CEF and work around its limitations. Since the output is no longer CEF, that limitation is no longer present, so there is no reason for the label fields anymore. At least until a real-world case is presented.
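The first modification can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code in this PR: it looks for the first `CEF:` marker, ignores everything before it, and splits the header on unescaped-or-not `|` (escaped pipes are not handled). The function name is hypothetical and the field names are borrowed from the VRL snippet earlier in this thread.

```python
HEADER_KEYS = (
    "version",
    "device_vendor",
    "device_product",
    "device_version",
    "device_event_class_id",
    "name",
    "severity",
    "extension",
)


def split_cef(line):
    """Drop any prefix before 'CEF:' and split the header into fields."""
    start = line.find("CEF:")
    if start == -1:
        raise ValueError("no CEF header found")
    # Seven header separators leave the extension as the eighth part.
    parts = line[start + len("CEF:"):].split("|", 7)
    if len(parts) < 8:
        raise ValueError("incomplete CEF header")
    return dict(zip(HEADER_KEYS, parts))
```

Because the prefix is simply skipped, the same call handles bare CEF, CEF after a timestamp and host, and CEF after a syslog priority such as `<13>`.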
@ktff sounds good to me!
- Construct key-value from key label fields. -- This seems to be useful and is in the domain of this parser. Also I wouldn't make it optional since this mechanism is defined in the specification as a way to have custom keys in CEF, to avoid its limitations. And since the output is no longer CEF the limitation is no longer present, so there is no reason for that anymore. At least until a real world case is presented.
I agree with that too. We already feel very excited to get this type of transformation "for free" in this function:
{"c6a1":"value1","c6a1Label":"key1"} -> {"key1":"value1"}
While it should be rare, what would happen if multiple X + XLabel pairs share the same key?
Would it construct an array in that case? For example:
{"c6a1":"value1","c6a1Label":"key1","c5a1":"value2","c5a1Label":"key1"} -> {"key1":["value1","value2"]}
I think other functions in VRL already behave like the above. So it would be nice for consistency.
@hhromic that seems really rare for CEF and a somewhat hacky way to transmit an array, so I'm not sure it should be supported. Then again, just silently dropping the field isn't ideal either. So instead of that, on collision we can not perform transformation for that pair.
@ktff yes, CEF (on paper) should definitely not have duplicated XLabel values for different X. And I definitely don't think CEF would ever intend to transmit arrays "natively" in that way either.
I say "on paper" because our team has seen a vast amount of CEF data from different devices/software over the years and I'm not sure you can imagine the ugly formatting horrors that appear in the wild. That's why I was asking how the proposed parse_cef() function would handle such a rare but not impossible situation.
So instead of that, on collision we can not perform transformation for that pair.
I think that would cause more headaches than it solves in the long run. If data ever happens to come with duplicated labels, I think it is easier to handle in an array (after conversion) than to leave it "as-is". Especially because it would be harder to detect "as-is".
Just in case it was not clear, the "to-array" logic would only use an array output if-and-only-if there are duplicate keys found during parsing. Otherwise the field value remains a simple string. Eg:
{"c6a1":"value1","c6a1Label":"key1","c5a1":"value2","c5a1Label":"key1"} -> {"key1":["value1","value2"]}
{"c6a1":"value1","c6a1Label":"key1","c5a1":"value2","c5a1Label":"key2"} -> {"key1":"value1","key2":"value2"}
In this way, in the vast majority of cases there will never be arrays in the values, except if a device decides to duplicate labels.
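The "to-array" logic described above can be sketched in Python as follows. This is only an illustration of the proposed behavior, not code from this PR; the function name is hypothetical, and it assumes a label field is any key ending in `Label` whose base key is also present.

```python
def translate_labels(fields):
    """Rewrite X / XLabel pairs as label -> value.

    A list is produced only when several pairs share the same label.
    """
    paired = set()
    translated = []
    for key, label in fields.items():
        base = key[:-len("Label")]
        if key.endswith("Label") and base in fields:
            paired.update((key, base))
            translated.append((label, fields[base]))

    # Fields that are not part of a pair pass through unchanged.
    result = {k: v for k, v in fields.items() if k not in paired}

    for label, value in translated:
        if label not in result:
            result[label] = value
        elif isinstance(result[label], list):
            result[label].append(value)
        else:
            # Second occurrence of a label: promote the value to a list.
            result[label] = [result[label], value]
    return result
```

Note that a translated label that collides with an existing plain key is also merged into a list here; that is a choice made for this sketch, not something settled in the discussion.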
I can't remember right now which other function in VRL behaves like this, but tomorrow I will check the docs and report back.
@ktff here are at least two other functions in VRL that behave as described above:
$ parse_key_value!("key1=value1")
{ "key1": "value1" }
$ parse_key_value!("key1=value1 key1=value2")
{ "key1": ["value1", "value2"] }
$ parse_query_string("?key1=value1")
{ "key1": "value1" }
$ parse_query_string("?key1=value1&key1=value2")
{ "key1": ["value1", "value2"] }
I think this would be a reasonable behavior to continue following - unless the spec is 100% clear that duplicate keys will never happen. We've also stuck pretty close to "implemented per spec, regardless of in-the-wild behavior doesn't always match that"
While it should be rare, what would happen if multiple X + XLabel pairs share the same key? It would be constructing an array in that case?
One downside to keep in mind, if duplicates are placed in arrays that means the VRL type definitions will always be string | array which means functions using the output of parse_cef that expect only strings as input will be fallible, or require coercing to a string first. So if there isn't a good reason to allow duplicates, it will be easier to work with if only 1 value is kept for each key.
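The cost of a `string | array` result can be illustrated with a Python analogue. This is a sketch of the general problem, not of VRL's type system; the helper names are hypothetical.

```python
from typing import Dict, List, Union

Value = Union[str, List[str]]


def as_strings(value: Value) -> List[str]:
    """Every consumer of a str-or-list value needs a branch like this one."""
    return [value] if isinstance(value, str) else list(value)


def keep_first(fields: Dict[str, Value]) -> Dict[str, str]:
    """Alternative policy: keep one value per key so results are always str.

    Assumes list values are non-empty.
    """
    return {key: as_strings(value)[0] for key, value in fields.items()}
```

With `keep_first`, downstream code can treat every value as a plain string, at the price of discarding later duplicates.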
We've also stuck pretty close to "implemented per spec, regardless of in-the-wild behavior doesn't always match that"
I also like to follow specs and standards as much as possible, but it is a shame that vendors sometimes don't :(
We like Vector a lot because it easily allows us to deal with bad data. The syslog parser in Vector is a good example of a very resilient parser that does not drop bad data after trying its best to parse malformed syslog.
One downside to keep in mind, if duplicates are placed in arrays that means the VRL type definitions will always be string | array which means functions using the output of parse_cef that expect only strings as input will be fallible, or require coercing to a string first. So if there isn't a good reason to allow duplicates, it will be easier to work with if only 1 value is kept for each key.
That is a very interesting point indeed.
I just checked and indeed parse_key_value() and parse_query_string() suffer the same caveat already.
I guess in the end it would be okay for parse_cef() to only process single-valued keys, as long as it doesn't discard incoming data because of this. Perhaps, the way of handling duplicate keys could be configured, pretty much like how it was discussed in this (still pending) PR: https://github.com/vectordotdev/vector/pull/11580#issuecomment-1055018531
Ah, I guess this wasn't fully thought out when I typed this. That's generally been the case on the inputs/outputs for Vector, to avoid bad/malformed data coming into your pipeline. Definitely want to keep the VRL handling flexible and robust.
An example is the syslog source, which rejects invalid messages when it fails to decode them; for the case of "pseudo-syslog", you can use the socket source and handle the decoding in VRL.
@hhromic parse_key_value and parse_query_string are different beasts: they don't have a known key set, so they need to be as generic as possible. Whereas, as @fuchsnj mentioned, when parsing valid CEF we can always return a string value since there are no duplicates, which is nice. But I do agree that "on collision we can not perform transformation for that pair" isn't a good solution.
So let's make this translation a separate feature after all. It seems like it will require an option so that it's opt-in. Once enabled, it can change the type definition of the parser so that it returns string or array. Or at least that seems like one way to do it.
Soak Test Results
Baseline: d498040a770ae2bb5c9d25efce62acadcb17ee57 Comparison: 326838a0fdea56253010e431862b47455eb17d5c Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http_pipelines_blackhole | 46.63KiB | 2.86 | 100.00% | 1.59MiB | 88.55KiB | 1.81KiB | 0 | 0.054326 | 1.64MiB | 138.62KiB | 2.82KiB | 0 | 0.0826826 | False | False |
| syslog_loki | 209.87KiB | 1.43 | 100.00% | 14.33MiB | 244.61KiB | 5.01KiB | 0 | 0.0166715 | 14.53MiB | 747.16KiB | 15.19KiB | 0 | 0.0502059 | False | False |
| datadog_agent_remap_blackhole_acks | 263.01KiB | 0.42 | 99.10% | 60.95MiB | 4.17MiB | 86.78KiB | 0 | 0.0683613 | 61.2MiB | 2.44MiB | 51.08KiB | 0 | 0.0398841 | False | False |
| datadog_agent_remap_blackhole | 68.12KiB | 0.11 | 42.96% | 58.06MiB | 4.55MiB | 94.8KiB | 0 | 0.0784111 | 58.12MiB | 3.53MiB | 73.63KiB | 0 | 0.0606835 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 8.45KiB | 0.03 | 55.30% | 23.83MiB | 428.47KiB | 8.74KiB | 0 | 0.0175542 | 23.84MiB | 335.31KiB | 6.84KiB | 0 | 0.013733 | False | False |
| enterprise_http_to_http | -1.97KiB | -0.01 | 21.31% | 23.85MiB | 251.01KiB | 5.12KiB | 0 | 0.0102769 | 23.85MiB | 253.94KiB | 5.2KiB | 0 | 0.0103978 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | -16.88KiB | -0.07 | 57.65% | 23.79MiB | 702.39KiB | 14.31KiB | 0 | 0.0288281 | 23.77MiB | 760.8KiB | 15.49KiB | 0 | 0.031247 | False | False |
| splunk_hec_indexer_ack_blackhole | -19.7KiB | -0.08 | 57.35% | 23.77MiB | 808.03KiB | 16.45KiB | 0 | 0.0331957 | 23.75MiB | 910.89KiB | 18.53KiB | 0 | 0.037452 | False | False |
| file_to_blackhole | -73.76KiB | -0.08 | 47.33% | 95.34MiB | 3.48MiB | 72.22KiB | 0 | 0.0365323 | 95.27MiB | 4.4MiB | 91.43KiB | 0 | 0.0461631 | False | False |
| splunk_hec_route_s3 | -14.96KiB | -0.08 | 16.99% | 18.13MiB | 2.39MiB | 49.72KiB | 0 | 0.131638 | 18.12MiB | 2.34MiB | 48.83KiB | 0 | 0.128866 | False | False |
| http_to_http_json | -24.89KiB | -0.1 | 95.81% | 23.84MiB | 356.93KiB | 7.29KiB | 0 | 0.0146172 | 23.82MiB | 480.54KiB | 9.82KiB | 0 | 0.0196993 | False | False |
| http_to_http_noack | -61.07KiB | -0.25 | 99.65% | 23.84MiB | 407.17KiB | 8.32KiB | 0 | 0.0166772 | 23.78MiB | 940.55KiB | 19.17KiB | 0 | 0.0386208 | False | False |
| syslog_log2metric_humio_metrics | -37.47KiB | -0.3 | 99.94% | 12.21MiB | 199.01KiB | 4.06KiB | 0 | 0.0159097 | 12.18MiB | 496.26KiB | 10.1KiB | 0 | 0.0397918 | False | False |
| fluent_elasticsearch | -383.53KiB | -0.47 | 100.00% | 79.47MiB | 55.64KiB | 1.12KiB | 0 | 0.000683536 | 79.1MiB | 4.1MiB | 84.24KiB | 0 | 0.0518146 | False | False |
| http_text_to_http_json | -212.58KiB | -0.54 | 100.00% | 38.34MiB | 874.56KiB | 17.85KiB | 0 | 0.0222714 | 38.13MiB | 869.93KiB | 17.76KiB | 0 | 0.0222742 | False | False |
| datadog_agent_remap_datadog_logs_acks | -572.38KiB | -0.92 | 100.00% | 60.92MiB | 3.21MiB | 67.09KiB | 0 | 0.0526934 | 60.36MiB | 4.28MiB | 89.14KiB | 0 | 0.0709264 | False | False |
| datadog_agent_remap_datadog_logs | -704.88KiB | -1.11 | 100.00% | 62.28MiB | 639.3KiB | 13.1KiB | 0 | 0.010023 | 61.59MiB | 4.26MiB | 88.72KiB | 0 | 0.0691786 | False | False |
| syslog_regex_logs2metric_ddmetrics | -172.63KiB | -1.34 | 100.00% | 12.54MiB | 508.1KiB | 10.36KiB | 0 | 0.0395612 | 12.37MiB | 440.41KiB | 8.98KiB | 0 | 0.034758 | False | False |
| syslog_splunk_hec_logs | -229.67KiB | -1.38 | 100.00% | 16.22MiB | 675.26KiB | 13.76KiB | 0 | 0.0406376 | 16.0MiB | 755.93KiB | 15.39KiB | 0 | 0.0461301 | False | False |
| http_pipelines_blackhole_acks | -17.85KiB | -1.43 | 100.00% | 1.22MiB | 110.64KiB | 2.25KiB | 0 | 0.0887391 | 1.2MiB | 84.53KiB | 1.72KiB | 0 | 0.0687789 | False | False |
| http_to_http_acks | -292.92KiB | -1.64 | 77.33% | 17.48MiB | 8.19MiB | 171.23KiB | 0 | 0.46865 | 17.19MiB | 8.21MiB | 171.4KiB | 0 | 0.477627 | True | True |
| syslog_humio_logs | -431.23KiB | -2.53 | 100.00% | 16.62MiB | 224.26KiB | 4.58KiB | 0 | 0.0131741 | 16.2MiB | 232.39KiB | 4.76KiB | 0 | 0.0140064 | False | False |
| syslog_log2metric_splunk_hec_metrics | -454.07KiB | -2.57 | 100.00% | 17.24MiB | 825.2KiB | 16.81KiB | 0 | 0.0467251 | 16.8MiB | 740.78KiB | 15.1KiB | 0 | 0.0430522 | False | False |
| http_pipelines_no_grok_blackhole | -301.92KiB | -2.77 | 100.00% | 10.65MiB | 336.27KiB | 6.86KiB | 0 | 0.0308161 | 10.36MiB | 1.1MiB | 22.97KiB | 0 | 0.106508 | False | False |
| socket_to_socket_blackhole | -677.6KiB | -2.9 | 100.00% | 22.79MiB | 624.03KiB | 12.74KiB | 0 | 0.0267291 | 22.13MiB | 534.12KiB | 10.9KiB | 0 | 0.0235622 | False | False |
Soak Test Results
Baseline: d498040a770ae2bb5c9d25efce62acadcb17ee57 Comparison: 7f0d88e8511dd0c98b28f71e531f2b42ef1ad275 Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| syslog_loki | 127.0KiB | 0.9 | 100.00% | 13.77MiB | 411.87KiB | 8.42KiB | 0 | 0.0292098 | 13.89MiB | 714.19KiB | 14.52KiB | 0 | 0.0501988 | False | False |
| datadog_agent_remap_blackhole_acks | 405.99KiB | 0.66 | 99.93% | 59.83MiB | 4.78MiB | 99.5KiB | 0 | 0.0798753 | 60.23MiB | 3.22MiB | 67.26KiB | 0 | 0.0534066 | False | False |
| datadog_agent_remap_blackhole | 333.39KiB | 0.53 | 99.94% | 60.86MiB | 3.93MiB | 81.81KiB | 0 | 0.0645111 | 61.19MiB | 2.46MiB | 51.39KiB | 0 | 0.0402224 | False | False |
| http_pipelines_blackhole_acks | 4.17KiB | 0.34 | 86.66% | 1.19MiB | 115.36KiB | 2.35KiB | 0 | 0.0944111 | 1.2MiB | 73.0KiB | 1.49KiB | 0 | 0.0595414 | False | False |
| http_pipelines_blackhole | 2.72KiB | 0.16 | 71.61% | 1.69MiB | 43.22KiB | 903.84B | 0 | 0.0249883 | 1.69MiB | 116.89KiB | 2.38KiB | 0 | 0.0674695 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 15.73KiB | 0.06 | 81.76% | 23.82MiB | 470.56KiB | 9.61KiB | 0 | 0.0192869 | 23.84MiB | 335.41KiB | 6.85KiB | 0 | 0.0137387 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | 15.34KiB | 0.06 | 46.56% | 23.75MiB | 881.87KiB | 17.93KiB | 0 | 0.0362608 | 23.76MiB | 834.05KiB | 16.97KiB | 0 | 0.0342727 | False | False |
| splunk_hec_indexer_ack_blackhole | 7.86KiB | 0.03 | 23.11% | 23.74MiB | 950.54KiB | 19.33KiB | 0 | 0.0390981 | 23.74MiB | 910.13KiB | 18.51KiB | 0 | 0.0374239 | False | False |
| enterprise_http_to_http | -1.19KiB | -0 | 13.13% | 23.85MiB | 248.57KiB | 5.07KiB | 0 | 0.0101775 | 23.85MiB | 249.36KiB | 5.1KiB | 0 | 0.0102102 | False | False |
| file_to_blackhole | -62.21KiB | -0.06 | 66.63% | 95.38MiB | 2.05MiB | 42.42KiB | 0 | 0.0214527 | 95.32MiB | 2.33MiB | 48.37KiB | 0 | 0.0243906 | False | False |
| http_to_http_json | -36.33KiB | -0.15 | 99.45% | 23.84MiB | 345.93KiB | 7.06KiB | 0 | 0.0141648 | 23.81MiB | 538.78KiB | 11.0KiB | 0 | 0.0220942 | False | False |
| fluent_elasticsearch | -219.59KiB | -0.27 | 100.00% | 79.47MiB | 53.43KiB | 1.08KiB | 0 | 0.000656466 | 79.26MiB | 2.55MiB | 52.45KiB | 0 | 0.0321784 | False | False |
| http_to_http_noack | -96.23KiB | -0.39 | 99.98% | 23.83MiB | 519.19KiB | 10.61KiB | 0 | 0.0212732 | 23.73MiB | 1.15MiB | 23.96KiB | 0 | 0.0484144 | False | False |
| syslog_log2metric_humio_metrics | -61.43KiB | -0.47 | 100.00% | 12.71MiB | 221.81KiB | 4.53KiB | 0 | 0.0170334 | 12.65MiB | 543.91KiB | 11.07KiB | 0 | 0.0419655 | False | False |
| datadog_agent_remap_datadog_logs | -506.62KiB | -0.81 | 100.00% | 61.17MiB | 279.83KiB | 5.73KiB | 0 | 0.00446658 | 60.67MiB | 3.95MiB | 82.36KiB | 0 | 0.0651559 | False | False |
| syslog_regex_logs2metric_ddmetrics | -131.53KiB | -1.03 | 100.00% | 12.52MiB | 635.63KiB | 12.95KiB | 0 | 0.0495824 | 12.39MiB | 509.03KiB | 10.38KiB | 0 | 0.0401185 | False | False |
| syslog_splunk_hec_logs | -211.83KiB | -1.28 | 100.00% | 16.12MiB | 881.87KiB | 17.95KiB | 0 | 0.0534003 | 15.92MiB | 865.95KiB | 17.63KiB | 0 | 0.0531176 | False | False |
| http_text_to_http_json | -615.72KiB | -1.57 | 100.00% | 38.42MiB | 816.69KiB | 16.67KiB | 0 | 0.0207568 | 37.81MiB | 1.16MiB | 24.15KiB | 0 | 0.0305482 | False | False |
| splunk_hec_route_s3 | -298.64KiB | -1.62 | 100.00% | 18.04MiB | 2.33MiB | 48.53KiB | 0 | 0.129137 | 17.75MiB | 2.3MiB | 48.03KiB | 0 | 0.129347 | False | False |
| http_pipelines_no_grok_blackhole | -283.41KiB | -2.53 | 100.00% | 10.95MiB | 63.77KiB | 1.3KiB | 0 | 0.00568599 | 10.67MiB | 1.03MiB | 21.5KiB | 0 | 0.0967261 | False | False |
| http_to_http_acks | -456.67KiB | -2.53 | 94.34% | 17.65MiB | 8.11MiB | 169.6KiB | 0 | 0.459505 | 17.2MiB | 8.1MiB | 169.09KiB | 0 | 0.470596 | True | True |
| syslog_log2metric_splunk_hec_metrics | -626.03KiB | -3.39 | 100.00% | 18.02MiB | 546.28KiB | 11.13KiB | 0 | 0.0295932 | 17.41MiB | 762.43KiB | 15.52KiB | 0 | 0.0427533 | False | False |
| syslog_humio_logs | -613.46KiB | -3.57 | 100.00% | 16.79MiB | 133.03KiB | 2.72KiB | 0 | 0.00773777 | 16.19MiB | 542.76KiB | 11.12KiB | 0 | 0.0327375 | False | False |
| datadog_agent_remap_datadog_logs_acks | -2.23MiB | -3.63 | 100.00% | 61.48MiB | 3.02MiB | 63.21KiB | 0 | 0.0491566 | 59.25MiB | 4.64MiB | 96.65KiB | 0 | 0.0783425 | False | False |
| socket_to_socket_blackhole | -889.8KiB | -3.65 | 100.00% | 23.8MiB | 194.37KiB | 3.97KiB | 0 | 0.00797235 | 22.93MiB | 107.48KiB | 2.19KiB | 0 | 0.00457538 | False | False |
I'm looking at the CEF v26 specification from here: https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/pdfdoc/cef-implementation-standard/cef-implementation-standard.pdf
I'm not very familiar with this format, but the documentation seems to imply the most common format is with a syslog prefix, which the current implementation doesn't support. It would be good to support that, or explain why it's not needed.
For example, I expected the following to parse correctly:
Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232
@fuchsnj this now works as expected.
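For readers unfamiliar with the format, here is a minimal Python sketch of the idea discussed above: skip whatever syslog prefix precedes the `CEF:` marker, then split the seven header fields from the extension. It is not the PR's Rust implementation. The function name `split_cef` is hypothetical, the header field names are taken from the `parse_cef` output shown in this thread, and escape handling is simplified (a pipe counts as escaped if it directly follows a backslash).

```python
import re

# Header field names as they appear in the parse_cef output in this thread.
HEADER_FIELDS = [
    "cefVersion", "deviceVendor", "deviceProduct", "deviceVersion",
    "deviceEventClassId", "name", "severity",
]

def split_cef(line):
    """Return (syslog_prefix, header_dict, raw_extension) for one CEF line.

    Hypothetical helper for illustration only.
    """
    start = line.find("CEF:")
    if start < 0:
        raise ValueError("no CEF: marker found")
    prefix = line[:start].strip()          # e.g. "Sep 29 08:26:10 host"
    body = line[start + len("CEF:"):]
    # Split on pipes not preceded by a backslash; at most 7 splits so the
    # extension (which may itself contain pipes) stays in one piece.
    parts = re.split(r"(?<!\\)\|", body, maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    header = dict(zip(HEADER_FIELDS, parts[:7]))
    extension = parts[7] if len(parts) > 7 else ""
    return prefix, header, extension
```

Called on the example line above, this would yield the prefix `Sep 29 08:26:10 host`, a header with `cefVersion` set to `1`, and the raw extension `src=10.0.0.1 dst=2.1.2.2 spt=1232`.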
Soak Test Results
Baseline: d498040a770ae2bb5c9d25efce62acadcb17ee57 Comparison: dfe24fd5af7722d726b8bd9670cb54ea8b8204d5 Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http_pipelines_blackhole_acks | 10.65KiB | 0.86 | 99.99% | 1.21MiB | 116.89KiB | 2.38KiB | 0 | 0.0943029 | 1.22MiB | 70.35KiB | 1.43KiB | 0 | 0.0562726 | False | False |
| syslog_loki | 127.0KiB | 0.86 | 100.00% | 14.37MiB | 265.59KiB | 5.44KiB | 0 | 0.0180406 | 14.5MiB | 748.05KiB | 15.21KiB | 0 | 0.0503778 | False | False |
| http_text_to_http_json | 48.6KiB | 0.12 | 94.31% | 38.14MiB | 905.12KiB | 18.48KiB | 0 | 0.0231674 | 38.19MiB | 862.46KiB | 17.6KiB | 0 | 0.0220482 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 5.23KiB | 0.02 | 38.59% | 23.83MiB | 380.09KiB | 7.77KiB | 0 | 0.0155712 | 23.84MiB | 336.19KiB | 6.86KiB | 0 | 0.0137699 | False | False |
| enterprise_http_to_http | 650.73B | 0 | 6.93% | 23.85MiB | 251.21KiB | 5.13KiB | 0 | 0.0102859 | 23.85MiB | 254.6KiB | 5.21KiB | 0 | 0.0104243 | False | False |
| http_pipelines_blackhole | -255.45B | -0.01 | 7.84% | 1.66MiB | 49.62KiB | 1.01KiB | 0 | 0.0291759 | 1.66MiB | 113.99KiB | 2.32KiB | 0 | 0.0670314 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | -7.48KiB | -0.03 | 25.34% | 23.77MiB | 788.62KiB | 16.05KiB | 0 | 0.0323959 | 23.76MiB | 818.49KiB | 16.66KiB | 0 | 0.0336332 | False | False |
| splunk_hec_indexer_ack_blackhole | -12.26KiB | -0.05 | 35.70% | 23.75MiB | 889.01KiB | 18.09KiB | 0 | 0.0365419 | 23.74MiB | 948.56KiB | 19.29KiB | 0 | 0.0390094 | False | False |
| file_to_blackhole | -48.17KiB | -0.05 | 39.37% | 95.35MiB | 3.04MiB | 63.01KiB | 0 | 0.0318738 | 95.3MiB | 3.32MiB | 69.02KiB | 0 | 0.0348198 | False | False |
| http_to_http_noack | -25.99KiB | -0.11 | 83.62% | 23.83MiB | 521.42KiB | 10.65KiB | 0 | 0.0213647 | 23.8MiB | 751.72KiB | 15.33KiB | 0 | 0.0308342 | False | False |
| http_to_http_json | -42.0KiB | -0.17 | 99.86% | 23.85MiB | 327.46KiB | 6.69KiB | 0 | 0.0134073 | 23.81MiB | 555.67KiB | 11.34KiB | 0 | 0.0227901 | False | False |
| fluent_elasticsearch | -157.2KiB | -0.19 | 100.00% | 79.47MiB | 52.52KiB | 1.06KiB | 0 | 0.000645199 | 79.32MiB | 1.56MiB | 32.06KiB | 0 | 0.0196179 | False | False |
| http_to_http_acks | -59.0KiB | -0.33 | 19.65% | 17.4MiB | 8.03MiB | 167.93KiB | 0 | 0.461386 | 17.34MiB | 8.03MiB | 167.35KiB | 0 | 0.462666 | True | True |
| syslog_log2metric_humio_metrics | -44.7KiB | -0.35 | 100.00% | 12.34MiB | 241.99KiB | 4.94KiB | 0 | 0.0191518 | 12.29MiB | 473.09KiB | 9.63KiB | 0 | 0.0375747 | False | False |
| datadog_agent_remap_blackhole_acks | -303.19KiB | -0.48 | 99.78% | 61.56MiB | 3.99MiB | 83.12KiB | 0 | 0.0647964 | 61.27MiB | 2.55MiB | 53.43KiB | 0 | 0.0416761 | False | False |
| splunk_hec_route_s3 | -104.64KiB | -0.56 | 86.98% | 18.11MiB | 2.38MiB | 49.49KiB | 0 | 0.131177 | 18.0MiB | 2.31MiB | 48.27KiB | 0 | 0.128191 | False | False |
| datadog_agent_remap_datadog_logs_acks | -451.39KiB | -0.71 | 100.00% | 62.24MiB | 2.8MiB | 58.63KiB | 0 | 0.045018 | 61.8MiB | 4.39MiB | 91.36KiB | 0 | 0.0710098 | False | False |
| datadog_agent_remap_blackhole | -441.06KiB | -0.8 | 93.62% | 53.91MiB | 7.92MiB | 165.07KiB | 0 | 0.146824 | 53.48MiB | 8.21MiB | 171.36KiB | 0 | 0.153427 | False | False |
| datadog_agent_remap_datadog_logs | -518.57KiB | -0.81 | 100.00% | 62.27MiB | 303.97KiB | 6.22KiB | 0 | 0.00476598 | 61.76MiB | 3.8MiB | 79.25KiB | 0 | 0.0615644 | False | False |
| syslog_splunk_hec_logs | -196.61KiB | -1.17 | 100.00% | 16.37MiB | 809.69KiB | 16.46KiB | 0 | 0.0482945 | 16.18MiB | 678.41KiB | 13.8KiB | 0 | 0.0409445 | False | False |
| syslog_regex_logs2metric_ddmetrics | -208.46KiB | -1.62 | 100.00% | 12.58MiB | 599.76KiB | 12.22KiB | 0 | 0.046555 | 12.37MiB | 444.36KiB | 9.06KiB | 0 | 0.0350598 | False | False |
| syslog_humio_logs | -305.38KiB | -1.81 | 100.00% | 16.45MiB | 494.52KiB | 10.1KiB | 0 | 0.0293471 | 16.15MiB | 468.01KiB | 9.59KiB | 0 | 0.0282863 | False | False |
| syslog_log2metric_splunk_hec_metrics | -449.62KiB | -2.43 | 100.00% | 18.08MiB | 488.35KiB | 9.96KiB | 0 | 0.026373 | 17.64MiB | 679.97KiB | 13.85KiB | 0 | 0.0376353 | False | False |
| http_pipelines_no_grok_blackhole | -287.41KiB | -2.58 | 100.00% | 10.89MiB | 53.36KiB | 1.09KiB | 0 | 0.00478409 | 10.61MiB | 990.72KiB | 20.16KiB | 0 | 0.0911755 | False | False |
| socket_to_socket_blackhole | -632.2KiB | -2.61 | 100.00% | 23.65MiB | 382.39KiB | 7.81KiB | 0 | 0.0157892 | 23.03MiB | 163.43KiB | 3.34KiB | 0 | 0.00692898 | False | False |
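As a reading aid for the tables in these reports, the two rules stated in the soak test explanation (coefficient of variation above 0.3 means erratic; changes below ±8.87% or below 90% confidence are not reported) can be sketched as follows. This is only an illustration of the stated thresholds, not the soak rig's actual code; the function names are hypothetical, and treating the 8.87% boundary as inclusive follows the `>=` in the summary line.

```python
KIB = 1024
MIB = 1024 * KIB

def coefficient_of_variation(mean, stdev):
    # CoV is the standard deviation relative to the mean.
    return stdev / mean

def is_erratic(mean, stdev, threshold=0.3):
    # The report flags an experiment as erratic when CoV exceeds 0.3.
    return coefficient_of_variation(mean, stdev) > threshold

def is_interesting(delta_mean_pct, confidence,
                   min_delta_pct=8.87, min_confidence=0.90):
    # A change is reported only if it is both confident and large enough.
    return confidence >= min_confidence and abs(delta_mean_pct) >= min_delta_pct

# http_to_http_acks baseline above: mean 17.4MiB, stdev 8.03MiB.
print(is_erratic(17.4 * MIB, 8.03 * MIB))
# socket_to_socket_blackhole above: -2.61% at 100% confidence.
print(is_interesting(-2.61, 1.0))
```

With the rounded figures from the table, `http_to_http_acks` comes out at a CoV of about 0.46, which is why it is the only row marked erratic, while the -2.61% change for `socket_to_socket_blackhole` stays under the reporting threshold.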
Example being the `syslog` source rejects invalid messages when it fails to decode, and the case of "pseudo-syslog" - you can use the `socket` source and handle the decoding in VRL.
Yes, indeed we parse with VRL + socket source instead of the syslog source directly, because we need to log parsing errors properly, i.e. log the offending packet and source peer. This is something the sources in general are not very good at at the moment (see #7750) :(
So let's make this translation a separate feature after all. It seems it will require an option so that it's opt-in. Once active, it can change the type definition of the parser so that it returns string and array. Or at least that seems like one way to do it.
I fully agree that it's better to move this part to another PR so this one can make progress. It does look like that part needs more discussion. Apologies for the noise!
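For context on what is being deferred, the key-label construction from the PR description (`c6a1` / `c6a1Label` becoming `key1: value1`) could look roughly like the Python sketch below. This assumes "this translation" refers to that label folding. The function name `fold_label_fields` is hypothetical, nothing like it is part of this PR, and the sketch ignores open questions such as name collisions and the resulting type definition.

```python
def fold_label_fields(fields):
    """Rename each `<key>` to the value of its `<key>Label` companion.

    Hypothetical post-processing step over an already parsed CEF object;
    keys without a companion label are passed through unchanged.
    """
    out = {}
    for key, value in fields.items():
        if key.endswith("Label") and key[: -len("Label")] in fields:
            # This entry only names another field, so it is consumed.
            continue
        label = fields.get(key + "Label")
        out[label if label else key] = value
    return out

print(fold_label_fields({"c6a1": "value1", "c6a1Label": "key1"}))
# {'key1': 'value1'}
```

Keeping this as an opt-in step outside the parser leaves the default output of `parse_cef` unchanged.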
Soak Test Results
Baseline: 512da4076a67a229996f96e51e711dc2af37dcf2 Comparison: 8117579af027460f10ea0bbb5956163dd1aa3a2a Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http_to_http_acks | 287.83KiB | 1.66 | 77.76% | 16.96MiB | 7.98MiB | 166.96KiB | 0 | 0.470654 | 17.24MiB | 7.98MiB | 166.57KiB | 0 | 0.462744 | True | True |
| http_pipelines_blackhole_acks | 15.37KiB | 1.24 | 100.00% | 1.21MiB | 103.73KiB | 2.11KiB | 0 | 0.0839684 | 1.22MiB | 91.41KiB | 1.86KiB | 0 | 0.0730894 | False | False |
| syslog_loki | 104.39KiB | 0.72 | 100.00% | 14.16MiB | 406.01KiB | 8.32KiB | 0 | 0.0279889 | 14.27MiB | 722.48KiB | 14.69KiB | 0 | 0.0494497 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | 22.03KiB | 0.09 | 64.20% | 23.75MiB | 871.27KiB | 17.72KiB | 0 | 0.0358235 | 23.77MiB | 792.59KiB | 16.13KiB | 0 | 0.0325592 | False | False |
| splunk_hec_indexer_ack_blackhole | 22.42KiB | 0.09 | 64.97% | 23.75MiB | 876.66KiB | 17.83KiB | 0 | 0.0360423 | 23.77MiB | 789.74KiB | 16.07KiB | 0 | 0.0324387 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 23.0KiB | 0.09 | 93.56% | 23.82MiB | 503.11KiB | 10.27KiB | 0 | 0.0206241 | 23.84MiB | 343.25KiB | 7.01KiB | 0 | 0.0140575 | False | False |
| enterprise_http_to_http | 55.43B | 0 | 0.59% | 23.85MiB | 253.86KiB | 5.18KiB | 0 | 0.010394 | 23.85MiB | 254.69KiB | 5.21KiB | 0 | 0.0104279 | False | False |
| file_to_blackhole | -65.46KiB | -0.07 | 55.98% | 95.35MiB | 2.72MiB | 56.41KiB | 0 | 0.0285348 | 95.29MiB | 3.04MiB | 63.31KiB | 0 | 0.0319455 | False | False |
| http_to_http_json | -33.03KiB | -0.14 | 99.16% | 23.85MiB | 335.84KiB | 6.86KiB | 0 | 0.0137507 | 23.81MiB | 512.63KiB | 10.47KiB | 0 | 0.0210177 | False | False |
| fluent_elasticsearch | -159.74KiB | -0.2 | 100.00% | 79.47MiB | 54.02KiB | 1.09KiB | 0 | 0.000663709 | 79.32MiB | 1.41MiB | 28.9KiB | 0 | 0.0177107 | False | False |
| http_to_http_noack | -76.91KiB | -0.32 | 99.85% | 23.83MiB | 508.86KiB | 10.4KiB | 0 | 0.0208505 | 23.75MiB | 1.05MiB | 21.9KiB | 0 | 0.0442039 | False | False |
| syslog_splunk_hec_logs | -155.74KiB | -0.96 | 100.00% | 15.89MiB | 812.2KiB | 16.52KiB | 0 | 0.0499162 | 15.73MiB | 620.69KiB | 12.67KiB | 0 | 0.0385151 | False | False |
| datadog_agent_remap_blackhole | -631.07KiB | -1.06 | 100.00% | 58.27MiB | 3.79MiB | 78.9KiB | 0 | 0.0649641 | 57.66MiB | 3.22MiB | 67.08KiB | 0 | 0.055759 | False | False |
| syslog_humio_logs | -190.72KiB | -1.18 | 100.00% | 15.83MiB | 666.56KiB | 13.61KiB | 0 | 0.0411163 | 15.64MiB | 588.87KiB | 12.05KiB | 0 | 0.0367565 | False | False |
| syslog_regex_logs2metric_ddmetrics | -163.06KiB | -1.29 | 100.00% | 12.37MiB | 614.64KiB | 12.53KiB | 0 | 0.0485166 | 12.21MiB | 584.3KiB | 11.91KiB | 0 | 0.0467231 | False | False |
| http_pipelines_blackhole | -28.02KiB | -1.65 | 100.00% | 1.66MiB | 55.85KiB | 1.14KiB | 0 | 0.0327915 | 1.64MiB | 123.86KiB | 2.52KiB | 0 | 0.0739303 | False | False |
| syslog_log2metric_splunk_hec_metrics | -324.59KiB | -1.75 | 100.00% | 18.09MiB | 614.55KiB | 12.52KiB | 0 | 0.0331675 | 17.77MiB | 812.45KiB | 16.53KiB | 0 | 0.0446304 | False | False |
| syslog_log2metric_humio_metrics | -247.28KiB | -1.88 | 100.00% | 12.82MiB | 197.12KiB | 4.03KiB | 0 | 0.0150086 | 12.58MiB | 508.37KiB | 10.35KiB | 0 | 0.039451 | False | False |
| splunk_hec_route_s3 | -407.38KiB | -2.11 | 100.00% | 18.84MiB | 2.29MiB | 47.8KiB | 0 | 0.121755 | 18.44MiB | 2.21MiB | 46.29KiB | 0 | 0.119914 | False | False |
| datadog_agent_remap_datadog_logs | -1.32MiB | -2.13 | 100.00% | 61.98MiB | 456.92KiB | 9.35KiB | 0 | 0.00719798 | 60.66MiB | 4.06MiB | 84.54KiB | 0 | 0.0669117 | False | False |
| http_pipelines_no_grok_blackhole | -257.15KiB | -2.34 | 100.00% | 10.73MiB | 332.93KiB | 6.8KiB | 0 | 0.0302982 | 10.48MiB | 1.08MiB | 22.58KiB | 0 | 0.103532 | False | False |
| datadog_agent_remap_datadog_logs_acks | -1.5MiB | -2.43 | 100.00% | 61.57MiB | 3.08MiB | 64.37KiB | 0 | 0.0500053 | 60.07MiB | 4.23MiB | 88.11KiB | 0 | 0.0704481 | False | False |
| datadog_agent_remap_blackhole_acks | -1.46MiB | -2.47 | 100.00% | 58.94MiB | 4.24MiB | 88.34KiB | 0 | 0.0719742 | 57.48MiB | 2.91MiB | 60.9KiB | 0 | 0.0506441 | False | False |
| http_text_to_http_json | -1.85MiB | -4.67 | 100.00% | 39.57MiB | 1.11MiB | 23.17KiB | 0 | 0.0280107 | 37.73MiB | 1005.7KiB | 20.54KiB | 0 | 0.0260275 | False | False |
| socket_to_socket_blackhole | -1.49MiB | -6.06 | 100.00% | 24.6MiB | 289.29KiB | 5.91KiB | 0 | 0.0114829 | 23.11MiB | 124.6KiB | 2.54KiB | 0 | 0.0052645 | False | False |
Soak Test Results
Baseline: 9cf1ea9b08ed745e3872c1cc81757f6078c82419 Comparison: acb41109c3d5e6839de9a32542e1f75fc58433bf Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http_pipelines_blackhole_acks | 20.56KiB | 1.69 | 100.00% | 1.19MiB | 137.87KiB | 2.8KiB | 0 | 0.11345 | 1.21MiB | 99.87KiB | 2.04KiB | 0 | 0.0808164 | False | False |
| http_pipelines_blackhole | 16.22KiB | 0.97 | 100.00% | 1.62MiB | 106.62KiB | 2.18KiB | 0 | 0.0640698 | 1.64MiB | 144.67KiB | 2.95KiB | 0 | 0.0860995 | False | False |
| socket_to_socket_blackhole | 31.99KiB | 0.14 | 72.53% | 22.6MiB | 995.23KiB | 20.32KiB | 0 | 0.0429963 | 22.63MiB | 1.01MiB | 21.09KiB | 0 | 0.044557 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | 19.3KiB | 0.08 | 57.46% | 23.75MiB | 880.16KiB | 17.9KiB | 0 | 0.0361888 | 23.77MiB | 801.26KiB | 16.31KiB | 0 | 0.0329186 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 9.13KiB | 0.04 | 59.95% | 23.83MiB | 416.36KiB | 8.51KiB | 0 | 0.017058 | 23.84MiB | 330.17KiB | 6.74KiB | 0 | 0.0135217 | False | False |
| enterprise_http_to_http | -1.39KiB | -0.01 | 15.28% | 23.85MiB | 247.79KiB | 5.06KiB | 0 | 0.0101454 | 23.85MiB | 250.84KiB | 5.13KiB | 0 | 0.0102708 | False | False |
| splunk_hec_indexer_ack_blackhole | -1.66KiB | -0.01 | 5.14% | 23.75MiB | 883.93KiB | 17.98KiB | 0 | 0.0363378 | 23.75MiB | 906.22KiB | 18.43KiB | 0 | 0.0372565 | False | False |
| file_to_blackhole | -54.77KiB | -0.06 | 43.16% | 95.34MiB | 3.03MiB | 62.74KiB | 0 | 0.031738 | 95.29MiB | 3.49MiB | 72.67KiB | 0 | 0.0366625 | False | False |
| http_to_http_json | -26.36KiB | -0.11 | 97.46% | 23.85MiB | 333.92KiB | 6.82KiB | 0 | 0.0136714 | 23.82MiB | 470.3KiB | 9.62KiB | 0 | 0.0192759 | False | False |
| fluent_elasticsearch | -182.93KiB | -0.22 | 100.00% | 79.47MiB | 53.72KiB | 1.09KiB | 0 | 0.000660026 | 79.29MiB | 1.58MiB | 32.55KiB | 0 | 0.0199407 | False | False |
| datadog_agent_remap_blackhole_acks | -168.45KiB | -0.28 | 89.49% | 58.75MiB | 4.13MiB | 85.94KiB | 0 | 0.0702393 | 58.59MiB | 2.8MiB | 58.44KiB | 0 | 0.0477071 | False | False |
| http_to_http_acks | -64.77KiB | -0.36 | 21.36% | 17.34MiB | 8.14MiB | 170.18KiB | 0 | 0.469447 | 17.27MiB | 8.04MiB | 167.86KiB | 0 | 0.465609 | True | True |
| http_to_http_noack | -122.45KiB | -0.5 | 100.00% | 23.84MiB | 408.45KiB | 8.35KiB | 0 | 0.01673 | 23.72MiB | 1.23MiB | 25.68KiB | 0 | 0.0519597 | False | False |
| syslog_regex_logs2metric_ddmetrics | -79.45KiB | -0.62 | 100.00% | 12.45MiB | 617.85KiB | 12.58KiB | 0 | 0.048442 | 12.38MiB | 549.4KiB | 11.2KiB | 0 | 0.0433459 | False | False |
| splunk_hec_route_s3 | -145.12KiB | -0.78 | 97.02% | 18.21MiB | 2.28MiB | 47.44KiB | 0 | 0.125066 | 18.07MiB | 2.25MiB | 46.99KiB | 0 | 0.124455 | False | False |
| syslog_loki | -116.12KiB | -0.8 | 100.00% | 14.14MiB | 627.25KiB | 12.83KiB | 0 | 0.043312 | 14.03MiB | 822.42KiB | 16.72KiB | 0 | 0.0572474 | False | False |
| syslog_splunk_hec_logs | -136.85KiB | -0.83 | 100.00% | 16.17MiB | 744.94KiB | 15.16KiB | 0 | 0.0449838 | 16.03MiB | 520.99KiB | 10.64KiB | 0 | 0.0317228 | False | False |
| datadog_agent_remap_datadog_logs_acks | -823.21KiB | -1.29 | 100.00% | 62.42MiB | 3.14MiB | 65.65KiB | 0 | 0.0503053 | 61.61MiB | 4.37MiB | 90.97KiB | 0 | 0.0709139 | False | False |
| syslog_humio_logs | -256.51KiB | -1.51 | 100.00% | 16.6MiB | 259.69KiB | 5.3KiB | 0 | 0.0152771 | 16.35MiB | 259.09KiB | 5.3KiB | 0 | 0.0154754 | False | False |
| http_pipelines_no_grok_blackhole | -174.14KiB | -1.56 | 100.00% | 10.89MiB | 257.58KiB | 5.26KiB | 0 | 0.0231027 | 10.72MiB | 968.39KiB | 19.71KiB | 0 | 0.088236 | False | False |
| datadog_agent_remap_datadog_logs | -1003.85KiB | -1.61 | 100.00% | 60.76MiB | 1.75MiB | 36.67KiB | 0 | 0.0287631 | 59.78MiB | 4.31MiB | 89.72KiB | 0 | 0.0720615 | False | False |
| syslog_log2metric_splunk_hec_metrics | -299.09KiB | -1.67 | 100.00% | 17.48MiB | 835.59KiB | 17.02KiB | 0 | 0.0466667 | 17.19MiB | 941.02KiB | 19.14KiB | 0 | 0.053448 | False | False |
| datadog_agent_remap_blackhole | -1.24MiB | -2.11 | 100.00% | 59.05MiB | 4.7MiB | 97.93KiB | 0 | 0.0795947 | 57.8MiB | 3.58MiB | 74.71KiB | 0 | 0.0618756 | False | False |
| http_text_to_http_json | -1.09MiB | -2.76 | 100.00% | 39.57MiB | 744.78KiB | 15.2KiB | 0 | 0.0183758 | 38.48MiB | 832.95KiB | 17.01KiB | 0 | 0.0211342 | False | False |
| syslog_log2metric_humio_metrics | -498.83KiB | -3.85 | 100.00% | 12.65MiB | 289.72KiB | 5.91KiB | 0 | 0.0223601 | 12.16MiB | 781.76KiB | 15.9KiB | 0 | 0.0627512 | False | False |
Soak Test Results
Baseline: 9cf1ea9b08ed745e3872c1cc81757f6078c82419 Comparison: 8c3de2a6b624ec0fe7713567fe341a0f1158b24f Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| socket_to_socket_blackhole | 747.72KiB | 3.43 | 100.00% | 21.3MiB | 2.01MiB | 41.96KiB | 0 | 0.0942174 | 22.03MiB | 1.72MiB | 35.92KiB | 0 | 0.0779846 | False | False |
| http_pipelines_blackhole_acks | 11.79KiB | 0.94 | 100.00% | 1.22MiB | 104.27KiB | 2.12KiB | 0 | 0.0833092 | 1.23MiB | 93.67KiB | 1.91KiB | 0 | 0.0741388 | False | False |
| http_to_http_acks | 54.79KiB | 0.31 | 18.26% | 17.33MiB | 7.98MiB | 166.85KiB | 0 | 0.460437 | 17.39MiB | 8.09MiB | 168.66KiB | 0 | 0.465062 | True | True |
| syslog_loki | 39.51KiB | 0.27 | 98.12% | 14.46MiB | 383.41KiB | 7.85KiB | 0 | 0.0258889 | 14.5MiB | 731.57KiB | 14.87KiB | 0 | 0.049266 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 32.38KiB | 0.13 | 98.30% | 23.81MiB | 574.25KiB | 11.71KiB | 0 | 0.0235504 | 23.84MiB | 334.42KiB | 6.83KiB | 0 | 0.0136966 | False | False |
| syslog_log2metric_humio_metrics | 2.27KiB | 0.02 | 16.58% | 12.52MiB | 218.43KiB | 4.46KiB | 0 | 0.0170287 | 12.53MiB | 486.38KiB | 9.91KiB | 0 | 0.0379108 | False | False |
| splunk_hec_indexer_ack_blackhole | -1.21KiB | -0 | 3.88% | 23.75MiB | 868.24KiB | 17.66KiB | 0 | 0.0356912 | 23.75MiB | 863.39KiB | 17.56KiB | 0 | 0.0354933 | False | False |
| enterprise_http_to_http | 422.28B | 0 | 4.47% | 23.84MiB | 254.73KiB | 5.2KiB | 0 | 0.0104301 | 23.85MiB | 254.67KiB | 5.21KiB | 0 | 0.0104274 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | -3.72KiB | -0.02 | 12.06% | 23.75MiB | 841.97KiB | 17.13KiB | 0 | 0.0346083 | 23.75MiB | 861.71KiB | 17.53KiB | 0 | 0.035425 | False | False |
| file_to_blackhole | -58.69KiB | -0.06 | 48.23% | 95.36MiB | 2.83MiB | 58.7KiB | 0 | 0.0296878 | 95.3MiB | 3.33MiB | 69.18KiB | 0 | 0.0348893 | False | False |
| datadog_agent_remap_blackhole | -82.01KiB | -0.13 | 55.54% | 61.22MiB | 4.18MiB | 87.13KiB | 0 | 0.0683079 | 61.14MiB | 3.0MiB | 62.57KiB | 0 | 0.0490487 | False | False |
| http_to_http_json | -38.04KiB | -0.16 | 99.65% | 23.84MiB | 345.85KiB | 7.06KiB | 0 | 0.0141617 | 23.81MiB | 535.13KiB | 10.92KiB | 0 | 0.0219462 | False | False |
| fluent_elasticsearch | -206.94KiB | -0.25 | 100.00% | 79.47MiB | 52.35KiB | 1.06KiB | 0 | 0.000643189 | 79.27MiB | 1.77MiB | 36.49KiB | 0 | 0.0223655 | False | False |
| http_to_http_noack | -61.85KiB | -0.25 | 99.35% | 23.83MiB | 515.0KiB | 10.53KiB | 0 | 0.0211007 | 23.77MiB | 987.13KiB | 20.11KiB | 0 | 0.0405479 | False | False |
| http_pipelines_blackhole | -4.88KiB | -0.29 | 88.04% | 1.64MiB | 71.61KiB | 1.46KiB | 0 | 0.042531 | 1.64MiB | 135.92KiB | 2.77KiB | 0 | 0.0809564 | False | False |
| syslog_regex_logs2metric_ddmetrics | -71.58KiB | -0.55 | 100.00% | 12.6MiB | 631.77KiB | 12.87KiB | 0 | 0.0489578 | 12.53MiB | 495.86KiB | 10.11KiB | 0 | 0.0386401 | False | False |
| datadog_agent_remap_blackhole_acks | -361.51KiB | -0.57 | 99.92% | 61.99MiB | 4.14MiB | 86.28KiB | 0 | 0.0668152 | 61.63MiB | 3.09MiB | 64.57KiB | 0 | 0.0501094 | False | False |
| datadog_agent_remap_datadog_logs | -687.27KiB | -1.1 | 100.00% | 61.26MiB | 811.04KiB | 16.59KiB | 0 | 0.0129256 | 60.59MiB | 4.08MiB | 84.95KiB | 0 | 0.067311 | False | False |
| datadog_agent_remap_datadog_logs_acks | -803.9KiB | -1.26 | 100.00% | 62.24MiB | 3.43MiB | 71.62KiB | 0 | 0.0550889 | 61.45MiB | 4.36MiB | 90.76KiB | 0 | 0.0709335 | False | False |
| syslog_splunk_hec_logs | -210.39KiB | -1.27 | 100.00% | 16.22MiB | 952.33KiB | 19.37KiB | 0 | 0.0573089 | 16.02MiB | 803.0KiB | 16.38KiB | 0 | 0.0489421 | False | False |
| syslog_humio_logs | -245.07KiB | -1.42 | 100.00% | 16.8MiB | 112.3KiB | 2.29KiB | 0 | 0.00652474 | 16.57MiB | 106.35KiB | 2.18KiB | 0 | 0.00626818 | False | False |
| http_pipelines_no_grok_blackhole | -180.72KiB | -1.62 | 100.00% | 10.89MiB | 89.12KiB | 1.82KiB | 0 | 0.00798668 | 10.72MiB | 1.06MiB | 22.16KiB | 0 | 0.0993002 | False | False |
| splunk_hec_route_s3 | -328.95KiB | -1.69 | 100.00% | 18.98MiB | 2.23MiB | 46.41KiB | 0 | 0.117373 | 18.66MiB | 2.24MiB | 46.78KiB | 0 | 0.119893 | False | False |
| syslog_log2metric_splunk_hec_metrics | -316.27KiB | -1.76 | 100.00% | 17.52MiB | 635.89KiB | 12.97KiB | 0 | 0.035443 | 17.21MiB | 899.86KiB | 18.3KiB | 0 | 0.0510564 | False | False |
| http_text_to_http_json | -1.13MiB | -2.87 | 100.00% | 39.36MiB | 798.55KiB | 16.3KiB | 0 | 0.0198089 | 38.23MiB | 867.51KiB | 17.72KiB | 0 | 0.0221545 | False | False |
Soak Test Results
Baseline: 197ed5b27452aee5b51ba4db2443ca3ac1814634 Comparison: ef19c4faf1699cb058e301c57718fcf744b34422 Total Vector CPUs: 4
No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:
Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| socket_to_socket_blackhole | 491.96KiB | 2.12 | 100.00% | 22.7MiB | 118.06KiB | 2.41KiB | 0 | 0.00507806 | 23.18MiB | 105.45KiB | 2.15KiB | 0 | 0.0044418 | False | False |
| http_pipelines_blackhole_acks | 16.74KiB | 1.38 | 100.00% | 1.19MiB | 119.31KiB | 2.43KiB | 0 | 0.098001 | 1.2MiB | 72.6KiB | 1.48KiB | 0 | 0.0588225 | False | False |
| syslog_log2metric_splunk_hec_metrics | 224.39KiB | 1.32 | 100.00% | 16.54MiB | 1.06MiB | 22.15KiB | 0 | 0.0641589 | 16.76MiB | 1.05MiB | 21.9KiB | 0 | 0.0626851 | False | False |
| syslog_splunk_hec_logs | 54.01KiB | 0.33 | 98.12% | 15.77MiB | 853.49KiB | 17.36KiB | 0 | 0.0528559 | 15.82MiB | 738.48KiB | 15.06KiB | 0 | 0.0455808 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 47.03KiB | 0.19 | 99.80% | 23.79MiB | 666.62KiB | 13.58KiB | 0 | 0.027354 | 23.84MiB | 335.37KiB | 6.85KiB | 0 | 0.013735 | False | False |
| splunk_hec_indexer_ack_blackhole | 15.58KiB | 0.06 | 46.60% | 23.74MiB | 889.03KiB | 18.08KiB | 0 | 0.0365617 | 23.76MiB | 852.14KiB | 17.34KiB | 0 | 0.0350223 | False | False |
| enterprise_http_to_http | -760.44B | -0 | 8.00% | 23.85MiB | 256.17KiB | 5.23KiB | 0 | 0.0104887 | 23.85MiB | 255.94KiB | 5.23KiB | 0 | 0.0104796 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | 0B | -0 | 0.00% | 23.74MiB | 892.73KiB | 18.15KiB | 0 | 0.0367117 | 23.74MiB | 900.69KiB | 18.31KiB | 0 | 0.0370391 | False | False |
| syslog_humio_logs | -3.56KiB | -0.02 | 53.07% | 16.49MiB | 194.81KiB | 3.98KiB | 0 | 0.0115376 | 16.48MiB | 141.54KiB | 2.9KiB | 0 | 0.00838435 | False | False |
| file_to_blackhole | -55.68KiB | -0.06 | 46.03% | 95.35MiB | 2.87MiB | 59.42KiB | 0 | 0.0300581 | 95.29MiB | 3.3MiB | 68.64KiB | 0 | 0.0346178 | False | False |
| http_pipelines_blackhole | -1.21KiB | -0.07 | 36.60% | 1.68MiB | 41.7KiB | 872.79B | 0 | 0.0242936 | 1.67MiB | 117.12KiB | 2.39KiB | 0 | 0.068274 | False | False |
| http_to_http_json | -34.31KiB | -0.14 | 99.38% | 23.85MiB | 327.0KiB | 6.68KiB | 0 | 0.0133884 | 23.81MiB | 518.66KiB | 10.59KiB | 0 | 0.0212654 | False | False |
| syslog_regex_logs2metric_ddmetrics | -28.72KiB | -0.23 | 97.22% | 12.01MiB | 453.97KiB | 9.25KiB | 0 | 0.0369163 | 11.98MiB | 451.63KiB | 9.2KiB | 0 | 0.0368117 | False | False |
| http_to_http_noack | -62.05KiB | -0.25 | 99.40% | 23.83MiB | 514.2KiB | 10.5KiB | 0 | 0.021068 | 23.77MiB | 981.09KiB | 19.99KiB | 0 | 0.0403001 | False | False |
| http_to_http_acks | -46.25KiB | -0.26 | 15.65% | 17.3MiB | 8.0MiB | 167.38KiB | 0 | 0.462494 | 17.26MiB | 7.85MiB | 163.89KiB | 0 | 0.454519 | True | True |
| fluent_elasticsearch | -215.31KiB | -0.26 | 100.00% | 79.47MiB | 54.25KiB | 1.1KiB | 0 | 0.000666549 | 79.26MiB | 2.25MiB | 46.2KiB | 0 | 0.0283342 | False | False |
| datadog_agent_remap_blackhole_acks | -189.65KiB | -0.31 | 91.55% | 59.91MiB | 4.42MiB | 92.01KiB | 0 | 0.0737631 | 59.73MiB | 2.88MiB | 60.13KiB | 0 | 0.04813 | False | False |
| syslog_loki | -144.53KiB | -0.99 | 100.00% | 14.3MiB | 485.42KiB | 9.95KiB | 0 | 0.0331493 | 14.16MiB | 863.07KiB | 17.54KiB | 0 | 0.0595269 | False | False |
| datadog_agent_remap_datadog_logs_acks | -742.73KiB | -1.18 | 100.00% | 61.31MiB | 3.09MiB | 64.52KiB | 0 | 0.050335 | 60.59MiB | 4.29MiB | 89.24KiB | 0 | 0.0707459 | False | False |
| splunk_hec_route_s3 | -301.18KiB | -1.57 | 100.00% | 18.73MiB | 2.27MiB | 47.25KiB | 0 | 0.121192 | 18.43MiB | 2.22MiB | 46.53KiB | 0 | 0.12064 | False | False |
| http_pipelines_no_grok_blackhole | -276.94KiB | -2.57 | 100.00% | 10.54MiB | 596.16KiB | 12.17KiB | 0 | 0.0552289 | 10.27MiB | 1.17MiB | 24.39KiB | 0 | 0.114079 | False | False |
| http_text_to_http_json | -1.07MiB | -2.9 | 100.00% | 36.87MiB | 2.51MiB | 52.37KiB | 0 | 0.0679495 | 35.8MiB | 2.72MiB | 56.72KiB | 0 | 0.07583 | False | False |
| datadog_agent_remap_datadog_logs | -1.99MiB | -3.4 | 100.00% | 58.48MiB | 4.6MiB | 96.54KiB | 0 | 0.0786976 | 56.49MiB | 6.22MiB | 129.5KiB | 0 | 0.110058 | False | False |
| syslog_log2metric_humio_metrics | -441.36KiB | -3.6 | 100.00% | 11.98MiB | 903.2KiB | 18.44KiB | 0 | 0.0736128 | 11.55MiB | 958.02KiB | 19.51KiB | 0 | 0.0809946 | False | False |
| datadog_agent_remap_blackhole | -2.67MiB | -4.43 | 100.00% | 60.2MiB | 4.23MiB | 88.26KiB | 0 | 0.0703133 | 57.54MiB | 3.89MiB | 81.18KiB | 0 | 0.0676018 | False | False |
@ktff today I had to help fix some parsing errors in our regex-based CEF processing pipeline. I couldn't help but think that this PR will improve our lives dramatically. This feed in particular is ~40K EPS of CEF data (PaloAlto devices), and I will definitely test your `parse_cef()` implementation with it.
I noticed two particularities in the CEF data today that I wanted to bring up for you to consider (if you haven't already). Unfortunately I don't currently have the means to build/test your PR myself.
The first case is CEF extension fields with empty values. For example `app= msg= act=`.
The second case is more funky: CEF extension fields with quoted values: `msg="Some message."`.
For the first case, I wonder whether your implementation will return empty-valued keys or discard them entirely.
And for the second case, will `parse_cef()` strip the quotes from the value? That would be really nice.
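To make the two cases concrete, here is a small Python sketch of one possible tolerant behaviour: keep empty-valued keys and strip one pair of surrounding double quotes. It illustrates the question only and does not describe how `parse_cef` behaves. The name `parse_extension` and the `strip_quotes` flag are hypothetical. A value is assumed to run until the next whitespace-separated `key=`, CEF escape sequences such as `\=` are left untouched, and surrounding whitespace is trimmed.

```python
import re

# A key is a run of word characters at the start of the string or after
# whitespace, immediately followed by "=".
KEY = re.compile(r"(?<!\S)(\w+)=")

def parse_extension(ext, strip_quotes=True):
    """Parse `key=value key=value ...` into a dict, tolerating empty values."""
    matches = list(KEY.finditer(ext))
    fields = {}
    for i, m in enumerate(matches):
        # The value ends where the next key starts, or at the end of input.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(ext)
        value = ext[m.end():end].strip()
        if strip_quotes and len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        fields[m.group(1)] = value
    return fields

print(parse_extension('app= msg="Some message." act='))
# {'app': '', 'msg': 'Some message.', 'act': ''}
```

Whether trimming whitespace and stripping quotes should be the default is a design choice; a quoted value that itself contains ` key=` would still be split by this sketch.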
- Example with empty extension values (fails to parse)
$ parse_cef!("Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst= spt=")
function call error for "parse_cef" at (0:123): Could not parse whole line successfully
- Example with quoted extension value (quotes are kept)
$ parse_cef!("Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=\"2.1.2.2\" spt=\"1232\"")
{ "cefVersion": "1", "deviceEventClassId": "100", "deviceProduct": "threatmanager", "deviceVendor": "Security", "deviceVersion": "1.0", "dst": "\"2.1.2.2\"", "name": "worm successfully stopped", "severity": "10", "spt": "\"1232\"", "src": "10.0.0.1" }
Another edge case I noticed: empty extensions at the end seem to work fine, but empty extensions that are not at the end (example 1 above) fail.
$ parse_cef!("Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 spt=")
{ "cefVersion": "1", "deviceEventClassId": "100", "deviceProduct": "threatmanager", "deviceVendor": "Security", "deviceVersion": "1.0", "name": "worm successfully stopped", "severity": "10", "spt": "", "src": "10.0.0.1" }
However, if an "empty" extension contains multiple spaces, it parses again, capturing the spaces as the value:
$ parse_cef!("Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src= spt=")
{ "cefVersion": "1", "deviceEventClassId": "100", "deviceProduct": "threatmanager", "deviceVendor": "Security", "deviceVersion": "1.0", "name": "worm successfully stopped", "severity": "10", "spt": "", "src": " " }
- I think supporting case 1 above (empty extension values) is reasonable, even though it is not mentioned in the spec. It should also close the edge cases mentioned above.
- I'm a bit on the fence about stripping quotes from quoted values, since they are not mentioned in the spec and a user could conceivably prefer to keep the quotes. I'm open to discussion, or to adding this as an option (potentially on by default).
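To make the two proposals above concrete, here is a minimal Rust sketch of an extension parser that accepts empty values anywhere in the line and strips surrounding quotes only when asked. It is not the code in this PR: `parse_extensions` and its `strip_quotes` flag are hypothetical names used for illustration, and the sketch assumes keys consist of ASCII alphanumerics or `_` and are separated from the previous value by a single space. It does not unescape `\=` or `\\`, and it does not protect a `key=` sequence that appears inside a quoted value.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not the PR's implementation): parses the extension
/// part of a CEF line, e.g. `src=10.0.0.1 dst= spt=`, into key/value pairs.
/// Empty values are kept; quotes are stripped only if `strip_quotes` is true.
fn parse_extensions(input: &str, strip_quotes: bool) -> BTreeMap<String, String> {
    let bytes = input.as_bytes();

    // Collect (key_start, eq_index) for every unescaped `=` that directly
    // follows a key token located at the start of input or after a space.
    let mut keys: Vec<(usize, usize)> = Vec::new();
    for (i, &b) in bytes.iter().enumerate() {
        if b != b'=' || (i > 0 && bytes[i - 1] == b'\\') {
            continue;
        }
        let mut start = i;
        while start > 0
            && (bytes[start - 1].is_ascii_alphanumeric() || bytes[start - 1] == b'_')
        {
            start -= 1;
        }
        let at_boundary = start == 0 || bytes[start - 1] == b' ';
        if start < i && at_boundary {
            keys.push((start, i));
        }
    }

    let mut out = BTreeMap::new();
    for (n, &(start, eq)) in keys.iter().enumerate() {
        // A value runs up to the start of the next key, or to end of input.
        let value_end = keys
            .get(n + 1)
            .map(|&(next_start, _)| next_start)
            .unwrap_or(input.len());
        let mut value = &input[eq + 1..value_end];

        // Drop only the single space that separates this value from the
        // next key, so `dst= spt=` yields an empty `dst`.
        if n + 1 < keys.len() {
            value = value.strip_suffix(' ').unwrap_or(value);
        }

        // Optional behaviour: remove one pair of surrounding double quotes.
        if strip_quotes {
            value = value
                .strip_prefix('"')
                .and_then(|v| v.strip_suffix('"'))
                .unwrap_or(value);
        }

        out.insert(input[start..eq].to_string(), value.to_string());
    }
    out
}
```

Under these assumptions, `parse_extensions("src=10.0.0.1 dst= spt=", false)` keeps `dst` and `spt` as empty strings instead of failing, and calling it with `strip_quotes` set to `true` on `dst="2.1.2.2"` returns `2.1.2.2` without the quotes. Keeping the quote handling behind a flag mirrors the idea of exposing it as an option rather than hard-coding either behaviour.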