benchmark-wrapper
Add CLI Option to Purge Empty Fields in ES
Currently we don't check whether a field is non-empty before shipping documents off to ES. This can lead to a lot of empty fields being sent out, depending on the use case and the benchmark. For instance, here is a document from a Uperf CI test run by ripsaw, where most of the fields are populated through environment variables:
{
  "_index" : "ripsaw-uperf-results-000002",
  "_type" : "_doc",
  "_id" : "297e5bb466870ad7f90916e68f60c440a43be6dbe0707ab6f3c4d7e30007e807",
  "_score" : 7.654378,
  "_source" : {
    "workload" : "uperf",
    "uuid" : "79690e49-479e-593a-8a51-0a1ef032de88",
    "user" : "ripsaw",
    "cluster_name" : "myk8scluster",
    "hostnetwork" : "True",
    "iteration" : 2,
    "remote_ip" : "10.0.133.30",
    "client_ips" : "10.0.173.62 10.130.0.1 ",
    "uperf_ts" : "2021-06-01T22:32:39.594000",
    "service_ip" : "False",
    "bytes" : 519864320,
    "norm_byte" : 258605056,
    "ops" : 1015360,
    "norm_ops" : 505088,
    "norm_ltcy" : 2.3778479412250935,
    "kind" : "pod",
    "client_node" : "ip-10-0-173-62.us-west-2.compute.internal",
    "server_node" : "unknown",
    "num_pairs" : "1",
    "multus_client" : "",
    "networkpolicy" : "",
    "density" : "1",
    "nodes_in_iter" : "1",
    "step_size" : "",
    "colocate" : "False",
    "density_range" : [ ],
    "node_range" : [ ],
    "pod_id" : "0",
    "test_type" : "stream",
    "protocol" : "udp",
    "message_size" : 512,
    "read_message_size" : 512,
    "num_threads" : 2,
    "duration" : 3,
    "run_id" : "NA"
  }
}
And here is a document exported from just running the command run_snafu --tool uperf --user ryan --uuid 1234 --proto tcp --remoteip localhost -w iperf.xml --resourcetype container -s 1 --verbose:
{
  "_index": "snafu-uperf-results",
  "_op_type": "create",
  "_source": {
    "test_type": "",
    "protocol": "tcp",
    "message_size": null,
    "read_message_size": null,
    "num_threads": 1,
    "duration": 31,
    "kind": "container",
    "hostnetwork": "False",
    "remote_ip": "localhost",
    "client_ips": "",
    "service_ip": "False",
    "client_node": "",
    "server_node": "",
    "num_pairs": "",
    "multus_client": "",
    "networkpolicy": "",
    "density": "",
    "nodes_in_iter": "",
    "step_size": "",
    "colocate": "",
    "density_range": "",
    "node_range": "",
    "pod_id": null,
    "uperf_ts": "2021-06-28T14:42:37.066000",
    "bytes": 48083435520,
    "norm_byte": 1505771520,
    "ops": 5869560,
    "norm_ops": 183810,
    "norm_ltcy": 6.534098878427453,
    "iteration": 1,
    "user": "ryan",
    "uuid": "1234",
    "workload": "uperf",
    "run_id": "NA"
  },
  "_id": "ae6d9dfc7083e94c569d1999c2eb2ae1dce4a77fc5c7052c128103783f9acc70",
  "run_id": "NA"
}
I think it would be useful to add a CLI option, called --no-empty-fields or something similar, that removes any field whose value is null or an empty string from exported documents. This way teams only get the fields and the data that they care about, rather than also getting the extra fields which we use as a team.
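To make the proposal concrete, here is a rough sketch of how such a filter could look. It is not taken from the benchmark-wrapper code base: the helper names purge_empty_fields, build_parser, and prepare_document are hypothetical, and the sketch assumes exported documents are plain dicts with a "_source" mapping, as in the examples above. Only None and empty strings are dropped; values such as 0, "False", or empty lists are kept, since the request only covers null and empty-string fields.

```python
import argparse
from typing import Any, Dict


def purge_empty_fields(source: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `source` without fields that are None or ""."""
    return {
        key: value
        for key, value in source.items()
        # Keep falsy-but-meaningful values such as 0, False, or [].
        if value is not None and value != ""
    }


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser carrying only the proposed flag."""
    parser = argparse.ArgumentParser(prog="run_snafu_sketch")
    parser.add_argument(
        "--no-empty-fields",
        action="store_true",
        help="drop fields that are null or an empty string before export",
    )
    return parser


def prepare_document(document: Dict[str, Any], no_empty_fields: bool) -> Dict[str, Any]:
    """Apply the purge to the document's "_source" when the flag is set."""
    if not no_empty_fields:
        return document
    cleaned = dict(document)  # shallow copy so the caller's document is untouched
    cleaned["_source"] = purge_empty_fields(document["_source"])
    return cleaned
```

With something like this in place, passing --no-empty-fields would strip fields such as "test_type": "" and "message_size": null from the second document above, while leaving populated fields like "protocol" and "num_threads" untouched. Whether empty lists (as in "density_range" : [ ] from the first document) should also be purged is left open here.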