f5-telemetry-streaming
Add virtualServers and clientSslProfiles labels to certain Telemetry Streaming metrics
Is your feature request related to a problem? Please describe.
- We want to show VirtualServers - clientSslProfiles relationship
- We want to show HTTP requests rate (2xx,3xx,4xx) by Client SSL Profile
Describe the solution you'd like
- Metrics:
- f5_currentNativeConnections
- f5_totNativeConns
Add label: virtualServers (currently has clientSslProfiles )
- Metrics:
- f5_numberReqs
- f5_2xxResp
- f5_3xxResp
- f5_4xxResp
- f5_5xxResp
Add labels: clientSslProfiles and virtualServers
Describe alternatives you've considered
Email from Matt Stovall: By using the telemetry streaming custom endpoint /mgmt/tm/ltm/virtual/profiles/stats you can get the equivalent metrics per virtual server. They are not using a Prometheus label, but the name of the virtual server is added to the metric name.
Example output from TS Pull consumer endpoint:
f5_vsProfileStats__Common_{{ virtual server name}}_stats__Common_{{ virtual server name}}_profiles_stats__Common_{{ virtual server name}}_profiles_Common_{{clientSSL Profile name}}_stats_common_activeHandshakeRejected
f5_vsProfileStats__Common_{{ virtual server name}}_stats__Common_{{ virtual server name}}_profiles_stats__Common_{{ virtual server name}}_profiles_Common_{{clientSSL Profile name}}_stats_common_curNativeConns
f5_vsProfileStats__Common_{{ virtual server name}}_stats__Common_{{ virtual server name}}_profiles_stats__Common_{{ virtual server name}}_profiles_Common_{{clientSSL Profile name}}_stats_common_currentActiveHandshakes
Example for virtual server name https_multi_cert and clientSSL profile default_sni:
f5_vsProfileStats__Common_https_multi_cert_stats__Common_https_multi_cert_profiles_stats__Common_https_multi_cert_profiles_Common_default_sni_stats_common_activeHandshakeRejected 0
To get these to show up in TS output, you just need to define another custom endpoint in your telemetry streaming declaration. You already have a few custom endpoints defined:
"Custom_Endpoints": {
    "class": "Telemetry_Endpoints",
    "items": {
        "vsProfileStats": {
            "name": "vsProfileStats",
            "path": "/mgmt/tm/ltm/virtual/profiles/stats"
        }
    }
}
If you wanted to show the VS names as a label instead of in the metric name, that would take a new Telemetry Streaming GitHub request. We can submit GitHub requests here: https://github.com/F5Networks/f5-telemetry-streaming/issues
Hello,
I think this is not a feature request; it is a bug. If you use the default declaration, the formatting of the metrics is correct. If you collect the same metrics using a custom endpoint, the formatting is garbage.
Here is the garbage metrics format of a custom endpoint:
This is the same value (bitsIn/bitsOut) from the default declaration. If you do not configure anything and just enable the OpenTelemetry API for Prometheus pull, it looks like this:
And both use this source:
v1.33.0 and v1.34.0 of the OpenTelemetryPlugin
Hi @barakbd and @Nachtfalkeaw,
I'm the F5 solutions engineer working with Barak on their telemetry streaming initiatives.
It appears there are two different requests in this github issue.
1.) First Request
When you define a custom endpoint that provides statistics per virtual server, such as /mgmt/tm/ltm/virtual/profiles/stats, add the virtual server name to those metrics as a label. This seems reasonable to me: all the data for this is located in the control plane already.
For example, that custom endpoint can give insight into bits per virtual server instead of only global bits in/out, which is very useful for identifying which virtual servers have higher throughput. It produces a metric like this:
# HELP f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut vsProfileStats_/Common/asm-demo-http/stats_clientside.bitsOut
# TYPE f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut gauge
f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut 1028544
The metric output is formatted in a way that is difficult to parse. The virtual server name /Common/asm-demo-http is there, but it's difficult to extract it and then graph this metric as the bitsOut for virtual server /Common/asm-demo-http. Ideally the name could be improved and a Prometheus-friendly label could be added so that the metric looks like this instead:
# HELP f5_vsProfileStats__clientside_bitsOut vsProfileStats_/Common/asm-demo-http/stats_clientside.bitsOut
# TYPE f5_vsProfileStats__clientside_bitsOut gauge
f5_vsProfileStats__clientside_bitsOut{virtualServers="/Common/asm-demo-http"} 1028544
That way the metric could be natively graphed in Prometheus/Grafana as associated with virtual server /Common/asm-demo-http. You could then graph bits per second by virtual server instead of only having global bitsOut and not knowing which virtual servers are contributing to it.
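Until TS emits such a label itself, the rewrite described above could in principle be done outside TS, between the pull consumer and Prometheus. The following is only a minimal Python sketch and not part of Telemetry Streaming. It assumes the HELP text always has the shape `<endpoint name>_<virtual server path>/stats_<stat>`, as in the sample above; the function name `relabel`, the constant `SAMPLE`, and the regular expression are illustrative choices, not anything TS provides.

```python
import re

# Assumption: HELP text looks like
#   <endpointName>_<virtual server path>/stats_<stat name>
# e.g. vsProfileStats_/Common/asm-demo-http/stats_clientside.bitsOut
HELP_RE = re.compile(r"^(?P<endpoint>[^_/]+)_(?P<vs>/.+?)/stats_(?P<stat>.+)$")

SAMPLE = """\
# HELP f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut vsProfileStats_/Common/asm-demo-http/stats_clientside.bitsOut
# TYPE f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut gauge
f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut 1028544
"""


def relabel(exposition: str) -> list[str]:
    """Rewrite flattened per-virtual-server samples into labeled samples.

    Only sample lines are returned; HELP/TYPE lines are dropped for brevity.
    """
    known = {}  # flattened metric name -> (new metric name, virtual server)
    out = []
    for raw in exposition.splitlines():
        line = raw.strip()
        if line.startswith("# HELP "):
            parts = line.split(" ", 3)  # "#", "HELP", name, help text
            if len(parts) == 4:
                match = HELP_RE.match(parts[3])
                if match:
                    # Prometheus metric names cannot contain "." and similar.
                    stat = re.sub(r"[^A-Za-z0-9_]", "_", match["stat"])
                    new_name = f"f5_{match['endpoint']}__{stat}"
                    known[parts[2]] = (new_name, match["vs"])
        elif line and not line.startswith("#"):
            name, _, value = line.partition(" ")
            if name in known:
                new_name, vs = known[name]
                out.append(f'{new_name}{{virtualServers="{vs}"}} {value}')
            else:
                out.append(line)  # leave samples we cannot interpret untouched
    return out


for sample_line in relabel(SAMPLE):
    print(sample_line)
```

The sketch reads the virtual server from the HELP text rather than from the metric name because the flattened name has already replaced `/` and `-` with `_`, so the original path cannot be recovered from it reliably.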
2.) Second Request
Add labels for clientSslProfiles and virtualServers names to various metrics produced by clientSSL and HTTP profiles. To my knowledge, there is no data in TMOS that maps these things together that telemetry streaming could query.
I suggest we focus on the first request as that seems within the scope of TS and immediately useful.
Thanks!
The metric names should be the same as in the default configuration when I query the same metrics. The reason is simple: if I query all metrics every 5 seconds, the CPUs are overloaded. However, for a very limited number of values, such as CPU and memory, a polling interval of 5s is useful.
For other values, like overall throughput, it is sufficient to poll every 15s, and for other things every 60s.
Software versions and hardware versions are relevant only about every 6 hours.
So the idea of different Pull_Consumers is very good. However, to use them, the "Custom_Endpoints" must generate the same metric output as the default poll, so that the metrics from different intervals can be matched correctly. And not only matched correctly: they should be the same metric. If every poller generates different metrics, the result is duplicate metrics in Prometheus: the CPU metrics from the default poller and the CPU metrics from the custom endpoint.
However, if it is not possible to generate the same metric names, then the different metrics should share the same label sets, so that it is possible to merge different metrics based on the same labels, and hopefully the labels uniquely identify that they are the same.
I can confirm this problem is also affecting me! Once I enable custom endpoints to filter out the results of the scrape (so I can avoid the CPU overload), the metrics get reported in a different pattern:
# HELP f5_detailedCPU_sys_host_info_0_sys_hostInfo_0_cpuInfo_sys_hostInfo_0_cpuInfo_1_oneMinAvgUser detailedCPU_sys/host-info/0_sys/hostInfo/0/cpuInfo_sys/hostInfo/0/cpuInfo/1_oneMinAvgUser
# TYPE f5_detailedCPU_sys_host_info_0_sys_hostInfo_0_cpuInfo_sys_hostInfo_0_cpuInfo_1_oneMinAvgUser gauge
f5_detailedCPU_sys_host_info_0_sys_hostInfo_0_cpuInfo_sys_hostInfo_0_cpuInfo_1_oneMinAvgUser 14
This also makes it pretty hard to find which endpoints have the metrics I need.
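One way to get an overview in the meantime is to index a scrape by custom endpoint. The following is a rough Python sketch, not a TS feature: it assumes the HELP text starts with the custom endpoint name followed by `_` and ends with `_<stat name>`, as in the two samples in this thread, and that endpoint names contain no underscores. The names `index_stats_by_endpoint` and `SCRAPE` are made up for the example.

```python
from collections import defaultdict

SCRAPE = """\
# HELP f5_detailedCPU_sys_host_info_0_sys_hostInfo_0_cpuInfo_sys_hostInfo_0_cpuInfo_1_oneMinAvgUser detailedCPU_sys/host-info/0_sys/hostInfo/0/cpuInfo_sys/hostInfo/0/cpuInfo/1_oneMinAvgUser
# TYPE f5_detailedCPU_sys_host_info_0_sys_hostInfo_0_cpuInfo_sys_hostInfo_0_cpuInfo_1_oneMinAvgUser gauge
f5_detailedCPU_sys_host_info_0_sys_hostInfo_0_cpuInfo_sys_hostInfo_0_cpuInfo_1_oneMinAvgUser 14
# HELP f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut vsProfileStats_/Common/asm-demo-http/stats_clientside.bitsOut
f5_vsProfileStats__Common_asm_demo_http_stats_clientside_bitsOut 1028544
"""


def index_stats_by_endpoint(exposition: str) -> dict[str, set[str]]:
    """Map each custom endpoint name to the stat names seen in its HELP lines."""
    index = defaultdict(set)
    for raw in exposition.splitlines():
        parts = raw.strip().split(" ", 3)  # "#", "HELP", name, help text
        if len(parts) != 4 or parts[:2] != ["#", "HELP"]:
            continue
        # Assumption: endpoint name is everything before the first "_",
        # stat name is everything after the last "_".
        endpoint, _, rest = parts[3].partition("_")
        stat = rest.rpartition("_")[2]
        if endpoint and stat:
            index[endpoint].add(stat)
    return dict(index)


for endpoint, stats in sorted(index_stats_by_endpoint(SCRAPE).items()):
    print(endpoint, sorted(stats))
```

This only tells you which endpoint exposes which stat. It does not fix the naming itself, so label matching in Prometheus still needs the change requested in this issue.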
@megamattzilla This has become a blocker for using TS to observe the BigIPs using Prometheus as the metrics engine, mainly because, on the one hand, we can't enable the collection of all metrics without seeing a significant impact on CPU usage. On the other hand, we can't use the custom endpoint approach as the current output doesn't allow for proper label matching, filtering, etc.
If we can't find a solution, we will be forced to use the snmp_exporter. I would gladly avoid that if possible, as it requires more configuration complexity.
Do you have any status updates that can be shared?
Hi @B0go,
Please contact your F5 account team so they can contact us (the product management team).
It would be really helpful to simply allow TS metrics to have customized labels.