reactor-netty

reactor_netty_http_server_connections_active contains random local_address labels, and sometimes active > total

Open jtorkkel opened this issue 3 years ago • 3 comments

Metric reactor_netty_http_server_connections_active:

In Docker, the active connection gauge is exposed twice, with different local_address values but the same gauge value:

  • local_address="127.0.0.1:8080"
  • local_address="10.0.4.238:8080" (this is the same as the scrape instance address)

The date tag is missing.

In Kubernetes, multiple local_address values with the same gauge value are sometimes exposed. local_address can be:

  • local_address="192.168.215.19:8080" (this is the same as the scrape instance address)
  • localhost:8081
  • randomServiceIngressName:8080 (there might be several)
  • serviceOwnIngressName:80
  • serviceOwnIngressName:443:80

The date tag exists and is the same for all.

Metric reactor_netty_http_server_connections_total contains only one local_address tag, which is the same as the scrape instance address.

Sometimes active > total, which indicates that the values cannot be trusted.

Expected Behavior

local_address should either make sense or not exist, and active <= total should always hold.

Actual Behavior

local_address values look almost random, with a risk of high cardinality.

Steps to Reproduce

Possible Solution

Remove the date and local_address labels.

Your Environment

  • Reactor version(s) used: 1.0.21 / springBoot 2.6.9
  • Other relevant libraries versions (eg. netty, ...):
  • JVM version (java -version): Java 11
  • OS and version (eg. uname -a): Linux

jtorkkel, Sep 14 '22 14:09

@jtorkkel What is this date tag you mention ("tag date is missing")? We never add such a tag. See https://github.com/reactor/reactor-netty/blob/472c2a1d4283f9dda3a02feca2dd2d349a3b5eba/reactor-netty-http/src/main/java/reactor/netty/http/server/MicrometerHttpServerMetricsRecorder.java#L228-L230

Regarding "Possible Solution: remove date and local_address label":

If we remove the local address, this metric becomes meaningless, because you will never know which server it refers to.

Please provide some reproducible example that we can use in order to further investigate the issue.

As a workaround, while we are investigating whether this metric is problematic in your use case, you can disable it via the Micrometer API.
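A minimal sketch of that workaround, not taken from the thread: it assumes Reactor Netty registers its meters in Micrometer's global registry and that the gauge's Micrometer name is the dotted form of the Prometheus name. The class name DisableActiveConnectionsGauge is hypothetical.

```java
import io.micrometer.core.instrument.Metrics;
import io.micrometer.core.instrument.config.MeterFilter;

// Hypothetical helper class; only the Micrometer calls are real API.
final class DisableActiveConnectionsGauge {

    // Assumed Micrometer name of the gauge; Prometheus renders the dots as
    // underscores: reactor_netty_http_server_connections_active.
    static final String ACTIVE_CONNECTIONS = "reactor.netty.http.server.connections.active";

    // Installs a deny filter so that meters whose name starts with the
    // prefix above are never registered.
    static void apply() {
        Metrics.globalRegistry.config()
               .meterFilter(MeterFilter.denyNameStartsWith(ACTIVE_CONNECTIONS));
    }
}
```

Meter filters only affect meters registered after the filter is added, so apply() would need to run before the HTTP server starts.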

violetagg, Sep 14 '22 14:09

Hi,

Actually, the date label comes from __meta_kubernetes_pod_label_date. We use custom labels and copy service-discovery (SD) labels, and this one seems to come from k8s. But that is not the issue.

The issue is that active contains extra local_address labels with seemingly random hostnames, while total does not have these tag values:

reactor_netty_http_server_connections_total{app="info-v1", host="ap-216t.xxx.net",local_address="192.168.245.203:8080",uri="/"} 7.0

reactor_netty_http_server_connections_active{app="info-v1", host="ap-216t.xxx.net",local_address="service_x-v3.test01.qaadr.local:80",uri="/"} 5.0
reactor_netty_http_server_connections_active{app="info-v1", host="ap-216t.xxx.net",local_address="192.168.245.203:8080",uri="/"} 5.0
reactor_netty_http_server_connections_active{app="info-v1", host="ap-216t.xxx.net",local_address="service_y.test01.qaadr.local:8080",uri="/"} 5.0
reactor_netty_http_server_connections_active{app="info-v1", host="ap-216t.xxx.net",local_address="info-v1.test01.qaadr.local:443:80",uri="/"} 5.0

The active count is sometimes greater than the total count:

reactor_netty_http_server_connections_total{app="userinfo-service-v1",host="ap-espd219t.oneadr.net",local_address="192.168.201.21:8080",springBoot="2",uri="/",ver="1.3.0",} 5.0

reactor_netty_http_server_connections_active{app="userinfo-service-v1",host="ap-219t.xxx.net",local_address="192.168.201.21:8080",uri="/"} 10.0
reactor_netty_http_server_connections_active{app="userinfo-service-v1",host="ap-219t.xxx.net",local_address="userinfo-v1.test01.qaadr.local:80",uri="/"} 10.0
reactor_netty_http_server_connections_active{app="userinfo-service-v1",host="ap-219t.xxx.net",local_address="userinfo-v1.test01.qaadr.local:443:80",uri="/"} 10.0
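Both symptoms can be checked mechanically against a scrape. The following is only a sketch, not part of Reactor Netty or Micrometer: the class ConnectionGaugeCheck and its methods are hypothetical, and it assumes Prometheus text output with one sample per line from a single application, so samples can be keyed by local_address alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting scraped output; uses only the JDK.
final class ConnectionGaugeCheck {
    static final String ACTIVE = "reactor_netty_http_server_connections_active";
    static final String TOTAL = "reactor_netty_http_server_connections_total";

    // One sample per line: metric name, label block in braces, numeric value.
    private static final Pattern SAMPLE = Pattern.compile(
            "^(\\w+)\\{(.*)\\}\\s+(-?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)$");
    private static final Pattern LOCAL_ADDRESS =
            Pattern.compile("local_address=\"([^\"]*)\"");

    // Maps local_address to gauge value for every sample of the given metric.
    static Map<String, Double> byLocalAddress(String scrape, String metric) {
        Map<String, Double> values = new LinkedHashMap<>();
        for (String line : scrape.split("\\R")) {
            Matcher sample = SAMPLE.matcher(line.trim());
            if (!sample.matches() || !sample.group(1).equals(metric)) {
                continue; // comment, blank line, or a different metric
            }
            Matcher address = LOCAL_ADDRESS.matcher(sample.group(2));
            if (address.find()) {
                values.put(address.group(1), Double.parseDouble(sample.group(3)));
            }
        }
        return values;
    }

    // local_address values reported on active but absent from total.
    static List<String> extraActiveAddresses(String scrape) {
        Map<String, Double> total = byLocalAddress(scrape, TOTAL);
        List<String> extra = new ArrayList<>();
        for (String address : byLocalAddress(scrape, ACTIVE).keySet()) {
            if (!total.containsKey(address)) {
                extra.add(address);
            }
        }
        return extra;
    }

    // local_address values present on both metrics where active > total.
    static List<String> activeExceedsTotal(String scrape) {
        Map<String, Double> total = byLocalAddress(scrape, TOTAL);
        List<String> exceeding = new ArrayList<>();
        for (Map.Entry<String, Double> active : byLocalAddress(scrape, ACTIVE).entrySet()) {
            Double totalValue = total.get(active.getKey());
            if (totalValue != null && active.getValue() > totalValue) {
                exceeding.add(active.getKey());
            }
        }
        return exceeding;
    }
}
```

Feeding it the userinfo-service-v1 samples above would flag the two hostname-style addresses as extra and 192.168.201.21:8080 as a case where active exceeds total.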

jtorkkel, Sep 15 '22 06:09

@jtorkkel Please provide some reproducible example that we can use in order to further investigate the issue.

violetagg, Sep 15 '22 13:09

@jtorkkel I'm closing this; we can reopen it when you provide the requested information.

violetagg, Sep 27 '22 08:09