Missing source, destination context with hubble_dns* metrics

Open bartwitkowski opened this issue 3 months ago • 1 comments

Describe the bug

Both metrics, hubble_dns_queries_total and hubble_dns_responses_total, have empty destination and source labels.

To Reproduce

Install retina-hubble with:

VERSION=$( curl -sL https://api.github.com/repos/microsoft/retina/releases/latest | jq -r .name)
helm upgrade --install retina oci://ghcr.io/microsoft/retina/charts/retina-hubble \
		--version $VERSION \
		--namespace kube-system \
		--set cluster.name="aksclustername" \
		--set os.windows=false \
		--set operator.enabled=true \
		--set operator.enableRetinaEndpoint=true \
		--set operator.repository=ghcr.io/microsoft/retina/retina-operator \
		--set operator.tag=$VERSION \
		--set agent.enabled=true \
		--set agent.repository=ghcr.io/microsoft/retina/retina-agent \
		--set agent.tag=$VERSION \
		--set agent.init.enabled=true \
		--set agent.init.repository=ghcr.io/microsoft/retina/retina-init \
		--set agent.init.tag=$VERSION \
		--set logLevel=info \
		--set hubble.relay.tls.server.enabled=false \
		--set hubble.relay.prometheus.enabled=true \
		--set hubble.relay.prometheus.serviceMonitor.enabled=true \
		--set hubble.tls.enabled=false \
		--set hubble.tls.auto.enabled=false \
		--set hubble.tls.auto.method=cronJob \
		--set hubble.tls.auto.certValidityDuration=1 \
		--set hubble.tls.auto.schedule="*/10 * * * *" \
		--set hubble.metrics.serviceMonitor.enabled=true \
		--set prometheus.serviceMonitor.namespace=kube-system \
		--set enabledPlugin_linux="\[dropreason\,packetforward\,linuxutil\,dns\,packetparser\]" \
		--set enablePodLevel=true \
		--set remoteContext=true \
		--set enableAnnotations=true

To test, run curl http://retina_pod_ip:9965/metrics and check the source and destination labels on both DNS metrics.
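
This check can be scripted. Below is a minimal sketch, assuming the endpoint returns the standard Prometheus text format and that an affected sample has its source or destination label either empty or absent. The function name dns_samples_missing_context is hypothetical (it is not part of Retina), and retina_pod_ip is the same placeholder as in the curl command.

```shell
# Hypothetical helper (not part of Retina): reads Prometheus text-format
# metrics on stdin and prints every hubble_dns_queries_total or
# hubble_dns_responses_total sample whose source or destination label
# is empty or absent.
dns_samples_missing_context() {
  awk '
    /^hubble_dns_(queries|responses)_total[{ ]/ {
      has_src = ($0 ~ /[{,]source="[^"]+"/)
      has_dst = ($0 ~ /[{,]destination="[^"]+"/)
      if (!has_src || !has_dst) print
    }
  '
}

# Usage (replace retina_pod_ip with the address of a Retina agent pod):
#   curl -s http://retina_pod_ip:9965/metrics | dns_samples_missing_context
```

If the function prints nothing, every DNS sample carries both labels. Any printed line is a sample that shows the problem described in this issue.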

Relevant section of retina-config:

  hubble-metrics: flow:sourceEgressContext=pod;destinationIngressContext=pod tcp:sourceEgressContext=pod;destinationIngressContext=pod
    dns:query;sourceEgressContext=pod;destinationIngressContext=pod drop:sourceEgressContext=pod;destinationIngressContext=pod

With the above configuration, the destination or source labels are filled in for the hubble_tcp_flags_total and hubble_drop_total metrics. In an AKS cluster with Azure CNI powered by Cilium, both DNS metric labels are filled with pod names.

Or maybe it is not a bug, but I'm missing something?

Expected behavior

hubble_dns_responses_total should have the destination label filled with namespace/pod-name. hubble_dns_queries_total should have the source label filled with namespace/pod-name.

Platform (please complete the following information):

  • OS: Ubuntu 22.04.5 LTS
  • Kubernetes Version: v1.31.6, v1.32.6
  • Host: AKS, Azure CNI with Calico policies (NOT Cilium), direct routing (not overlay)
  • Retina Version: v0.0.36

bartwitkowski, Sep 08 '25 15:09

Same in Azure CNI Overlay Powered by Cilium (with ACNS enabled) https://learn.microsoft.com/en-us/azure/aks/container-network-observability-metrics?tabs=Cilium#pod-level-metrics-hubble-metrics

| Metric | Label | Exists |
| --- | --- | --- |
| hubble_dns_queries_total | cluster | |
| hubble_dns_queries_total | instance | |
| hubble_dns_queries_total | ips_returned | |
| hubble_dns_queries_total | job | |
| hubble_dns_queries_total | microsoft.resourceid | |
| hubble_dns_queries_total | qtypes | |
| hubble_dns_queries_total | source | |
| hubble_dns_queries_total | destination | ❌ Missing, should contain CoreDNS <namespace>/<pod> |
| hubble_dns_responses_total | cluster | |
| hubble_dns_responses_total | instance | |
| hubble_dns_responses_total | ips_returned | |
| hubble_dns_responses_total | job | |
| hubble_dns_responses_total | microsoft.resourceid | |
| hubble_dns_responses_total | qtypes | |
| hubble_dns_responses_total | source | |
| hubble_dns_responses_total | destination | ❌ Missing, should contain the client <namespace>/<pod> |

illrill, Nov 14 '25 10:11