Extend Beyla k8s-cache with DNS caching and resolution sharing

Open · esara opened this issue 4 months ago · 1 comment

Problem

Beyla’s k8s-cache currently provides a centralized cache of Kubernetes API responses, such as Pods, Services, and Endpoints. This ensures that Beyla agents across different nodes do not each make redundant API calls, but instead share results from a single query. By doing so, it reduces the load on the Kubernetes API server while still allowing agents to access up-to-date metadata.

A natural extension of this idea is to incorporate DNS caching. In many cases, agents need to resolve external service IPs, especially when clients connect using raw IP addresses but observability requires human-readable DNS names. Without coordination, each Beyla agent may perform its own reverse DNS lookups, potentially overwhelming cluster or external DNS servers with redundant queries.

Opportunity

Introducing DNS-awareness into k8s-cache could help solve this problem. By capturing DNS lookups—either from system resolver calls or by observing DNS traffic at the node level—the cache could store and share results across all Beyla agents in the cluster. This way, a single resolution would serve multiple agents, ensuring consistent mappings between IPs and hostnames and avoiding unnecessary DNS traffic.

The benefits of such an approach are significant. It improves efficiency by avoiding repeated resolutions for the same IP, reduces the likelihood of DNS query storms under heavy workloads, and ensures consistency across agents so that they report the same DNS names for identical IPs. In high-scale environments with ephemeral workloads, this also reduces pressure on DNS servers and improves overall scalability.
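To make the "query storm" point concrete, here is a minimal sketch of request coalescing: when several callers ask for the same IP at the same time, only one upstream resolution runs and the others wait for its result. Everything in it (`Coalescer`, `Pending`, `runDemo`, the stub resolver and the name `api.example.internal`) is hypothetical and only illustrates the idea; it is not Beyla's or k8s-cache's actual API.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight resolution that several callers can share.
type call struct {
	done    chan struct{} // closed once name/err are set
	callers int           // callers attached to this resolution
	name    string
	err     error
}

// Coalescer (hypothetical) merges concurrent lookups for the same IP.
type Coalescer struct {
	mu       sync.Mutex
	inflight map[string]*call
	resolve  func(ip string) (string, error)
}

func NewCoalescer(resolve func(ip string) (string, error)) *Coalescer {
	return &Coalescer{inflight: make(map[string]*call), resolve: resolve}
}

// Lookup resolves ip, joining an in-flight resolution if one exists.
func (c *Coalescer) Lookup(ip string) (string, error) {
	c.mu.Lock()
	if cl, ok := c.inflight[ip]; ok {
		cl.callers++
		c.mu.Unlock()
		<-cl.done // wait for the leader's result
		return cl.name, cl.err
	}
	cl := &call{done: make(chan struct{}), callers: 1}
	c.inflight[ip] = cl
	c.mu.Unlock()

	// This caller is the leader: perform the single upstream resolution.
	cl.name, cl.err = c.resolve(ip)

	c.mu.Lock()
	delete(c.inflight, ip)
	c.mu.Unlock()
	close(cl.done)
	return cl.name, cl.err
}

// Pending reports how many callers are attached to the in-flight lookup for ip.
func (c *Coalescer) Pending(ip string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.inflight[ip]; ok {
		return cl.callers
	}
	return 0
}

// runDemo issues n concurrent lookups for one IP against a stub resolver
// that blocks until all n callers are attached, then returns how many
// times the resolver was actually invoked.
func runDemo(n int) int {
	const ip = "203.0.113.7" // documentation-range address
	release := make(chan struct{})
	var calls int32
	c := NewCoalescer(func(string) (string, error) {
		atomic.AddInt32(&calls, 1)
		<-release
		return "api.example.internal", nil
	})

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Lookup(ip)
		}()
	}
	for c.Pending(ip) < n { // wait until every caller has joined
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	return int(atomic.LoadInt32(&calls))
}

func main() {
	fmt.Println("resolver calls for 5 concurrent lookups:", runDemo(5))
}
```

In a real deployment the coalescing would happen in the shared cache service rather than inside one process, so that lookups arriving from different agents collapse into a single upstream query.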

Proposal

One possibility is to observe DNS traffic with eBPF (see the OBI feature request) and capture queries and responses at the node level. Another is to build a shared cache layer, perhaps as an extension of k8s-cache, that stores IP-to-hostname mappings and respects DNS record TTLs. A fallback mechanism could also be added: if a cached result is not available, a designated resolver agent performs the lookup and shares the result with the others.
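As a rough illustration of the cache layer and fallback described above, the sketch below keeps IP-to-hostname mappings with a per-entry expiry derived from the record TTL, and calls a fallback resolver on a miss or after expiry. All names here (`NameCache`, `Resolver`, `Put`, `Lookup`, `fakeClock`, `stubResolver`) are assumptions made for the example, not existing k8s-cache types, and the stub resolver stands in for whatever the designated resolver agent would do.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is one cached IP-to-hostname mapping with its expiry time.
type entry struct {
	hostname string
	expires  time.Time
}

// Resolver (hypothetical) is the fallback used on a miss; it returns the
// hostname and the TTL of the DNS record it came from.
type Resolver func(ip string) (hostname string, ttl time.Duration, err error)

// NameCache (hypothetical) stores mappings and honors their TTLs.
type NameCache struct {
	mu      sync.Mutex
	entries map[string]entry
	resolve Resolver
	now     func() time.Time // injectable clock, so expiry can be tested
}

func NewNameCache(resolve Resolver, now func() time.Time) *NameCache {
	return &NameCache{entries: make(map[string]entry), resolve: resolve, now: now}
}

// Put records a mapping learned elsewhere, e.g. from observed DNS responses.
func (c *NameCache) Put(ip, hostname string, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[ip] = entry{hostname: hostname, expires: c.now().Add(ttl)}
}

// Lookup returns the cached hostname if it has not expired; otherwise it
// asks the fallback resolver and caches the answer for the returned TTL.
// Concurrent misses for the same IP may each call the resolver in this sketch.
func (c *NameCache) Lookup(ip string) (string, error) {
	c.mu.Lock()
	e, ok := c.entries[ip]
	fresh := ok && c.now().Before(e.expires)
	c.mu.Unlock()
	if fresh {
		return e.hostname, nil
	}
	name, ttl, err := c.resolve(ip)
	if err != nil {
		return "", err
	}
	c.Put(ip, name, ttl)
	return name, nil
}

// fakeClock is a manually advanced clock for the demo.
type fakeClock struct{ t time.Time }

func newFakeClock() *fakeClock           { return &fakeClock{t: time.Unix(0, 0)} }
func (f *fakeClock) Now() time.Time      { return f.t }
func (f *fakeClock) Advance(seconds int) { f.t = f.t.Add(time.Duration(seconds) * time.Second) }

// stubResolver stands in for a real reverse lookup and counts its calls.
type stubResolver struct {
	Calls      int
	ttlSeconds int
}

func (s *stubResolver) Resolve(ip string) (string, time.Duration, error) {
	s.Calls++
	return "host-" + ip + ".example.internal", time.Duration(s.ttlSeconds) * time.Second, nil
}

func main() {
	clock := newFakeClock()
	stub := &stubResolver{ttlSeconds: 30}
	cache := NewNameCache(stub.Resolve, clock.Now)

	cache.Lookup("203.0.113.7") // miss: falls back to the resolver
	cache.Lookup("203.0.113.7") // hit: served from the cache
	clock.Advance(31)           // move past the 30 s TTL
	cache.Lookup("203.0.113.7") // expired: resolver is called again
	fmt.Println("resolver calls:", stub.Calls)
}
```

Keeping the expiry per entry, rather than using one global timeout, is what lets the cache respect each record's own TTL; `Put` is the hook through which eBPF-observed answers could populate the cache without any active lookup.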

esara · Sep 15 '25 02:09

This is a very good idea. We are planning to start working on expanding the k8s cache into a generic names cache.

grcevski · Sep 17 '25 13:09