Telepresence Fails to Proxy Local Outbound Traffic to In-Cluster Kafka Broker behind Headless Service
Describe the bug
DNS successfully resolves the pod IP of the single broker running in my Kafka cluster (using the kafka-ephemeral-single Strimzi example with useServiceDnsDomain: true set as described in #1722). I verified this with the dns module in a simple test process based on the example in KafkaJS's README. However, attempts to connect to that broker time out. When I create a headful (ClusterIP) service in front of that pod and set advertisedHost to that service's FQDN, it works as expected. I'm running a default minikube config on a darwin MacBook Pro. AFAICT the only input that changes between the failing and passing setups is headless vs. headful service. IP-wise, services and pods are on different subnets, which is another difference between the headless and headful cases in this setup.
tl;dr: Telepresence + Strimzi Kafka Operator + Basic/Default Kafka manifest + darwin MBP + default minikube cluster = does not work without workarounds in addition to useServiceDnsDomain: true
To Reproduce
Steps to reproduce the behavior:
Prepare environment
- use a MBP with M1 Pro chip (darwin)
- install Docker Desktop
- create a minikube cluster with default config
- brew install helm
- brew install datawire/blackbird/telepresence
- helm repo add strimzi https://strimzi.io/charts/
- helm install --create-namespace --namespace strimzi --set "watchNamespaces={default}" strimzi strimzi/strimzi-kafka-operator
- wget https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/0.31.1/examples/kafka/kafka-ephemeral-single.yaml and modify as follows
  - set first listener configuration to {"useServiceDnsDomain": true} (this works around original issue described in #1722)
- plug in the bootstrap service fqdn as the sole item in the brokers array in the example here: https://github.com/tulios/kafkajs
- run locally with telepresence intercepting an unrelated, isolated service in the same namespace
- optional: import and use the dns module to observe that DNS is resolving correctly but the IP is unreachable
Expected behavior
Running the trivial test program from KafkaJS's README locally, with the bootstrap fqdns configured correctly, resolves DNS and routes to the broker, logging messages until the process is killed.
Versions (please complete the following information):
- telepresence v2.7.6
- macOS 12.6 on MBP (14-inch 2021) with M1 Pro chip
- minikube v1.27.1; k8s client version: v1.25.3; kustomize version: v4.5.7; k8s server version: v1.24.3
VPN-related bugs: n/a
Additional context
- creating a headful service in front of the single broker pod (and adjusting the Kafka resource definition to advertise that host) enables correct functioning
- pod IPs and headful service cluster IPs are on different subnets
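To make the subnet observation easier to check, here is a small sketch that tests whether a resolved IPv4 address falls inside a given CIDR range. The helper names ipv4ToInt and inCidr are hypothetical, and the CIDR values in the usage lines are placeholders; substitute the pod and service ranges your own cluster reports.

```javascript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Return true when `ip` falls inside `cidr` (for example '10.96.0.0/12').
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  // A /0 prefix matches everything; otherwise keep the top `bits` bits.
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Placeholder values: replace with the IP returned by the dns module
// and the pod/service CIDRs of the cluster under test.
console.log(inCidr('10.96.12.34', '10.96.0.0/12')); // true
console.log(inCidr('172.17.0.5', '10.96.0.0/12'));  // false
```

Comparing the resolved broker IP against both ranges shows whether the address the client is given sits in the pod range or the service range.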
workaround
- merge into Kafka resource first listener configuration: { "brokers": [{"broker": 0, "advertisedHost": "my-cluster-kafka-broker-proxy.default.svc.cluster.local"}]}
- apply proxy resource my-cluster-kafka-broker-proxy seen below
- plug in the bootstrap service fqdn as the sole item in the brokers array in the example here: https://github.com/tulios/kafkajs
- run locally with telepresence intercepting an unrelated, isolated service in the same namespace
apiVersion: v1
kind: Service
metadata:
  name: my-cluster-kafka-broker-proxy
spec:
  ports:
    - port: 9092
      protocol: TCP
      targetPort: 9092
  selector:
    app.kubernetes.io/instance: my-cluster
    app.kubernetes.io/part-of: strimzi-my-cluster
    strimzi.io/name: my-cluster-kafka
    strimzi.io/pod-name: my-cluster-kafka-0
@cindymullins-dw can you explain why this is categorized as a feature request rather than a bug report, as it was intended? Comments from maintainers as well as the documentation suggest this should already work; if it's a known limitation, I'd suggest treating this gap as a documentation bug.
maintainers - please address this as a bug, not a feature. all documentation points to this use case being supported.