hubble-ui
Hubble UI doesn't work, fresh Cilium 1.12.1 install, "Data stream has failed on the UI backend: EOF"
I reinstalled Cilium in my bare-metal cluster at home today. I installed 1.12.1, ran cilium hubble enable --ui, and everything went well. I open http://localhost:12000 in my browser and I see this:
The page stays like this indefinitely, accumulating more and more GetEvents calls:
In the browser console I see the following:
Uncertain how to proceed with debugging. Any help would be appreciated.
I just attempted upgrading to 1.13.0-rc0 and I experience the same problem.
Hi, I want to work on this issue. Please assign it to me @samwho @gandro @rolinh
Having the same issue running on minikube. No flows are registering in the UI or via the hubble CLI utility
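For anyone checking from the CLI, a minimal sketch of how to bypass the UI entirely (assuming cilium-cli and the hubble CLI are installed locally and relay runs in kube-system):

# forward hubble-relay to localhost:4245 and keep it running in the background
cilium hubble port-forward &

# ask relay for its status (connected nodes, flows seen)
hubble status

# stream flows directly from relay, bypassing hubble-ui
hubble observe --follow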
We're not yet sure what the root cause is. If you know it, please feel free to share or fix. Otherwise I think we need more info.
What response headers do you see in the browser network tab, @samwho?
The EOF error comes from hubble-relay. We need the hubble-ui pod's backend container logs and the hubble-relay pod logs.
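For reference, a sketch of how to grab those, assuming the default kube-system namespace and the stock deployment names (the hubble-ui pod's container is called backend):

# hubble-relay logs
kubectl -n kube-system logs deployment/hubble-relay

# backend container of the hubble-ui pod
kubectl -n kube-system logs deployment/hubble-ui -c backend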
Can confirm the same issue on 1.12.2; K8s 1.25.2, ARM/Pi 4 architecture.
relay logs:
level=info msg="Starting gRPC server..." options="{peerTarget:hubble-peer.kube-system.svc.cluster.local:443 dialTimeout:5000000000 retryTimeout:30000000000 listenAddress::4245 metricsListenAddress: log:0x400037c2a0 serverTLSConfig:<nil> insecureServer:true clientTLSConfig:0x40000acbe8 clusterName:default insecureClient:false observerOptions:[0xbfc7d0 0xbfc8d0] grpcMetrics:<nil> grpcUnaryInterceptors:[] grpcStreamInterceptors:[]}" subsys=hubble-relay
level=warning msg="Failed to create peer client for peers synchronization; will try again after the timeout has expired" error="context deadline exceeded" subsys=hubble-relay target="hubble-peer.kube-system.svc.cluster.local:443"
(the same warning repeats continuously)
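The repeated warning means relay cannot resolve or reach the hubble-peer service. Two quick checks, as a sketch assuming the default kube-system namespace and the default Hubble peer port 4244 on the nodes:

# the peer service should exist and have endpoints (the cilium agents)
kubectl -n kube-system get svc hubble-peer
kubectl -n kube-system get endpoints hubble-peer

# confirm the agents actually run the Hubble server (look for a "Hubble: Ok" line)
kubectl -n kube-system exec ds/cilium -- cilium status | grep -i hubble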
Logs of backend container in UI:
level=info msg="running hubble status checker\n" subsys=ui-backend
level=info msg="fetching hubble flows: connecting to hubble-relay (attempt #1)\n" subsys=ui-backend
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=info msg="hubble status checker: connection to hubble-relay established\n" subsys=ui-backend
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=info msg="fetching hubble flows: connection to hubble-relay established\n" subsys=ui-backend
level=info msg="fetching hubble flows: connecting to hubble-relay (attempt #1)\n" subsys=ui-backend
level=error msg="flow error: EOF\n" subsys=ui-backend
level=info msg="hubble status checker: stopped\n" subsys="ui-backend:status-checker"
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=error msg="fetching hubble flows: connecting to hubble-relay (attempt #1) failed: rpc error: code = Canceled desc = context canceled\n" subsys=ui-backend
level=info msg="fetching hubble flows: stream (ui backend <-> hubble-relay) is closed\n" subsys=ui-backend
level=info msg="Get flows request: number:10000 follow:true blacklist:{source_label:\"reserved:unknown\" source_label:\"reserved:host\" source_label:\"k8s:k8s-app=kube-dns\" source_label:\"reserved:remote-node\" source_label:\"k8s:app=prometheus\" source_label:\"reserved:kube-apiserver\"} blacklist:{destination_label:\"reserved:unknown\" destination_label:\"reserved:host\" destination_label:\"reserved:remote-node\" destination_label:\"k8s:app=prometheus\" destination_label:\"reserved:kube-apiserver\"} blacklist:{destination_label:\"k8s:k8s-app=kube-dns\" destination_port:\"53\"} blacklist:{source_fqdn:\"*.cluster.local*\"} blacklist:{destination_fqdn:\"*.cluster.local*\"} blacklist:{protocol:\"ICMPv4\"} blacklist:{protocol:\"ICMPv6\"} whitelist:{source_pod:\"default/\" event_type:{type:1} event_type:{type:4} event_type:{type:129} reply:false} whitelist:{destination_pod:\"default/\" event_type:{type:1} event_type:{type:4} event_type:{type:129} reply:false}" subsys=ui-backend
level=info msg="running hubble status checker\n" subsys=ui-backend
level=info msg="fetching hubble flows: connecting to hubble-relay (attempt #1)\n" subsys=ui-backend
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=info msg="hubble status checker: connection to hubble-relay established\n" subsys=ui-backend
level=info msg="fetching hubble flows: connection to hubble-relay established\n" subsys=ui-backend
level=info msg="fetching hubble flows: connecting to hubble-relay (attempt #1)\n" subsys=ui-backend
level=error msg="flow error: EOF\n" subsys=ui-backend
level=info msg="hubble status checker: stopped\n" subsys="ui-backend:status-checker"
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=error msg="fetching hubble flows: connecting to hubble-relay (attempt #1) failed: rpc error: code = Canceled desc = context canceled\n" subsys=ui-backend
level=info msg="fetching hubble flows: stream (ui backend <-> hubble-relay) is closed\n" subsys=ui-backend
Hope this helps.
Kind regards,
Pascal
@pascal71 Your issue may be a different one. Would you mind opening a new issue?
Get the same error as in the first comment and the same logs as Pascal. Using cilium v1.11.8 and just downloaded the latest hubble and cilium binaries today. Here is the relay log:
level=warning msg="Failed to create peer client for peers synchronization; will try again after the timeout has expired" error="context deadline exceeded" subsys=hubble-relay target="hubble-peer.kube-system.svc.cluster.local:443"
hubble-ui has:
level=info msg="Get flows request: number:10000 follow:true blacklist:{source_label:\"reserved:unknown\" source_label:\"reserved:host\" source_label:\"k8s:k8s-app=kube-dns\" source_label:\"reserved:remote-node\" source_label:\"k8s:app=prometheus\" source_label:\"reserved:kube-apiserv
level=info msg="running hubble status checker\n" subsys=ui-backend
level=info msg="fetching hubble flows: connecting to hubble-relay (attempt #1)\n" subsys=ui-backend
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=info msg="hubble status checker: connection to hubble-relay established\n" subsys=ui-backend
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=info msg="fetching hubble flows: connection to hubble-relay established\n" subsys=ui-backend
level=info msg="fetching hubble flows: connecting to hubble-relay (attempt #1)\n" subsys=ui-backend
level=error msg="flow error: EOF\n" subsys=ui-backend
level=info msg="hubble status checker: stopped\n" subsys="ui-backend:status-checker"
level=info msg="hubble-relay grpc client created (hubble-relay addr: hubble-relay:80)\n" subsys=ui-backend
level=error msg="fetching hubble flows: connecting to hubble-relay (attempt #1) failed: rpc error: code = Canceled desc = context canceled\n" subsys=ui-backend
level=info msg="fetching hubble flows: stream (ui backend <-> hubble-relay) is closed\n" subsys=ui-backend
FYI: I have been using Cilium for quite some time, and each time I update I try this again; it has not worked a single time. The errors are getting fewer, but it would be useful to actually get back to these reports and suggest how users could help get this working.
Hi, I have the same issue in my k8s environment. Have you found a fix to resolve it?
Same issue here.
Hello,
Same issue on a rke2 cluster
- conf:
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    hubble:
      listenAddress: ":4245"
      enabled: true
      metrics:
        enabled:
          - dns:query;ignoreAAAA
          - drop
          - tcp
          - flow
          - port-distribution
          - icmp
          - http
      peerService:
        clusterDomain: cluster.local
      relay:
        enabled: true
      ui:
        enabled: true
      tls:
        enabled: false
- Cilium status:
Defaulted container "cilium-agent" out of: cilium-agent, install-portmap-cni-plugin (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init)
KVStore: Ok Disabled
Kubernetes: Ok 1.23 (v1.23.14+rke2r1) [linux/amd64]
Kubernetes APIs: ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: Disabled
Host firewall: Disabled
CNI Chaining: portmap
Cilium: Ok 1.12.3 (v1.12.3-1c466d2)
NodeMonitor: Listening for events on 2 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 7/254 allocated from 10.42.0.0/24,
BandwidthManager: Disabled
Host Routing: Legacy
Masquerading: IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status: 38/38 healthy
Proxy Status: OK, ip 10.42.0.152, 0 redirects active on ports 10000-20000
Global Identity Range: min 256, max 65535
Hubble: Ok Current/Max Flows: 4095/4095 (100.00%), Flows/s: 505.43 Metrics: Ok
Encryption: Disabled
Cluster health: 3/3 reachable (2023-01-05T13:48:39Z)
- logs on hubble-relay:
level=warning msg="Failed to create peer client for peers synchronization; will try again after the timeout has expired" error="context deadline exceeded" subsys=hubble-relay target="hubble-peer.kube-system.svc.cluster.local:80"
- logs on hubble GUI:
Data stream has failed on the UI backend: EOF
- Port 4244 is open on nodes
I think it's related to TLS (surprise!). I turned it off completely on relay, ui, and cilium configs and it started working.
Nice! I already disabled TLS, but only on the Hubble side; I will look into the other parts. Thanks!
I get the exact same issue and was also able to fix it temporarily by disabling TLS, which seems like a bad idea.
I get the exact same issue
Any updates? I ran into exactly the same problem.
This does not look like a purely hubble-ui issue, but a Hubble/Cilium one. It usually shows up when, for example, Cilium was installed via Helm but Hubble was enabled with cilium-cli. I would suggest opening a new issue in https://github.com/cilium/cilium with a detailed description of how things were deployed. Please reference this issue there.
This also happens when Cilium and Hubble were both installed at the same time using Helm, so this does not need a new issue.
If you have the Traefik dashboard enabled, try disabling it.
Same issue here, RKE2 with Cilium.
Has anyone found a nice way to resolve this issue without removing and reinstalling a Helm chart?
Have the same issue with Cilium 1.13.3 on upstream K8s. Everything is installed with Helm. Disabling TLS also fixed it for me:
tls:
  enabled: false
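For anyone applying the same TLS workaround with flags instead of a values file, a hedged sketch (assuming the release is called cilium in kube-system; adjust the chart version to whatever you run):

helm upgrade cilium cilium/cilium --version 1.13.3 \
  --namespace kube-system \
  --reuse-values \
  --set hubble.tls.enabled=false \
  --set hubble.relay.tls.server.enabled=false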
In my case the same error was happening with httpV2 enabled. Removing that line fixed the issue.
metrics:
  serviceMonitor:
    enabled: true
  enableOpenMetrics: true
  enabled:
    - dns:query;ignoreAAAA
    - drop
    - tcp
    - flow
    - port-distribution
    - icmp
    - http
    # - httpV2:exemplars=true;labelsContext=source_ip\,source_namespace\,source_workload\,destination_ip\,destination_namespace\,destination_workload\,traffic_direction
I managed to reproduce the Hubble CLI issue as well, but could not fix it by disabling TLS (example below). A chart reinstall helped, though.
This did not help:
hubble:
  relay:
    tls:
      server:
        enabled: false
  tls:
    enabled: false
For me it was because my cluster domain is "cluster" (it is imperative for Cilium not to have a dotted cluster domain), BUT the Helm chart sets hubble.peerService.clusterDomain to "cluster.local" by default.
With Cilium 1.13.3 installed with Helm, setting the correct hubble.peerService.clusterDomain value fixed access to the UI for me, and I didn't need to disable TLS anywhere.
My Cilium values:
helm install cilium cilium/cilium --version 1.13.3 \
--namespace kube-system \
--set ipam.mode=cluster-pool \
--set ipam.operator.clusterPoolIPv4PodCIDRList=10.66.0.0/16 \
--set ipam.operator.clusterPoolIPv4MaskSize=20 \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=172.16.66.200 \
--set k8sServicePort=6443 \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true \
--set operator.replicas=1 \
--set tunnel=disabled \
--set ipv4NativeRoutingCIDR=10.66.0.0/16 \
--set autoDirectNodeRoutes=true \
--set hubble.peerService.clusterDomain=cluster
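A hedged way to double-check which cluster domain your cluster and hubble-relay actually use (the hubble-relay-config ConfigMap name is the chart default; the throwaway busybox pod is only there to read resolv.conf):

# the "search" line shows the real cluster domain (e.g. svc.cluster vs svc.cluster.local)
kubectl run -it --rm domaincheck --image=busybox --restart=Never -- cat /etc/resolv.conf

# compare it with the peer address relay was rendered with
kubectl -n kube-system get configmap hubble-relay-config -o yaml | grep -i peer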
In my case, the following configuration alone was not enough; the communication from hubble-relay to hubble-peer was failing due to Ubuntu's ufw.
I allowed access from Cilium's IP CIDR and it worked fine.
hubble:
  relay:
    tls:
      server:
        enabled: false
  tls:
    enabled: false
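For example, something like this (a sketch assuming a pod CIDR of 10.42.0.0/16 and the default Hubble peer port 4244; adjust both to your environment):

# let the Cilium pod CIDR reach the Hubble peer port on the node
sudo ufw allow from 10.42.0.0/16 to any port 4244 proto tcp

# reload and verify the rule is active
sudo ufw reload
sudo ufw status numbered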
You must visit http://localhost:12000/ because the UI seems to forbid outside IPs. Are you visiting via localhost?
The CLI simply doesn't support TLS being disabled.
When the following flags are issued:
cilium hubble enable --ui --helm-set hubble.tls.enabled=false --helm-set hubble.tls.auto.enabled=false --helm-set hubble.relay.tls.server.enabled=false
1. It causes the relay Secret not to be generated. This is what we want.
2. Secret creation is forced in the CLI regardless of (1).
func (k *K8sHubble) enableRelay(ctx context.Context) (string, error) {
    ...
    k.Log("✨ Generating certificates...")

    // certificate generation runs unconditionally, even with TLS disabled
    if err := k.createRelayCertificates(ctx); err != nil {
        return "", err
    }
    ...
}

func (k *K8sHubble) createRelayCertificates(ctx context.Context) error {
    k.Log("🔑 Generating certificates for Relay...")
    ...
    return k.createRelayClientCertificate(ctx)
}

func (k *K8sHubble) createRelayClientCertificate(ctx context.Context) error {
    secret, err := k.generateRelayCertificate(defaults.RelayClientSecretName)
    if err != nil {
        return err
    }

    // fails when the generated secret is empty (TLS disabled, see (1) above)
    _, err = k.client.CreateSecret(ctx, secret.GetNamespace(), &secret, metav1.CreateOptions{})
    if err != nil {
        return fmt.Errorf("unable to create secret %s/%s: %w", secret.GetNamespace(), secret.GetName(), err)
    }

    return nil
}
The secret is empty because of (1), so k.client.CreateSecret fails because it is called with an empty "payload" (the empty secret).
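A hedged way to see the result from the outside, assuming the default kube-system namespace (the exact secret names come from the CLI's defaults package, so the grep is intentionally broad):

# list Hubble-related secrets; the relay client certificate secret is the one
# the CLI tries (and fails) to create when TLS is disabled this way
kubectl -n kube-system get secrets | grep -i hubble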
For people who would like to enable the httpV2 Hubble metric: try removing the \ characters in the labelsContext separators. I think the documentation has a typo.
metrics:
  serviceMonitor:
    enabled: true
  enableOpenMetrics: true
  enabled:
    - dns:query;ignoreAAAA
    - drop
    - tcp
    - flow
    - port-distribution
    - icmp
    - http
    - httpV2:exemplars=true;labelsContext=source_ip,source_namespace,source_workload,destination_ip,destination_namespace,destination_workload,traffic_direction
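A hedged way to confirm the httpV2 metric is actually emitted after the change (assuming Hubble metrics are exposed on the agents' default port 9965 and the agents carry the usual k8s-app=cilium label):

# pick one Cilium agent pod, forward its Hubble metrics port, and look for the HTTP metrics
CILIUM_POD=$(kubectl -n kube-system get pods -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system port-forward "$CILIUM_POD" 9965:9965 &
curl -s http://localhost:9965/metrics | grep hubble_http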