
Subject: Communication Issue Between Viz and External Prometheus

Open JadKHaddad opened this issue 2 years ago • 7 comments

Problem: In the documentation explaining the usage of an external Prometheus, the protocol is missing from the provided example. As a result, Viz cannot establish communication with the external Prometheus, and an "unsupported protocol" error message appears in the logs.

Solution: Added the protocol and basic auth to the Prometheus URL in the example.
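
To illustrate the shape of the fix, here is a minimal Python sketch of a Prometheus URL with an explicit protocol and optional basic auth. It is not taken from the Linkerd docs or code: the function names and the host `prometheus.example.svc` are hypothetical, and the credentials are placeholders.

```python
from urllib.parse import quote, urlsplit


def build_prometheus_url(host, port, scheme="http", username=None, password=None):
    """Return a Prometheus base URL that always carries an explicit scheme.

    Hypothetical helper for illustration only; not part of Linkerd.
    """
    credentials = ""
    if username is not None:
        # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
        credentials = quote(username, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"{scheme}://{credentials}{host}:{port}"


def has_http_scheme(url):
    """True only when the URL starts with an explicit http or https scheme."""
    return urlsplit(url).scheme in ("http", "https")


# Without credentials (hypothetical host name).
plain = build_prometheus_url("prometheus.example.svc", 9090)

# With basic auth (placeholder credentials).
with_auth = build_prometheus_url(
    "prometheus.example.svc", 9090, username="user", password="p@ss"
)

# A URL without a protocol, like the one in the original docs example, fails the check.
missing_protocol_ok = has_http_scheme("prometheus.example.svc:9090")
```

The variant without credentials is the same URL with the `user:password@` part left out; the protocol is required in both cases.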

Fixes #1658

JadKHaddad avatar Aug 20 '23 11:08 JadKHaddad

@JadKHaddad, many thanks! and many apologies for letting this linger. 🤦‍♂️ This looks nice! I've updated this to the latest main, but I have two other requests:

  1. Can you also show an example without the username and password? presumably it's not always necessary.
  2. Can you copy this into the 2.14 directory, too?

Thank you! and, again, I'm sorry for the delay.

kflynn avatar Nov 01 '23 15:11 kflynn

@kflynn I'm worried about the heartbeat because it only looks for a Prometheus job called kubernetes-nodes-cadvisor. If someone's using a different job name, like cadvisor, it won't work. Should we add the job to the docs?
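
One way to check which job name an external Prometheus actually exposes is an instant query against the Prometheus HTTP API. The sketch below only builds the request URL. The `up{job="..."}` selector is an assumption chosen for illustration, not the query the heartbeat runs, and the base URL and function name are hypothetical.

```python
from urllib.parse import urlencode


def job_check_url(base_url, job_name):
    """Build an instant-query URL asking Prometheus for targets of one job.

    Uses the standard /api/v1/query endpoint. The 'up' selector is an
    illustrative choice, not the heartbeat's own query.
    """
    query = f'up{{job="{job_name}"}}'
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


# Hypothetical base URL; the job name is the one the heartbeat looks for.
url = job_check_url("http://prometheus.example.svc:9090", "kubernetes-nodes-cadvisor")
```

If the response to that request contains an empty result, the job is probably scraped under a different name, such as cadvisor.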

JadKHaddad avatar Nov 12 '23 20:11 JadKHaddad

The example you added looks great, thanks! I think adding the job also sounds like a great idea.

kflynn avatar Nov 13 '23 16:11 kflynn

Oh, whoops – any chance you can fix the DCO for your latest, too? 😅

kflynn avatar Nov 13 '23 16:11 kflynn

whoops :P

Before adding the job to the docs, I want to test the entire workflow again on a fresh Kubernetes cluster. My latest tests were with Linkerd v2.13.

JadKHaddad avatar Nov 13 '23 19:11 JadKHaddad

@JadKHaddad Any joy? 🙂

kflynn avatar Dec 01 '23 22:12 kflynn

Hi @kflynn. In version stable-2.14.5, which I'm currently using, the PrometheusUrl appears to be hardcoded (see values).

In contrast, on the main branch, it is variable and can be specified through the control-plane Helm chart (see values).

We could provide all sorts of workarounds in the documentation, like deploying a named service in k8s, but I don't think that's what we want to do. Ideally, we wait for the new release, which should fix this issue.

JadKHaddad avatar Dec 03 '23 11:12 JadKHaddad