[Tutorial] Write a complete tutorial on how to set up OpenSearch with the plugin in K8s with Prometheus scraping it

Open lukas-vlcek opened this issue 1 year ago • 15 comments

There is no complete tutorial on how to set up an OpenSearch cluster with the plugin in K8s and have Prometheus scrape the metrics endpoint.

See: https://forum.opensearch.org/t/prometheus-not-able-to-scrape-metrics-on-pod/16908/

Idea: This setup flow should be part of the plugin's release process, or even of the CI (?)

lukas-vlcek avatar Dec 05 '23 15:12 lukas-vlcek

Is there any progress on this task? I would like to use Prometheus to scrape OpenSearch metrics and use Grafana dashboards for monitoring.

layavadi avatar Apr 08 '24 13:04 layavadi

This tutorial is very much needed. I've been through several attempts to get Prometheus to scrape an endpoint on Kubernetes, with no success.

smbambling avatar May 09 '24 14:05 smbambling

Just for the record the following is a Slack thread we had with @smbambling on this topic: https://opensearch.slack.com/archives/C051JEH8MNU/p1715262647976709

lukas-vlcek avatar May 09 '24 14:05 lukas-vlcek

I've attempted to configure a scrape endpoint for Prometheus against OpenSearch's _prometheus/metrics via two separate methods.

Notes:

  • kube-prometheus-stack is used to deploy Prometheus, Grafana, etc
  • OpenSearch Helm chart is used to deploy OpenSearch
  • Additional security configs (i.e. internal users, bindings, etc.) and index management are handled via a custom OpenSearch-Helper Helm chart

Method 1: Static Prometheus configs

In this method I've modified the kube-prometheus-stack Helm value override in order to apply additional configs.

In the values below I've tested multiple different combinations of configs:

  • only insecure_skip_verify: true no other tls_configs set
  • insecure_skip_verify: false with ca_file set
  • max_version: TLS12 both set and not set
  • cert_file + key_file both set and not set
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: opensearch-job
        metrics_path: /_prometheus/metrics
        scheme: https
        static_configs:
          - targets:
              - opensearch-localk3s-cl1-master.opensearch.svc.cluster.local:9200
        basic_auth:
          username: "admin"
          password: "myfakePW"
        tls_config:
          insecure_skip_verify: true
          max_version: TLS12
          ca_file: /etc/prometheus/secrets/my-internal-wildcard-my-tls-certs/ca.crt
          cert_file: /etc/prometheus/secrets/my-internal-wildcard-my-tls-certs/tls.crt
          key_file: /etc/prometheus/secrets/my-internal-wildcard-my-tls-certs/tls.key

From another pod within the monitoring namespace, where Prometheus is running (curl is not installed in the Prometheus container), I'm able to curl the internal service DNS name set above.

--- with referencing the CA cert
$ curl -XGET --cacert /tmp/foo -u 'admin:myfakePW' 'https://opensearch-localk3s-cl1-master.opensearch.svc.cluster.local:9200/_prometheus/metrics' | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP opensearch_jvm_mem_pool_max_bytes Maximum usage of memory pool
# TYPE opensearch_jvm_mem_pool_max_bytes gauge
opensearch_jvm_mem_pool_max_bytes{cluster="opensearch-localk3s-cl1",node="opensearch-localk3s-cl1-master-2",nodeid="7eGuaMZwTcKZYLfPDnovDA",pool="survivor",} 0.0


AND

--- without referencing the CA cert
$ curl -k -u 'admin:tes+1Passw*rd2' 'https://opensearch-localk3s-cl1-master.opensearch.svc.cluster.local:9200/_prometheus/metrics' | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP opensearch_indices_get_count Count of get commands
# TYPE opensearch_indices_get_count gauge
opensearch_indices_get_count{cluster="opensearch-localk3s-cl1",node="opensearch-localk3s-cl1-master-2",nodeid="7eGuaMZwTcKZYLfPDnovDA",} 0.0
opensearch_indices_get_count{cluster="opensearch-localk3s-cl1",node="opensearch-localk3s-cl1-hot-data-0",nodeid="-Modhwt_TMiOd4f4rSSPhg",} 48.0
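
For anyone who wants to repeat the two curl checks above from a script, here is a minimal Python sketch using only the standard library. The function names (basic_auth_header, build_context, fetch_metrics) are made up for illustration and are not part of the plugin or of Prometheus. Passing ca_file mirrors `curl --cacert`; leaving it out mirrors `curl -k`.

```python
import base64
import ssl
import urllib.request
from typing import Optional


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Verify against ca_file when given; otherwise skip verification (like curl -k)."""
    if ca_file is not None:
        return ssl.create_default_context(cafile=ca_file)
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be turned off before relaxing verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_metrics(url: str, username: str, password: str,
                  ca_file: Optional[str] = None, timeout: float = 10.0) -> str:
    """GET the metrics endpoint and return the response body as text."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    with urllib.request.urlopen(request, context=build_context(ca_file),
                                timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metrics with the service URL from the curl commands and the same credentials should return the same exposition text, assuming it is run from a pod that can reach the service. Like curl, this goes through OpenSSL rather than Prometheus' Go TLS stack, so a successful request here does not prove that Prometheus can complete the handshake.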

smbambling avatar May 09 '24 14:05 smbambling

Method 2: Using Prometheus Service Monitor

In this method I've created a ServiceMonitor that kube-prometheus-stack reads to generate scrape targets.

Below is the ServiceMonitor I created:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: opensearch-master
    meta.helm.sh/release-namespace: opensearch
  creationTimestamp: "2024-05-08T14:51:02Z"
  generation: 12
  labels:
    app.kubernetes.io/component: opensearch-localk3s-cl1-master
    app.kubernetes.io/instance: opensearch-master
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: opensearch
    app.kubernetes.io/version: 2.11.1
    helm.sh/chart: opensearch-2.17.0
    release: kube-prometheus-stack
  name: opensearch-service-monitor
  namespace: monitoring
  resourceVersion: "141672"
  uid: cf1df5d5-a855-4eb1-8cb5-da2ddaad99f6
spec:
  endpoints:
  - basicAuth:
      password:
        key: password
        name: opensearch-service-monitor-basic-auth
      username:
        key: username
        name: opensearch-service-monitor-basic-auth
    interval: 10s
    path: /_prometheus/metrics
    port: http
    scheme: https
    tlsConfig:
      ca: {}
      insecureSkipVerify: true
  namespaceSelector:
    matchNames:
    - opensearch
  selector:
    matchLabels:
      app.kubernetes.io/component: opensearch-localk3s-cl1-master
      app.kubernetes.io/instance: opensearch-master
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: opensearch
      app.kubernetes.io/version: 2.11.1
      helm.sh/chart: opensearch-2.17.0
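
The ServiceMonitor above takes its credentials from a Secret named opensearch-service-monitor-basic-auth with the keys username and password. The Secret itself is not shown in this thread, so the following is only a sketch of how such a manifest could be generated in Python. It assumes the Secret lives in the same namespace as the ServiceMonitor (monitoring), which is my understanding of how the Prometheus Operator resolves these references; the helper name basic_auth_secret is hypothetical.

```python
import base64
import json


def basic_auth_secret(name: str, namespace: str, username: str, password: str) -> dict:
    """Return a Kubernetes Secret manifest holding basic-auth credentials."""

    def b64(value: str) -> str:
        # Secret "data" values must be base64-encoded strings.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {"username": b64(username), "password": b64(password)},
    }


if __name__ == "__main__":
    # Name and keys are taken from the ServiceMonitor above; the namespace is an assumption.
    manifest = basic_auth_secret(
        "opensearch-service-monitor-basic-auth", "monitoring", "admin", "myfakePW"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.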

Again, multiple different combinations of configs were tested within the ServiceMonitor, all with the same end result: the scrape endpoints are created, but Prometheus hits an SSL handshake issue.

Just as verification, from the same pod as in method 1, I could also curl the cluster IP endpoints generated via the ServiceMonitor.

$ curl -u 'admin:myfakePW' -k https://10.42.0.69:9200/_prometheus/metrics | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP opensearch_indices_refresh_total_time_seconds Time spent while refreshes
# TYPE opensearch_indices_refresh_total_time_seconds gauge
opensearch_indices_refresh_total_time_seconds{cluster="opensearch-localk3s-cl1",node="opensearch-localk3s-cl1-master-2",nodeid="7eGuaMZwTcKZYLfPDnovDA",} 0.0
opensearch_indices_refresh_total_time_seconds{cluster="opensearch-localk3s-cl1",node="opensearch-localk3s-cl1-hot-data-0",nodeid="-Modhwt_TMiOd4f4rSSPhg",} 174.781

In the end both methods produce the following errors in the Prometheus UI

[Image: Screenshot 2024-05-09 at 10 13 11 AM]

smbambling avatar May 09 '24 14:05 smbambling

Thanks @smbambling for putting in the effort to write it all down.

lukas-vlcek avatar May 09 '24 15:05 lukas-vlcek

In our testing setup we had restricted the allowed ciphers via plugins.security.ssl.transport.enabled_ciphers. Commenting this setting out allowed Prometheus to scrape the endpoints and gather data.
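
If anyone else suspects a cipher or protocol mismatch, a quick probe of what the server actually negotiates can help narrow it down. Below is a minimal Python sketch using the standard ssl module; make_probe_context and probe_tls are illustrative names, and the default port 9200 is taken from the configs earlier in this thread. Note that Python negotiates through OpenSSL while Prometheus uses Go's TLS stack, so the two clients do not offer the same cipher list. Treat the result as a hint, not proof.

```python
import socket
import ssl
from typing import Tuple


def make_probe_context(max_tls12: bool = False) -> ssl.SSLContext:
    """Client context that skips certificate checks, optionally capped at TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be turned off before relaxing verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    if max_tls12:
        ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def probe_tls(host: str, port: int = 9200, max_tls12: bool = False,
              timeout: float = 5.0) -> Tuple[str, str]:
    """Complete a TLS handshake and return (protocol version, cipher name).

    An ssl.SSLError raised here means the handshake itself failed, for
    example because client and server share no protocol version or cipher.
    """
    ctx = make_probe_context(max_tls12)
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with ctx.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            cipher_name, _protocol, _secret_bits = tls_sock.cipher()
            return tls_sock.version(), cipher_name
```

Running probe_tls against the OpenSearch service once with max_tls12=True and once without shows which protocol and cipher the server picks in each case.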

smbambling avatar May 10 '24 09:05 smbambling

I want to ask something: does this mean that OpenSearch provides the metrics data to Prometheus, or that Prometheus provides the metrics data to OpenSearch?

rarifz avatar Jun 03 '24 06:06 rarifz

@rarifz This installs an exporter that exposes metrics about OpenSearch, which Prometheus can then be configured to scrape.
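
To make the direction of data flow concrete: the plugin serves plain text in the Prometheus exposition format at /_prometheus/metrics, and Prometheus pulls and parses it. Here is a rough Python sketch of reading that text. It is a simplified parser written for illustration only (it does not unescape label values and ignores timestamps), not what Prometheus itself runs; the sample line is copied from the curl output earlier in this thread.

```python
import re
from typing import Dict, List, Tuple

# metric_name{label="value",...} value  (the label block is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_samples(text: str) -> List[Sample]:
    """Extract (name, labels, value) tuples, skipping # HELP / # TYPE comments."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    body = (
        "# HELP opensearch_indices_get_count Count of get commands\n"
        "# TYPE opensearch_indices_get_count gauge\n"
        'opensearch_indices_get_count{cluster="opensearch-localk3s-cl1",'
        'node="opensearch-localk3s-cl1-hot-data-0",'
        'nodeid="-Modhwt_TMiOd4f4rSSPhg",} 48.0\n'
    )
    for name, labels, value in parse_samples(body):
        print(name, labels["node"], value)
```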

smbambling avatar Jun 21 '24 10:06 smbambling

Hello @smbambling, have you found a workaround? I tried with curl and it worked, but Prometheus cannot scrape metrics from the /_prometheus/metrics path. FYI, other people report that Prometheus can scrape when the cluster is set up to use plain HTTP only.

PDCuong avatar Oct 17 '24 09:10 PDCuong

Hello @smbambling, do we have any workaround for people using HTTPS with basic auth enabled? We see that it works with curl, but Prometheus cannot scrape metrics from the /_prometheus/metrics path and the target shows as down.

aravindhkudiyarasan avatar Oct 17 '24 17:10 aravindhkudiyarasan