prometheusOperator: true adds, then deletes the scrape config continuously

Open pasztorl opened this issue 1 year ago • 14 comments

I added prometheusOperator: true to a tenant config, and now I just see this in the logs:

I0716 15:03:49.705015       1 prometheus.go:178] Deleting MinIO tenant Prometheus scrape config
I0716 15:03:52.595321       1 prometheus.go:120] Adding MinIO tenant Prometheus scrape config
I0716 15:03:54.742093       1 prometheus.go:178] Deleting MinIO tenant Prometheus scrape config
I0716 15:03:57.604896       1 prometheus.go:120] Adding MinIO tenant Prometheus scrape config
I0716 15:03:59.761052       1 prometheus.go:178] Deleting MinIO tenant Prometheus scrape config
I0716 15:04:02.631716       1 prometheus.go:120] Adding MinIO tenant Prometheus scrape config
I0716 15:04:04.771914       1 prometheus.go:178] Deleting MinIO tenant Prometheus scrape config

How can I debug the problem?

pasztorl avatar Jul 16 '24 15:07 pasztorl

@pasztorl could you please add your prometheus configuration yaml and crd. Thanks.

cesnietor avatar Jul 22 '24 16:07 cesnietor

Hi,

I set two things related to this:

Operator chart:

operator:
  env:
    - name: PROMETHEUS_NAMESPACE
      value: "monitoring"
    - name: PROMETHEUS_NAME
      value: "prometheus-prometheus"

The operator can find prometheus, because when it was not set properly I got different error messages.

On tenant:

spec:
  prometheusOperator: true

Prometheus is installed with the kube-prometheus-stack chart.

pasztorl avatar Jul 22 '24 17:07 pasztorl

I retried, and now I get this message from the operator:

E0722 17:17:24.054527 1 main-controller.go:749] error syncing 'devel/minio': prometheus-prometheus-scrape-confg is alreay set as additional scrape config in prometheus

Which is true: I use an additional scrape config, so the prometheus-prometheus-scrape-confg secret exists in the monitoring namespace.

My scrape config looks like this:

- job_name: node-exporter-jump
  static_configs:
  - targets:
    - jump1.int.hu1.example.com:9100

pasztorl avatar Jul 22 '24 17:07 pasztorl

Could you please check whether the minio-prom-additional-scrape-config secret is being created? Can you also check whether its contents are the same as your scrape config?

cesnietor avatar Jul 22 '24 18:07 cesnietor

I deleted my own scrape config secret and then restarted the MinIO operator. Interestingly, I got the same error message even though the MinIO operator had not created its secret. How does the operator check whether a scrape config exists?

I did not find a minio-prom-additional-scrape-config secret in any namespace.

pasztorl avatar Jul 22 '24 18:07 pasztorl

Please share the full YAML for your tenant as well as for your operator.

cesnietor avatar Jul 22 '24 18:07 cesnietor

Also, please share the full operator logs after the restart.

cesnietor avatar Jul 22 '24 18:07 cesnietor

I've attached the tenant and the operator deployment: operator-deployment.txt, tenant.txt

pasztorl avatar Jul 22 '24 18:07 pasztorl

I set prometheusOperator: false, restarted the operator, and after the restart switched it back to true. Here is the log: operatorlog.txt

pasztorl avatar Jul 22 '24 18:07 pasztorl

You need to remove prometheus-prometheus-scrape-confg from the Prometheus AdditionalScrapeConfigs field. We error out when something other than minio-prom-additional-scrape-config is set.

cesnietor avatar Jul 22 '24 18:07 cesnietor

Actually, I am using AdditionalScrapeConfigs to monitor nodes that run node-exporter but are not part of the k8s cluster, which is why I need it. Maybe I can use the ScrapeConfig CRD instead. Also, what do you think: should the MinIO operator use the ScrapeConfig CRD as well, rather than the secret? The Prometheus deployment is not dedicated to MinIO; I would like to use the cluster-wide deployment for it if possible.
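
For illustration, the external node-exporter target from the scrape config earlier in this thread could be expressed as a ScrapeConfig resource roughly like the sketch below. The resource name, the monitoring namespace, and the release label are assumptions; the label only matters if the Prometheus resource selects ScrapeConfigs by it, which depends on how kube-prometheus-stack was installed.

```yaml
# Sketch only: name, namespace and label are assumed, not taken from the thread.
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: node-exporter-jump
  namespace: monitoring
  labels:
    # Must match the scrapeConfigSelector of the Prometheus resource, if one is set.
    release: kube-prometheus-stack
spec:
  staticConfigs:
    - targets:
        # Target copied from the additional scrape config shown earlier.
        - jump1.int.hu1.example.com:9100
```

With the job defined this way, the AdditionalScrapeConfigs field would no longer be needed for the external nodes.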

pasztorl avatar Jul 22 '24 19:07 pasztorl

Right now it can only point to one secret, and MinIO needs the one I mentioned to work; the key also has to be the same. To fix your case, rename the secret holding your custom config to the one I mentioned. The MinIO operator will be happy to see the name it looks for, and it will append the MinIO scrape config it needs. Let me know if that works.
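
A rough sketch of what that rename might look like, using the node-exporter job posted earlier. The secret name is the one mentioned above; the key is not spelled out in this thread, so REPLACE_WITH_OPERATOR_KEY is a placeholder that has to be replaced with the key the operator expects. The namespace and Prometheus name are assumed from the operator env settings shown earlier.

```yaml
# Sketch only: the key below is a placeholder, not the real key name.
apiVersion: v1
kind: Secret
metadata:
  name: minio-prom-additional-scrape-config
  namespace: monitoring
stringData:
  REPLACE_WITH_OPERATOR_KEY: |
    - job_name: node-exporter-jump
      static_configs:
      - targets:
        - jump1.int.hu1.example.com:9100
---
# Fragment of the Prometheus resource: point the single
# additionalScrapeConfigs reference at the renamed secret.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus-prometheus
  namespace: monitoring
spec:
  additionalScrapeConfigs:
    name: minio-prom-additional-scrape-config
    key: REPLACE_WITH_OPERATOR_KEY
```

If the Prometheus resource is generated by the kube-prometheus-stack chart, the reference would normally be changed through the chart values rather than by editing the resource directly.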

cesnietor avatar Jul 22 '24 19:07 cesnietor

Hi @pasztorl,

I am using ScrapeConfig CR to add scrape config.

Starting with prometheus-operator v0.65.x, one can use the ScrapeConfig CRD to scrape targets external to the Kubernetes cluster or create scrape configurations that are not possible with the higher level ServiceMonitor/Probe/PodMonitor resources. (source: https://prometheus-operator.dev/docs/developer/scrapeconfig/)

I think creating a ScrapeConfig CR is a better and neater way to add scrape config than having the MinIO operator modify the Prometheus CR, which could be managed by Helm/Terraform.

Here is my ScrapeConfig CR:

apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: minio-scrapeconfig
  namespace: minio
  labels:
    release: kube-prometheus-stack
spec:
  jobName: my-test-job
  authorization:
    credentials:
      # the secret that stores the bearerToken got from `mc admin prometheus generate ALIAS`
      name: minio-scrapeconfig-secret
      key: bearerToken
    type: Bearer
  metricsPath: /minio/v2/metrics/cluster
  scheme: HTTPS
  tlsConfig:
    ca:
      secret:
        name: my-minio-tls
        key: public.crt
  staticConfigs:
    - targets:
        - minio.minio.svc.cluster.local:443
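
The ScrapeConfig above reads its bearer token from a secret. A minimal sketch of that secret follows, assuming it lives in the same namespace as the ScrapeConfig; the token value is a placeholder for the output of `mc admin prometheus generate ALIAS`.

```yaml
# Sketch only: the token value is a placeholder.
apiVersion: v1
kind: Secret
metadata:
  name: minio-scrapeconfig-secret
  namespace: minio
type: Opaque
stringData:
  # Key name must match authorization.credentials.key in the ScrapeConfig.
  bearerToken: "REPLACE_WITH_GENERATED_TOKEN"
```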

chriskhanhtran avatar Nov 22 '24 19:11 chriskhanhtran

@chriskhanhtran this is a good approach!

Would be nice if this was done through the official tenant chart though. So hopefully minio guys will add it!

HummingMind avatar Jan 25 '25 22:01 HummingMind

Chiming in with the request to generate an extra ScrapeConfig CR instead of messing with the additionalScrapeConfig of the Prometheus CR. This can break installations where the prometheus CR is reconciled and kept in sync by GitOps means.

sfudeus avatar Apr 06 '25 11:04 sfudeus

Generating a unique ScrapeConfig per tenant would also solve the problems mentioned in #2425 .

nogweii avatar May 22 '25 01:05 nogweii

@pasztorl please retest whenever PR https://github.com/minio/operator/pull/2456 is merged. After these changes I was no longer able to see this issue.

JoelRuizRojas avatar May 27 '25 19:05 JoelRuizRojas

Thanks!

pasztorl avatar Jun 26 '25 13:06 pasztorl