opentelemetry-collector-contrib
`scrape_config_files` doesn't work
Component(s)
receiver/prometheus
What happened?
Description
I want to use `scrape_config_files` to add Prometheus jobs. However, the Prometheus jobs defined this way do not take effect, even though the OTel configuration has been applied. For details on how the configuration is applied, see applyConfig.
Steps to Reproduce
1. Add `scrape_config_files` to the prometheus receiver configuration, with the file path specified as scrape_files.yaml.
2. Add the appropriate `scrape_configs` entries to scrape_files.yaml.
3. Start OTel.
Expected Result
The Prometheus jobs in scrape_files.yaml run correctly.
Actual Result
The Prometheus jobs do not run, and the OTel logs do not show the added jobs.
Collector version
v0.95
Environment information
Environment
OS: darwin/arm64 Compiler: go1.21.9
The same issue occurs when deployed on Kubernetes (k8s)
OpenTelemetry Collector configuration
##### otel.yaml
exporters:
  otlphttp/metric:
    metrics_endpoint: http://localhost:8080
    retry_on_failure:
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
      multiplier: 2
      randomization_factor: 0.5
extensions:
  pprof:
  health_check:
    endpoint: 0.0.0.0:13133
  memory_ballast:
    size_mib: "256"
processors:
  batch/metrics:
    send_batch_size: 500
    send_batch_max_size: 500
    timeout: 5s
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024
  cumulativetodelta:
receivers:
  prometheus:
    trim_metric_suffixes: false
    config:
      scrape_config_files:
        - /scrape_files.yaml
      scrape_configs:
        - job_name: 'otel-scrape-self-test'
          scrape_interval: 10s
          scrape_timeout: 10s
          metrics_path: '/metrics'
          static_configs:
            - targets: ['0.0.0.0:8888']
service:
  telemetry:
    metrics:
      level: detailed
      address: 0.0.0.0:8888
  extensions:
    - pprof
    - health_check
    - memory_ballast
  pipelines:
    metrics/prometheus:
      receivers:
        - prometheus
      processors:
        - memory_limiter
        - cumulativetodelta
        - batch/metrics
      exporters:
        - otlphttp/metric
##### scrape_files.yaml
scrape_configs:
  - job_name: 'otel-scrape-k8s-apiserver'
    scrape_interval: 10s
    scrape_timeout: 10s
    body_size_limit: 50MB
    follow_redirects: true
    scheme: https
    metrics_path: /metrics
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
            - default
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: kubernetes
        replacement: $1
        action: keep
      - action: replace
        target_label: otel_pod
        replacement: otel_1
  - job_name: 'otel-scrape-self'
    scrape_interval: 10s
    scrape_timeout: 10s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['0.0.0.0:9999']
Log output
2024-08-21T18:49:59.184+0800 info [email protected]/service.go:143 Starting otelcontribcol... {"Version": "0.95.0-dev", "NumCPU": 8}
2024-08-21T18:49:59.184+0800 info extensions/extensions.go:34 Starting extensions...
2024-08-21T18:49:59.184+0800 info extensions/extensions.go:37 Extension is starting... {"kind": "extension", "name": "pprof"}
2024-08-21T18:49:59.185+0800 info pprofextension/pprofextension.go:60 Starting net/http/pprof server {"kind": "extension", "name": "pprof", "config": {"TCPAddr":{"Endpoint":"localhost:1777","DialerConfig":{"Timeout":0}},"BlockProfileFraction":0,"MutexProfileFraction":0,"SaveToFile":""}}
2024-08-21T18:49:59.185+0800 info extensions/extensions.go:52 Extension started. {"kind": "extension", "name": "pprof"}
2024-08-21T18:49:59.185+0800 info extensions/extensions.go:37 Extension is starting... {"kind": "extension", "name": "memory_ballast"}
2024-08-21T18:49:59.187+0800 info [email protected]/memory_ballast.go:41 Setting memory ballast {"kind": "extension", "name": "memory_ballast", "MiBs": 256}
2024-08-21T18:49:59.188+0800 info extensions/extensions.go:52 Extension started. {"kind": "extension", "name": "memory_ballast"}
2024-08-21T18:49:59.188+0800 info extensions/extensions.go:37 Extension is starting... {"kind": "extension", "name": "health_check"}
2024-08-21T18:49:59.188+0800 info healthcheckextension/healthcheckextension.go:35 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13134","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2024-08-21T18:49:59.189+0800 warn [email protected]/warning.go:42 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks. Enable the feature gate to change the default and remove this warning. {"kind": "extension", "name": "health_check", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks", "feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-08-21T18:49:59.189+0800 info extensions/extensions.go:52 Extension started. {"kind": "extension", "name": "health_check"}
2024-08-21T18:49:59.190+0800 info prometheusreceiver/metrics_receiver.go:240 Starting discovery manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2024-08-21T18:50:04.422+0800 info prometheusreceiver/metrics_receiver.go:231 Scrape job added {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "jobName": "otel-scrape-self-test"}
2024-08-21T18:50:04.422+0800 info healthcheck/handler.go:132 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2024-08-21T18:50:04.422+0800 info [email protected]/service.go:169 Everything is ready. Begin running and processing data.
2024-08-21T18:50:04.422+0800 warn localhostgate/featuregate.go:63 The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default. {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-08-21T18:50:04.422+0800 info prometheusreceiver/metrics_receiver.go:282 Starting scrape manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2024-08-21T18:50:14.480+0800 info exporterhelper/retry_sender.go:118 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/metric", "error": "failed to make an HTTP request: Post \"https://otel-inner.yuanfudao.biz/metric/otel/v1\": dial tcp: lookup otel-inner.yuanfudao.biz: no such host", "interval": "2.623812628s"}
Additional context
No response
Pinging code owners:
- receiver/prometheus: @Aneurysm9 @dashpole
See Adding Labels via Comments if you do not have permissions to add labels yourself.
My best guess is that we don't apply the config to the discovery manager here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/8477a83afdfc8750d471cdb5b4af2fb227bc8423/receiver/prometheusreceiver/metrics_receiver.go#L366
We iterate over `cfg.ScrapeConfigs` rather than `cfg.GetScrapeConfigs()`, which incorporates configuration from `scrape_config_files`. We should update most usages of `cfg.ScrapeConfigs` to use the newer function.
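The distinction can be illustrated with a minimal, self-contained Go sketch. The types and the `loadScrapeFile` helper below are hypothetical stand-ins for the Prometheus `config` package, not its real API: a loop over the raw `ScrapeConfigs` field sees only the inline jobs, while a `GetScrapeConfigs()`-style accessor also merges in the jobs loaded from the files listed in `scrape_config_files`.

```go
package main

import "fmt"

// ScrapeConfig is a stand-in for Prometheus's scrape config type.
type ScrapeConfig struct{ JobName string }

// Config mimics the shape of the Prometheus config: inline scrape
// configs plus a list of files containing additional ones.
type Config struct {
	ScrapeConfigs     []*ScrapeConfig
	ScrapeConfigFiles []string
}

// loadScrapeFile is a hypothetical loader; the real code parses each
// YAML file listed under scrape_config_files.
func loadScrapeFile(path string) []*ScrapeConfig {
	return []*ScrapeConfig{{JobName: "otel-scrape-self"}}
}

// GetScrapeConfigs mirrors the upstream accessor: it returns the
// inline configs merged with those loaded from scrape_config_files.
func (c *Config) GetScrapeConfigs() []*ScrapeConfig {
	out := append([]*ScrapeConfig{}, c.ScrapeConfigs...)
	for _, f := range c.ScrapeConfigFiles {
		out = append(out, loadScrapeFile(f)...)
	}
	return out
}

func main() {
	cfg := &Config{
		ScrapeConfigs:     []*ScrapeConfig{{JobName: "otel-scrape-self-test"}},
		ScrapeConfigFiles: []string{"/scrape_files.yaml"},
	}
	// Iterating the field alone misses the file-based jobs:
	fmt.Println(len(cfg.ScrapeConfigs))      // 1
	fmt.Println(len(cfg.GetScrapeConfigs())) // 2
}
```

This matches the observed behavior: the receiver registers `otel-scrape-self-test` (inline) but never sees the jobs defined in scrape_files.yaml.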
Thank you for your answer. It worked after adding `cfg.ScrapeConfigs, _ = (*config.Config)(cfg).GetScrapeConfigs()` before the code mentioned above.
The only drawback is that OTel cannot substitute environment variables inside scrape_files.yaml.
If this issue is still open, I'd be happy to work on a fix for it.
It is all yours @bacherfl. Please cc me on the PR and I'll review.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
- receiver/prometheus: @Aneurysm9 @dashpole
See Adding Labels via Comments if you do not have permissions to add labels yourself.