
Better documentation/default support for fgprof

Open tina-junold opened this issue 3 years ago • 9 comments

Is your feature request related to a problem? Please describe.

I'd like to configure phlare to scrape fgprof as described here, but it failed (maybe due to my lack of scrape configuration skills).

Describe the solution you'd like

It would be nice if the documentation were clearer.

Additional context


values.yaml for helm chart:

phlare:
  replicaCount: 1
  persistence:
    enabled: true # false
    storageClassName: local-path
    accessModes:
      - ReadWriteOnce
    size: 10Gi

  structuredConfig:
    storage:
      backend: s3
      s3:
        access_key_id: xxx
        bucket_name: phlare
        endpoint: minio.minio.svc:9000
        secret_access_key: xxx
        insecure: true # undefined

    scrape_configs:

      - job_name: 'kubernetes-pods'
        scrape_interval: "15s"

        kubernetes_sd_configs:
          - role: pod

        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_phlare_grafana_com_scrape]
            action: keep
            regex: true
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_phlare_grafana_com_port]
            action: replace
            regex: (.+?)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_phase]
            regex: Pending|Succeeded|Failed|Completed
            action: drop

      - job_name: 'default'
        profiling_config:
          pprof_config:
            fgprof:
              path: /debug/fgprof
              delta: true
              enabled: true

tina-junold avatar Nov 04 '22 15:11 tina-junold

Have you tried moving the profiling_config under the job with job_name: 'kubernetes-pods'? I think that's your problem.
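For illustration, the suggestion above could look roughly like the sketch below. It reuses only keys from the values.yaml in the first post; the relabel rules are elided and assumed to stay unchanged, and the separate 'default' job is assumed to be removed.

```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'
    scrape_interval: "15s"

    kubernetes_sd_configs:
      - role: pod

    # relabel_configs from the original values.yaml go here, unchanged.

    # Moved from the separate 'default' job so that it applies to the
    # pods discovered by this job.
    profiling_config:
      pprof_config:
        fgprof:
          path: /debug/fgprof
          delta: true
          enabled: true
```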

cyriltovena avatar Nov 07 '22 13:11 cyriltovena

@cyriltovena Yes, you are right, but this only replaces the original scraping. It would be best if both fgprof and pprof were supported, depending on their path.

This

annotations:
  linkerd.io/inject: enabled
  phlare.grafana.com/scrape: "true"
  phlare.grafana.com/port: "6060"

and this

annotations:
  linkerd.io/inject: enabled
  phlare.grafana.com/scrape: "true"
  phlare.grafana.com/port: "6060"
  phlare.grafana.com/path: "/debug/pprof"

should be scraped with the job pprof-pods

while this

annotations:
  linkerd.io/inject: enabled
  phlare.grafana.com/scrape: "true"
  phlare.grafana.com/port: "6060"
  phlare.grafana.com/path: "/debug/fgprof"

should be scraped with the fgprof-pods job:


    scrape_configs:

      - job_name: 'pprof-pods'
        scrape_interval: "15s"

        kubernetes_sd_configs:
          - role: pod

        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_phlare_grafana_com_path]
            action: keep
            regex: (.+prof)?
          - source_labels: [__meta_kubernetes_pod_annotation_phlare_grafana_com_scrape]
            action: keep
            regex: true
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_phlare_grafana_com_port]
            action: replace
            regex: (.+?)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_phase]
            regex: Pending|Succeeded|Failed|Completed
            action: drop

      - job_name: 'fgprof-pods'
        scrape_interval: "15s"

        kubernetes_sd_configs:
          - role: pod

        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_phlare_grafana_com_path]
            action: keep
            regex: .+fgprof.*
          - source_labels: [__meta_kubernetes_pod_annotation_phlare_grafana_com_scrape]
            action: keep
            regex: true
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_phlare_grafana_com_port]
            action: replace
            regex: (.+?)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: pod
          - source_labels: [__meta_kubernetes_pod_phase]
            regex: Pending|Succeeded|Failed|Completed
            action: drop

        profiling_config:
          pprof_config:
            fgprof:
              path: /debug/fgprof
              delta: true
              enabled: true

But my config seems to be wrong...
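One possible cause, offered as a sketch rather than a confirmed diagnosis: Prometheus-style relabel regexes are fully anchored, so `(.+prof)?` in the pprof-pods job also matches `/debug/fgprof`, and pods annotated for fgprof would be kept by both jobs. RE2 has no negative lookahead, so one way to separate them is an extra `drop` rule in the pprof-pods job, assuming Phlare applies relabel rules the same way Prometheus does:

```yaml
- job_name: 'pprof-pods'
  relabel_configs:
    # Drop pods whose path annotation points at fgprof;
    # the fgprof-pods job is meant to handle those.
    - source_labels: [__meta_kubernetes_pod_annotation_phlare_grafana_com_path]
      action: drop
      regex: .+fgprof.*
    # ... remaining rules as in the pprof-pods job above
```

Pods without a path annotation produce an empty source value, which does not match `.+fgprof.*`, so they would still be scraped by pprof-pods.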

tina-junold avatar Nov 22 '22 16:11 tina-junold

What's wrong? It seems good to me.

cyriltovena avatar Nov 30 '22 08:11 cyriltovena

@cyriltovena It seems to be an issue in combination with linkerd, which injects a sidecar container that does not have these paths:

phlare-0 phlare level=error caller=target.go:183 ts=2022-12-08T17:16:58.464947539Z msg="fetch profile failed" target="{__address__=\"10.42.0.35:6060\", __name__=\"goroutine\", __profile_path__=\"/debug/pprof/goroutine\", __scheme__=\"http\", app_kubernetes_io_name=\"echo\", app_kubernetes_io_part_of=\"example\", app_kubernetes_io_version=\"1.0.0\", instance=\"10.42.0.35:6060\", job=\"fgprof-pods\", linkerd_io_control_plane_ns=\"linkerd\", linkerd_io_proxy_deployment=\"echo\", linkerd_io_workload_ns=\"default\", namespace=\"default\", pod=\"echo-c455d9bf4-qnw68\", pod_template_hash=\"c455d9bf4\"}" err="server returned HTTP status (404) 404 page not found"
phlare-0 phlare level=error caller=target.go:183 ts=2022-12-08T17:16:58.505974319Z msg="fetch profile failed" target="{__address__=\"10.42.1.33:6060\", __name__=\"process_cpu\", __profile_path__=\"/debug/pprof/profile\", __scheme__=\"http\", app_kubernetes_io_name=\"foxtrot\", app_kubernetes_io_part_of=\"example\", app_kubernetes_io_version=\"1.0.0\", instance=\"10.42.1.33:6060\", job=\"fgprof-pods\", linkerd_io_control_plane_ns=\"linkerd\", linkerd_io_proxy_deployment=\"foxtrot\", linkerd_io_workload_ns=\"default\", namespace=\"default\", pod=\"foxtrot-db9c8b4cd-9bk5t\", pod_template_hash=\"db9c8b4cd\"}" err="server returned HTTP status (404) 404 page not found"
phlare-0 phlare level=error caller=target.go:183 ts=2022-12-08T17:16:58.610967703Z msg="fetch profile failed" target="{__address__=\"10.42.2.35:6060\", __name__=\"block\", __profile_path__=\"/debug/pprof/block\", __scheme__=\"http\", app_kubernetes_io_name=\"alfa\", app_kubernetes_io_part_of=\"example\", app_kubernetes_io_version=\"1.0.0\", instance=\"10.42.2.35:6060\", job=\"pprof-pods\", linkerd_io_control_plane_ns=\"linkerd\", linkerd_io_proxy_deployment=\"alfa\", linkerd_io_workload_ns=\"default\", namespace=\"default\", pod=\"alfa-68bcc5ff6b-94mhc\", pod_template_hash=\"68bcc5ff6b\"}" err="server returned HTTP status (404) 404 page not found"
phlare-0 phlare level=error caller=target.go:183 ts=2022-12-08T17:16:58.882963373Z msg="fetch profile failed" target="{__address__=\"10.42.2.36:6060\", __name__=\"mutex\", __profile_path__=\"/debug/pprof/mutex\", __scheme__=\"http\", app_kubernetes_io_name=\"delta\", app_kubernetes_io_part_of=\"example\", app_kubernetes_io_version=\"1.0.0\", instance=\"10.42.2.36:6060\", job=\"fgprof-pods\", linkerd_io_control_plane_ns=\"linkerd\", linkerd_io_proxy_deployment=\"delta\", linkerd_io_workload_ns=\"default\", namespace=\"default\", pod=\"delta-67b8764648-4gmxp\", pod_template_hash=\"67b8764648\"}" err="server returned HTTP status (404) 404 page not found"
phlare-0 phlare level=error caller=target.go:183 ts=2022-12-08T17:16:59.264856171Z msg="fetch profile failed" target="{__address__=\"10.42.0.35:6060\", __name__=\"process_cpu\", __profile_path__=\"/debug/pprof/profile\", __scheme__=\"http\", app_kubernetes_io_name=\"echo\", app_kubernetes_io_part_of=\"example\", app_kubernetes_io_version=\"1.0.0\", instance=\"10.42.0.35:6060\", job=\"fgprof-pods\", linkerd_io_control_plane_ns=\"linkerd\", linkerd_io_proxy_deployment=\"echo\", linkerd_io_workload_ns=\"default\", namespace=\"default\", pod=\"echo-c455d9bf4-qnw68\", pod_template_hash=\"c455d9bf4\"}" err="server returned HTTP status (404) 404 page not found"

I need to filter out requests to these containers.
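A filter of that kind could be sketched as a relabel rule on the container name. This assumes the injected containers are named `linkerd-proxy` and `linkerd-init`, and that the `__meta_kubernetes_pod_container_name` label from Prometheus-style pod discovery is available here. Whether the sidecar is really the source of the 404s is not confirmed in this thread.

```yaml
relabel_configs:
  # Assumption: linkerd injects containers named linkerd-proxy and linkerd-init.
  # Targets discovered for those containers are dropped before scraping.
  - source_labels: [__meta_kubernetes_pod_container_name]
    action: drop
    regex: linkerd-proxy|linkerd-init
```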

Generally it seems to work, but it produces a lot of error logs.
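The log lines above show the fgprof-pods job requesting `/debug/pprof/...` paths, which suggests that the default pprof profile types stay enabled alongside fgprof. If that is the case, disabling them explicitly for that job might reduce the errors. The key names below are taken from the `__name__` values in the log; `memory` is an assumption, and the whole fragment is an untested sketch.

```yaml
profiling_config:
  pprof_config:
    fgprof:
      path: /debug/fgprof
      delta: true
      enabled: true
    # Assumption: these keys name the default profile types and
    # accept `enabled: false` to switch them off for this job.
    memory:
      enabled: false
    block:
      enabled: false
    goroutine:
      enabled: false
    mutex:
      enabled: false
    process_cpu:
      enabled: false
```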

tina-junold avatar Dec 08 '22 17:12 tina-junold

Yeah, I think we need to improve discovery.

cyriltovena avatar Jan 02 '23 08:01 cyriltovena

This is still an issue today. Is anyone working on it?

chuck-bear avatar Apr 11 '24 20:04 chuck-bear

I think Pyroscope cannot scrape anymore.

Grafana Agent can:

https://grafana.com/docs/agent/latest/flow/reference/components/pyroscope.scrape/#profilefgprof-block

korniltsev avatar Apr 12 '24 05:04 korniltsev

I think this is now addressed in the doc here: https://grafana.com/docs/pyroscope/latest/configure-client/grafana-agent/go_pull/#prepare-the-collector-configuration-file

knylander-grafana avatar May 16 '24 02:05 knylander-grafana

Yeah, I made it work by using Grafana Agent.

chuck-bear avatar May 16 '24 14:05 chuck-bear