Add metrics endpoint for buildkitd

Open jsravn opened this issue 5 years ago • 7 comments

It would be great if buildkitd exposed metrics that can be used to monitor it and debug issues.

Some useful metrics:

  • Build cache size
  • Build statistics per client

Jun 26 '20 13:06 jsravn

For per-client statistics, buildkitd could use the client name from the client TLS certificate, if one is used. Otherwise, maybe some metadata could be passed with the build request.

Jun 26 '20 13:06 jsravn

I was thinking about this more in the context of a Kubernetes cluster. One problem with registering metrics is that buildkit does not know the context (who is doing the build), so the metrics would have to be quite general. Would it be acceptable to add Kubernetes support, such that buildkitd could determine the calling pod's information from the peer IP on the gRPC request?
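The peer-IP lookup could look roughly like the sketch below. The `podIndex` type is hypothetical, and how it gets populated (for example from a Kubernetes watch on pods) is deliberately left out; the code only shows the address-to-pod resolution step with the standard library.

```go
package main

import (
	"fmt"
	"net"
)

// podIndex maps a pod IP to a "namespace/name" identity. Populating it is
// outside this sketch. (Hypothetical type.)
type podIndex map[string]string

// lookup resolves a peer address such as "10.0.3.7:41234" to a pod identity.
// The second return value is false when the address cannot be parsed or the
// IP is not in the index.
func (idx podIndex) lookup(peerAddr string) (string, bool) {
	host, _, err := net.SplitHostPort(peerAddr)
	if err != nil {
		return "", false
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return "", false
	}
	// Normalise the textual form so that equivalent spellings match.
	pod, ok := idx[ip.String()]
	return pod, ok
}

func main() {
	idx := podIndex{"10.0.3.7": "ci/build-agent-1"}
	fmt.Println(idx.lookup("10.0.3.7:41234"))
	fmt.Println(idx.lookup("10.0.9.9:41234"))
}
```

One caveat with this approach: pods that use host networking share the node's IP, so a peer IP does not always identify a single pod.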

Jul 20 '20 08:07 jsravn

+1

Dec 05 '22 10:12 alsterg

@jsravn is it correct to assume that you are implying Prometheus exposition format?

Feb 06 '23 15:02 errordeveloper

> @jsravn is it correct to assume that you are implying Prometheus exposition format?

Yes, or I suppose OpenMetrics these days.

Feb 07 '23 10:02 jsravn

@jsravn Hi. Have you managed to find a workaround for this issue?

Jun 18 '24 13:06 halvorstein

I will second this. Just having basic information about what the pod is doing would help. For example, if I have max-par set to, say, 10, how many are actually in use over time? Then I would know whether I am near capacity or vastly over-provisioned, which would help me size correctly. Cache information as well, and how often GC is running. Things like that would let me tune the pod and nodes to avoid waste and to avoid being the bottleneck.
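The "how many slots are in use" number could be tracked with a gauge next to the concurrency limit, roughly as sketched below. The `limiter` type and its methods are hypothetical and do not reflect how buildkitd enforces its parallelism setting.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// limiter caps concurrent work and tracks how many slots are currently in
// use, which is the value one would export as a gauge. (Hypothetical type.)
type limiter struct {
	slots chan struct{}
	inUse atomic.Int64
}

func newLimiter(max int) *limiter {
	return &limiter{slots: make(chan struct{}, max)}
}

// acquire blocks until a slot is free, then counts it as in use.
func (l *limiter) acquire() {
	l.slots <- struct{}{}
	l.inUse.Add(1)
}

// release returns a slot and decrements the in-use count.
func (l *limiter) release() {
	l.inUse.Add(-1)
	<-l.slots
}

// utilization reports the in-use fraction of the configured maximum.
func (l *limiter) utilization() float64 {
	return float64(l.inUse.Load()) / float64(cap(l.slots))
}

func main() {
	l := newLimiter(10)
	for i := 0; i < 3; i++ {
		l.acquire()
	}
	fmt.Printf("in use: %d of %d (%.0f%%)\n", l.inUse.Load(), cap(l.slots), l.utilization()*100)
	l.release()
	fmt.Printf("in use: %d of %d\n", l.inUse.Load(), cap(l.slots))
}
```

Scraping the in-use value over time would show whether the configured limit is routinely reached or rarely approached.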

May 21 '25 21:05 RandellP

We would have liked some metrics / logs that could have caught the issue in https://github.com/moby/buildkit/issues/6131

We would like to catch those types of issues, before our users contact us about them.
("users" being developers in the company that use the company buildkit server we host)

Something like "average build runtime" or something. idk. Not an observability expert. Not sure which kinda metric would be best. But something we could scrape with Prometheus would be nice.

Aug 12 '25 13:08 MalteMagnussen

@tonistiigi I see you mentioned on the Docker Community Slack that there is a metrics endpoint in https://github.com/moby/buildkit/blob/master/cmd/buildkitd/debug.go#L49

Is there a doc on how to use it with Prometheus? Does it require --debugaddr to be activated, as explained here?

Sep 11 '25 08:09 remidebette

I would really like this enhancement!

Oct 17 '25 13:10 Lukas-solar