bazel-buildfarm

Tracing Slowness Downloading Cached Outputs

Open cjohnstoniv opened this issue 3 years ago • 1 comments

Are there any stats, or a way to enable stats, that report how long each build output file took to download? We see wildly varying times when running with a remote executor where all builds/tests come from the cache: downloading everything can take as little as 20 seconds or as much as 8 minutes. I was hoping to see whether the time goes to a few files or to something broader. We've checked our network usage, and it doesn't seem to be a bandwidth or network latency issue.

cjohnstoniv avatar May 25 '22 16:05 cjohnstoniv

For the client side, look at --experimental_remote_grpc_log. That will dump individual gRPC read latencies.
(You will need to use tools_remote to parse the data. We are adding JSON support to make it easier.)
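
Once the log has been parsed, ranking calls by duration is one way to tell whether a handful of reads or everything is slow. The sketch below is only an illustration: it assumes each parsed entry has already been turned into a dict with hypothetical keys `method_name`, `resource`, `start_time` and `end_time` (times in seconds). Those keys are assumptions, not the actual tools_remote output format, so adjust them to whatever the parsed log really contains.

```python
from collections import defaultdict


def slowest_calls(entries, top=5):
    """Return the `top` entries with the longest duration, slowest first.

    Each entry is assumed (hypothetically) to carry numeric `start_time`
    and `end_time` values in seconds.
    """
    timed = [
        {**entry, "seconds": entry["end_time"] - entry["start_time"]}
        for entry in entries
    ]
    return sorted(timed, key=lambda e: e["seconds"], reverse=True)[:top]


def total_by_method(entries):
    """Sum call durations per gRPC method name."""
    totals = defaultdict(float)
    for entry in entries:
        totals[entry["method_name"]] += entry["end_time"] - entry["start_time"]
    return dict(totals)


# Made-up sample data, purely to show the shape of the input.
entries = [
    {"method_name": "ByteStream/Read", "resource": "blobs/aaa/100",
     "start_time": 0.0, "end_time": 0.25},
    {"method_name": "ByteStream/Read", "resource": "blobs/bbb/2048",
     "start_time": 0.25, "end_time": 8.25},
    {"method_name": "ActionCache/GetActionResult", "resource": "",
     "start_time": 0.0, "end_time": 0.5},
]

for call in slowest_calls(entries, top=2):
    print(call["method_name"], call["resource"], call["seconds"])
print(total_by_method(entries))
```

If one or two reads dominate the ranking, the slowness is likely tied to specific blobs. If the time is spread evenly across many reads, the cause is more likely general (server load, connection setup, throttling).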

There is an opportunity to improve visibility for this on the server side. The closest metric I see related to this is io_bytes, which I don't think will help diagnose latency issues. Most of the time we look at system metrics from the deployment environment.

luxe avatar May 25 '22 16:05 luxe