bazel-buildfarm icon indicating copy to clipboard operation
bazel-buildfarm copied to clipboard

Enable prometheus logging for buildfarm:worker read on client cancel

Open amishra-u opened this issue 2 years ago • 2 comments

Problem

Currently we don't see any metrics with status cancelled for buildfarm:worker read. This issue arises because the onCancelHandler does not invoke the responseObserver.onError method. As a result, the reportEndMetrics function, responsible for logging prometheus metrics, is not triggered.

Solution

Added code to ensure that the responseObserver.onError method is called when the client initiates a cancellation request. Additionally, added a check in the ServerStreamCallObserver implementation to avoid the call already closed error, which could occur if the buildfarm:worker has already called onComplete or onError (realistically onComplete) by the time the cancellation request from the client (buildfarm:server) arrives.

amishra-u avatar Aug 06 '23 01:08 amishra-u

@werkt Can you please take a look. This is minor update with no functionality change.

amishra-u avatar Aug 07 '23 17:08 amishra-u

This is not a change I'm interested in making. You're accommodating an interceptor for its behavior by changing the usage, inducing a registration with a loop back through an already cancelled handle, and the affected domain for this is the single getBlob, ignoring all the other calls where this takes place.

This is a problem with the prometheus interceptor, I recommend you make the change there - it should handle cancels by presenting metrics.

Thanks for feedback, will make the required changes.

amishra-u avatar Aug 10 '23 07:08 amishra-u