bazel-buildfarm
bazel-buildfarm copied to clipboard
Enable prometheus logging for buildfarm:worker read on client cancel
Problem
Currently we don't see any metrics with status cancelled for buildfarm:worker read. This issue arises because the onCancelHandler does not invoke the responseObserver.onError method. As a result, the reportEndMetrics function, responsible for logging prometheus metrics, is not triggered.
Solution
Added code to ensure that the responseObserver.onError method is called when the client initiates a cancellation request. Additionally, added a check in the ServerStreamCallObserver implementation to avoid the call already closed error, which could occur if the buildfarm:worker has already called onComplete or onError (realistically onComplete) by the time the cancellation request from the client (buildfarm:server) arrives.
@werkt Can you please take a look. This is minor update with no functionality change.
This is not a change I'm interested in making. You're accommodating an interceptor for its behavior by changing the usage, inducing a registration with a loop back through an already cancelled handle, and the affected domain for this is the single getBlob, ignoring all the other calls where this takes place.
This is a problem with the prometheus interceptor, I recommend you make the change there - it should handle cancels by presenting metrics.
Thanks for feedback, will make the required changes.