remote-apis icon indicating copy to clipboard operation
remote-apis copied to clipboard

GetActionResult non-error cache miss response

Open werkt opened this issue 6 years ago • 3 comments

The error classification response of NOT_FOUND for GetActionResult cache misses means that some response observers, including those of the circuit breaker implementation available but not currently in use by bazel, cannot use success vs. failure as a signal for the reliability of remote availability. Since these requests connote a completely successful round trip through the service to serve an application-level meaningful response, to be lumped in with all other errors, some of which can indicate any level of failure along the communication hierarchy, is an incorrect presentation of the nature of the cache miss.

I suggest that the response should be wrapped in a GetActionResultResponse, with the present (hasActionResult() == false) interpretation via protobuf available as the proper means of determining a cache miss, rather than the RESTful error response of NOT_FOUND.

werkt avatar Feb 19 '19 15:02 werkt

FWIW, if we change this we should also change GetTree. I'll note for posterity that Execute may also return DEADLINE_EXCEEDED for "the execution is still going, but this RPC needs to be closed and re-established", but I don't think that one needs to change.

My 2c is I'm generally ambivalent - I do like the semantic change of not using RPC-level codes for expected responses, but it is a (minor) departure from RESTful semantics. Though then again, we don't have a CreateActionResult API either, which seems to imply that no creation is needed, fitting the idea that everything can be gotten? So maybe it's not really more of a departure from expected semantics than present, in which case I'm supportive.

EricBurnett avatar Feb 19 '19 15:02 EricBurnett

I agree, the only reason we did it this way is to be more RESTful, and it requires annoying exceptions on many levels of coding. My only nit is that with the new proposal there is a big semantic difference between a missing field value and a default field value, which sounds a bit dangerous to me (someone not using hasResult and just going for getResult might think that the action had a 0 exit code and no outputs). It also depends on the language. So maybe add a boolean cache_miss just in case, so that the default path doesn't waste space?

ola-rozenfeld avatar Feb 19 '19 16:02 ola-rozenfeld

I should have added that my chosen method of interpretation is predicated on the guaranteed differentiation from the default case. If that's not guaranteeable across protobuf implementations, then the additional field is justified.

werkt avatar Feb 19 '19 16:02 werkt