remote-apis-sdks
Implement the cache miss retry loop
This refers to the case where Execute returns NOT_FOUND because of a cache miss. This is not supposed to happen often (in fact, RBE work is under way to make sure it does not happen at all), but it can happen with other backends. In this case, we need to re-upload the missing inputs and retry the whole flow.
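A minimal sketch of such a retry loop, assuming the upload and execute steps can be passed in as plain functions. `ErrCacheMiss`, `ExecuteWithRetry`, and the `maxAttempts` parameter are hypothetical names used for illustration only; they are not part of the remote-apis-sdks API, and a real client would inspect the gRPC status code rather than a sentinel error.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCacheMiss is a hypothetical sentinel standing in for a NOT_FOUND
// status returned by Execute when inputs are missing from the CAS.
var ErrCacheMiss = errors.New("not found: inputs missing from cache")

// ExecuteWithRetry runs the whole flow (upload inputs, then execute) and
// repeats it when execute reports a cache miss. Any other error is
// returned immediately. Both callbacks are assumptions of this sketch.
func ExecuteWithRetry(upload, execute func() error, maxAttempts int) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		// Re-upload on every attempt: the upload step is expected to
		// send only the blobs the server reports as missing.
		if err = upload(); err != nil {
			return fmt.Errorf("upload (attempt %d): %w", attempt, err)
		}
		err = execute()
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrCacheMiss) {
			return err // not a cache miss, so retrying will not help
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}
```

Bounding the number of attempts keeps the client from looping forever against a backend that keeps evicting the inputs.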
Client: Android RBE 0.57.0. Server: Buildfarm 2.4.0. I'm using Android RBE 0.57.0 as the client; judging from the error log, I guess it was built from a remote-apis-sdks commit from around March 1, 2022. I had a similar problem: one of the worker pods was faulty, so I had to delete it and re-create it with an empty cache directory. However, the ContentAddressableStorage and ActionCache data in Redis were not deleted. Now the Android RBE client gets stuck when it uses the Buildfarm server for a remote build, and the client error log is:
cas.go:1399] Error downloading {blob file hash}/{blob file size}: rpc error: code = NotFound desc = No workers found.
cas.go:1408] Internal tool error - matching map entry
I found that this could be caused by the client not handling the case where the download interface returns a NOT_FOUND status.
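One possible reading of this suggestion, sketched under stated assumptions: the download step records blobs that come back as not found in a separate list instead of treating them as an internal error, so the caller can decide to retry the flow. `ErrBlobNotFound`, `DownloadAll`, and the `fetch` callback are hypothetical names for illustration; this is not the actual code in `cas.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBlobNotFound is a hypothetical sentinel standing in for a gRPC
// NotFound status returned while downloading a blob.
var ErrBlobNotFound = errors.New("blob not found")

// DownloadAll fetches each digest with the supplied fetch callback.
// Blobs reported as not found are collected in missing rather than
// aborting the batch; any other error stops the download.
func DownloadAll(digests []string, fetch func(digest string) ([]byte, error)) (blobs map[string][]byte, missing []string, err error) {
	blobs = make(map[string][]byte, len(digests))
	for _, d := range digests {
		data, ferr := fetch(d)
		switch {
		case ferr == nil:
			blobs[d] = data
		case errors.Is(ferr, ErrBlobNotFound):
			// Treat as a cache miss that the caller can recover from.
			missing = append(missing, d)
		default:
			return nil, nil, fmt.Errorf("downloading %s: %w", d, ferr)
		}
	}
	return blobs, missing, nil
}
```

A caller that receives a non-empty `missing` list could then rerun the action or the whole flow instead of hanging.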