remote-apis-sdks

Implement the cache miss retry loop

Open ola-rozenfeld opened this issue 5 years ago • 1 comments

This refers to the case where Execute returns NOT_FOUND because of a cache miss. This is not supposed to happen often (in fact, RBE work is under way to make sure it does not happen at all), but it can happen with other backends. In this case, we need to re-upload the missing inputs and retry the whole flow.

ola-rozenfeld, Jul 15 '19 13:07

Client: android rbe 0.57.0; server: buildfarm 2.4.0. I'm using android rbe 0.57.0 as the client, and judging from the error log I guess it uses a remote-apis-sdks commit from around March 1, 2022. I had a similar problem: one of the worker pods was faulty, so I had to delete the pod and re-create it with an empty cache dir. However, the ContentAddressableStorage and ActionCache data in Redis were not deleted. After that, the android rbe client gets stuck when it uses the buildfarm server for a remote build, and the client error log is:

cas.go:1399] Error downloading {blob file hash}/{blob file size}: rpc error: code = NotFound desc = No workers found.
cas.go:1408] Internal tool error - matching map entry

I think this could be caused by the download path not handling a NOT_FOUND status.

DarkMatterV, Oct 18 '23 02:10