Lost inputs no longer available remotely
Description of the bug:
When testing a new Bazel commit, every build seems to fail with:
ERROR: /.../BUILD.bazel:38:10: Testing //...test failed: Lost inputs no longer available remotely: external/+llvm_configure+llvm-project/llvm/FileCheck (187249f0c3de1dd6dcd0f21e973db5150fd91fcf2ecc4f12d049300568b24c08/3075112), external/+llvm_configure+llvm-project/llvm/not (fb014eac066ab3da3e5afe4dad94abd7b52ded4a40d2add2347a178b968ac2dd/2586928)
This is with remote caching, remote execution, the disk cache, and dynamic execution all enabled. Bazel does automatically retry in this case, but it hits the error consistently enough that the build still fails after the retries.
My last known-good baseline was 2f67489da598c37d45c9b2b8601740d6e5a34851.
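For anyone triaging a similar failure, here is a minimal sketch for pulling the lost input paths and digests out of the error line so they can be checked against a cache. It assumes the `path (hash/size)` layout seen in the message above; `parse_lost_inputs` and `LostInput` are hypothetical names for illustration, not part of Bazel.

```python
import re
from typing import List, NamedTuple


class LostInput(NamedTuple):
    path: str
    digest_hash: str
    size_bytes: int


# Assumption: each lost input is printed as "<path> (<64-hex-char hash>/<size>)",
# as in the error message quoted in this report.
_LOST_INPUT_RE = re.compile(r"(\S+) \(([0-9a-f]{64})/(\d+)\)")


def parse_lost_inputs(message: str) -> List[LostInput]:
    """Return every (path, hash, size) triple found in an error message."""
    return [
        LostInput(path, digest_hash, int(size))
        for path, digest_hash, size in _LOST_INPUT_RE.findall(message)
    ]
```

Calling `parse_lost_inputs` on the error line above should yield two entries, one for `FileCheck` and one for `not`.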
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No response
Which operating system are you running Bazel on?
No response
What is the output of bazel info release?
db2f824461bf326ad545676f3fdf32905dc8b26a
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD ?
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response
I also saw:
ERROR: io.reactivex.rxjava3.exceptions.UndeliverableException: The exception could not be delivered to the consumer because it has already canceled/disposed the flow or the exception has nowhere to go to begin with. Further reading: https://github.com/ReactiveX/RxJava/wiki/What's-different-in-2.0#error-handling | io.grpc.StatusRuntimeException: NOT_FOUND: ActionResult (hash:"1bf94372398c38a6b4150d65de4ad7f3fdcc9bdc43c64bfab67ff39bce3b68f3" size_bytes:337) not found: rpc error: code = NotFound desc = Exhausted all peers attempting to read "1bf94372398c38a6b4150d65de4ad7f3fdcc9bdc43c64bfab67ff39bce3b68f3".
at io.reactivex.rxjava3.plugins.RxJavaPlugins.onError(RxJavaPlugins.java:372)
at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onError(SingleCreate.java:82)
at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onFailure(RxFutures.java:172)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1125)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:651)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:621)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:565)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:733)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:714)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: ActionResult (hash:"1bf94372398c38a6b4150d65de4ad7f3fdcc9bdc43c64bfab67ff39bce3b68f3" size_bytes:337) not found: rpc error: code = NotFound desc = Exhausted all peers attempting to read "1bf94372398c38a6b4150d65de4ad7f3fdcc9bdc43c64bfab67ff39bce3b68f3".
at io.grpc.Status.asRuntimeException(Status.java:532)
... 13 more
Maybe this was all transient. I'm having trouble reproducing it again, even after clean expunges, server restarts, and changes that should invalidate these artifacts.
Some notes from others who helped debug offline: my local disk cache was 109 GB even though I was passing --experimental_disk_cache_gc_max_size=100G, so it's possible this was caused by disk cache evictions rather than remote cache evictions.
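To check whether a disk cache directory is over the configured limit, something like the following sketch may help. It only walks the directory and sums file sizes; it does not reproduce Bazel's own accounting. The helper names are hypothetical, and the suffix handling (binary multiples for K/M/G/T) is an assumption about how a value such as `100G` is interpreted.

```python
import os

# Assumption: suffixes are binary multiples; Bazel's own parsing may differ.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size such as '100G' (or a plain byte count) to bytes."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file may have been removed concurrently; skip it.
                pass
    return total


def exceeds_limit(root: str, limit: str) -> bool:
    """Report whether the directory is larger than the given limit."""
    return dir_size_bytes(root) > parse_size(limit)
```

For example, `exceeds_limit("/path/to/disk_cache", "100G")` (with the path replaced by the actual `--disk_cache` location) would indicate whether the cache is currently larger than the limit passed to the flag.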
FYI @tjgq: I'll look into this. It may be caused by the persistent subtree cache in the MTC.