bazel
bazel copied to clipboard
5.0rc1: Remote cache read failure "java.lang.RuntimeException: java.io.IOException: file.tmp (Not a directory)"
Description of the problem / feature request:
We encountered an error where bazel fails to read from the remote cache because it throws an execution locally. Seems to be from this code path: https://github.com/bazelbuild/bazel/commit/cff2ea55dafc57d7d2db813522942e2ee137ed08#diff-c0c72bba110eca25447686392b1f265feabe77c8a53ed7387f70f82bcdaba441R112.
WARNING: Reading from Remote Cache:
--
| com.google.devtools.build.lib.remote.common.BulkTransferException: io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
| at com.google.devtools.build.lib.remote.RemoteCache.waitForBulkTransfer(RemoteCache.java:195)
| at com.google.devtools.build.lib.remote.RemoteExecutionService.downloadOutputs(RemoteExecutionService.java:971)
| at com.google.devtools.build.lib.remote.RemoteSpawnCache.lookup(RemoteSpawnCache.java:124)
| at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:141)
| at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:108)
| at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:47)
| at com.google.devtools.build.lib.exec.SpawnStrategyResolver.beginExecution(SpawnStrategyResolver.java:68)
| at com.google.devtools.build.lib.analysis.actions.SpawnAction.beginExecution(SpawnAction.java:328)
| at com.google.devtools.build.lib.actions.Action.execute(Action.java:134)
| at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$5.execute(SkyframeActionExecutor.java:909)
| at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1078)
| at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1033)
| at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:152)
| at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:91)
| at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:496)
| at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:856)
| at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:349)
| at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:169)
| at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:590)
| at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
| at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
| at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
| at java.base/java.lang.Thread.run(Unknown Source)
| Suppressed: java.io.IOException: io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
| at com.google.devtools.build.lib.remote.GrpcCacheClient.lambda$downloadBlob$13(GrpcCacheClient.java:320)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:192)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:179)
| at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:124)
| at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
| at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
| at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
| at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:814)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:203)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:179)
| at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:133)
| at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
| at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
| at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
| at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:814)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:203)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:179)
| at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:133)
| at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
| at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
| at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
| at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771)
| at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:53)
| at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onError(GrpcCacheClient.java:376)
| at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478)
| at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
| at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
| at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
| at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
| at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
| at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
| at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
| at com.google.devtools.build.lib.remote.ReferenceCountedChannel$ConnectionCleanupCall$1.onClose(ReferenceCountedChannel.java:90)
| at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
| at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
| at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
| at com.google.devtools.build.lib.remote.logging.LoggingInterceptor$LoggingForwardingCall$1.onClose(LoggingInterceptor.java:157)
| at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:557)
| at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:69)
| at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:738)
| at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:717)
| at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
| at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
| ... 3 more
| Suppressed: java.io.IOException: /private/var/tmp/_bazel_iosci/7c81b0540945e51fddc7bd01ee8d0abb/execroot/lyftios/bazel-out/darwin-fastbuild/bin/tools/CodeAnalysis/Tests/LinuxMain.swift.tmp (Not a directory)
| at com.google.devtools.build.lib.unix.NativePosixFiles.openWrite(Native Method)
| at com.google.devtools.build.lib.unix.UnixFileSystem.createFileOutputStream(UnixFileSystem.java:493)
| at com.google.devtools.build.lib.vfs.AbstractFileSystem.getOutputStream(AbstractFileSystem.java:174)
| at com.google.devtools.build.lib.vfs.AbstractFileSystem.getOutputStream(AbstractFileSystem.java:188)
| at com.google.devtools.build.lib.vfs.Path.getOutputStream(Path.java:425)
| at com.google.devtools.build.lib.vfs.Path.getOutputStream(Path.java:413)
| at com.google.devtools.build.lib.remote.common.LazyFileOutputStream.ensureOpen(LazyFileOutputStream.java:66)
| at com.google.devtools.build.lib.remote.common.LazyFileOutputStream.close(LazyFileOutputStream.java:60)
| at com.google.devtools.build.lib.remote.RemoteCache$ReportingOutputStream.close(RemoteCache.java:527)
| at com.google.devtools.build.lib.remote.RemoteCache$3.onFailure(RemoteCache.java:400)
| at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1066)
| at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
| at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
| at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
| at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:814)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:203)
| at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:179)
| at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:133)
| ... 42 more
| Caused by: io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
| at io.grpc.Status.asRuntimeException(Status.java:535)
| ... 22 more
| Caused by: java.lang.RuntimeException: java.io.IOException: /private/var/tmp/_bazel_iosci/7c81b0540945e51fddc7bd01ee8d0abb/execroot/lyftios/bazel-out/darwin-fastbuild/bin/tools/CodeAnalysis/Tests/LinuxMain.swift.tmp (Not a directory)
| at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onNext(GrpcCacheClient.java:356)
| at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onNext(GrpcCacheClient.java:347)
| at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:465)
| at io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
| at io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
| at io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33)
| at com.google.devtools.build.lib.remote.logging.LoggingInterceptor$LoggingForwardingCall$1.onMessage(LoggingInterceptor.java:138)
| at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:656)
| at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:641)
| ... 5 more
| Caused by: java.io.IOException: /private/var/tmp/_bazel_iosci/7c81b0540945e51fddc7bd01ee8d0abb/execroot/lyftios/bazel-out/darwin-fastbuild/bin/tools/CodeAnalysis/Tests/LinuxMain.swift.tmp (Not a directory)
| at com.google.devtools.build.lib.unix.NativePosixFiles.openWrite(Native Method)
| at com.google.devtools.build.lib.unix.UnixFileSystem.createFileOutputStream(UnixFileSystem.java:493)
| at com.google.devtools.build.lib.vfs.AbstractFileSystem.getOutputStream(AbstractFileSystem.java:174)
| at com.google.devtools.build.lib.vfs.AbstractFileSystem.getOutputStream(AbstractFileSystem.java:188)
| at com.google.devtools.build.lib.vfs.Path.getOutputStream(Path.java:425)
| at com.google.devtools.build.lib.vfs.Path.getOutputStream(Path.java:413)
| at com.google.devtools.build.lib.remote.common.LazyFileOutputStream.ensureOpen(LazyFileOutputStream.java:66)
| at com.google.devtools.build.lib.remote.common.LazyFileOutputStream.write(LazyFileOutputStream.java:42)
| at com.google.devtools.build.lib.remote.RemoteCache$ReportingOutputStream.write(RemoteCache.java:510)
| at com.google.devtools.build.lib.remote.util.DigestOutputStream.write(DigestOutputStream.java:58)
| at java.base/java.io.FilterOutputStream.write(Unknown Source)
| at com.google.protobuf.ByteString$LiteralByteString.writeTo(ByteString.java:1381)
| at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onNext(GrpcCacheClient.java:352)
| ... 13 more
LinuxMain.swift
is single file output from a genrule
.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I don't know of one.
What operating system are you running Bazel on?
macOS 11.6.1
What's the output of bazel info release
?
release 5.0.0rc1
According to the stack trace:
java.io.IOException: /private/var/tmp/_bazel_iosci/7c81b0540945e51fddc7bd01ee8d0abb/execroot/lyftios/bazel-out/darwin-fastbuild/bin/tools/CodeAnalysis/Tests/LinuxMain.swift.tmp (Not a directory)
| at com.google.devtools.build.lib.unix.NativePosixFiles.openWrite(Native Method)
| at com.google.devtools.build.lib.unix.UnixFileSystem.createFileOutputStream(UnixFileSystem.java:493)
| at com.google.devtools.build.lib.vfs.AbstractFileSystem.getOutputStream(AbstractFileSystem.java:174)
| at com.google.devtools.build.lib.vfs.AbstractFileSystem.getOutputStream(AbstractFileSystem.java:188)
| at com.google.devtools.build.lib.vfs.Path.getOutputStream(Path.java:425)
| at com.google.devtools.build.lib.vfs.Path.getOutputStream(Path.java:413)
| at com.google.devtools.build.lib.remote.common.LazyFileOutputStream.ensureOpen(LazyFileOutputStream.java:66)
| at com.google.devtools.build.lib.remote.common.LazyFileOutputStream.write(LazyFileOutputStream.java:42)
| at com.google.devtools.build.lib.remote.RemoteCache$ReportingOutputStream.write(RemoteCache.java:510)
| at com.google.devtools.build.lib.remote.util.DigestOutputStream.write(DigestOutputStream.java:58)
| at java.base/java.io.FilterOutputStream.write(Unknown Source)
| at com.google.protobuf.ByteString$LiteralByteString.writeTo(ByteString.java:1381)
| at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onNext(GrpcCacheClient.java:352)
| ... 13 more
it calls into NativePosixFiles.openWrite() which calls open
which throws "Not a directory" or ENOTDIR
.
According to the doc, ENOTDIR
means "A component of the path prefix is not a directory". So I am guessing the issue is the output directory is not prepared probably for the action.
Might related https://github.com/bazelbuild/bazel/issues/6393.
@brentleyjones Is that possible to create a repro?
It's weird, the parent directories are actually created before the download.
It was on CI. All I can think of is the directories failed to be created, because of low disk space.
We're seeing what I think is the same issue with bazel 5.1.1:
[1,147 / 1,149] Compiling test/extensions/network/dns_resolver/apple/apple_dns_impl_test.cc; 1s remote-cache
ERROR: /Users/runner/work/1/s/test/extensions/network/dns_resolver/apple/BUILD:11:14: Compiling test/extensions/network/dns_resolver/apple/apple_dns_impl_test.cc failed: Exec failed due to IOException: 41 errors during bulk transfer:
java.io.IOException: io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
java.io.IOException: io.grpc.StatusRuntimeException: CANCELLED: Failed to read message.
Unfortunately I think we're hiding the full stack trace https://dev.azure.com/cncf/envoy/_build/results?buildId=106443&view=logs&jobId=fa3d3e18-6969-5713-c3e7-3581195704fd&j=fa3d3e18-6969-5713-c3e7-3581195704fd&t=ba370bef-ca49-52f5-0c93-7c0c0f27c465 (this log will probably rotate)
Hit this bug today when trying to use
build --strategy=GoLink=linux-sandbox,remote
build --strategy=GoCompile=linux-sandbox,remote
so that I could do GoLinking locally while doing the rest of my build on RBE.
Not sure if that information is helpful
A lot of things have changed since 5. Please reopen if it's still happening.