bazel-buildfarm

Exception thrown when running worker on another machine

Open aghoussaini opened this issue 3 years ago • 1 comments

Description of the problem: I ran the latest version of the Buildfarm server locally on my machine and started a worker on a second machine. Both machines run Windows. The worker machine logs the following error:

Jul 27, 2022 11:22:48 AM build.buildfarm.worker.PipelineStage run
SEVERE: MatchStage::run(): stage terminated due to exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262)
    at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243)
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156)
    at build.buildfarm.v1test.OperationQueueGrpc$OperationQueueBlockingStub.take(OperationQueueGrpc.java:316)
    at build.buildfarm.instance.stub.StubInstance.match(StubInstance.java:724)
    at build.buildfarm.worker.memory.OperationQueueClient.match(OperationQueueClient.java:119)
    at build.buildfarm.worker.memory.OperationQueueWorkerContext.match(OperationQueueWorkerContext.java:176)
    at build.buildfarm.worker.MatchStage.iterate(MatchStage.java:142)
    at build.buildfarm.worker.PipelineStage.runInterruptible(PipelineStage.java:44)
    at build.buildfarm.worker.PipelineStage.run(PipelineStage.java:51)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection timed out: no further information: /some_ip_address
Caused by: java.net.ConnectException: Connection timed out: no further information
    at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:834)

Jul 27, 2022 11:22:50 AM build.buildfarm.worker.Pipeline join
INFO: Interrupting unterminated closed thread in stage InputFetchStage at priority 3
Disconnected from the target VM, address: 'localhost:5005', transport: 'socket'
Jul 27, 2022 11:22:52 AM build.buildfarm.worker.Pipeline join
INFO: Interrupting unterminated closed thread in stage ReportResultStage at priority 1
Jul 27, 2022 11:22:52 AM build.buildfarm.worker.memory.Worker stop
INFO: Closing the pipeline

At first, I suspected that my custom implementation of a Buildfarm client was causing the problem. However, the error persisted when I ran a simple Hello World program remotely using Bazel. The error appears only when the worker runs on a different machine from the server.

aghoussaini avatar Jul 27 '22 08:07 aghoussaini

Your server is not reachable from the worker. The name or IP address of the server specified in the worker config must be resolvable and routable from the worker machine, and the port must match between the server config and the worker config that targets it.
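
One way to confirm this independently of Buildfarm is to attempt a plain TCP connection from the worker machine to the server's address and port. The following is a minimal sketch, not part of Buildfarm: the class name `ConnectivityCheck`, the default host `localhost`, and the default port `8980` are assumptions for illustration and should be replaced with the values from your own server and worker config.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Buildfarm: checks raw TCP reachability only.
class ConnectivityCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // InetSocketAddress resolves the host name; if resolution failed,
            // connect() throws UnknownHostException, which is an IOException.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable names.
            System.err.println("Cannot reach " + host + ":" + port + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the server host and port from your config instead.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8980;
        boolean ok = canConnect(host, port, 5000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

Run it on the worker machine with the server's host and port as arguments. If it reports a timeout, as in the stack trace above, the likely places to look are firewall rules on the server machine, the address the server is listening on, and routing between the two machines, rather than Buildfarm itself.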

werkt avatar Aug 11 '22 14:08 werkt