Thick client side load balancing without using load balancer
What is the best practice to use client-side load balancing? I read through this thread - https://github.com/grpc/grpc-java/issues/428 and various options provided. I found one example - https://github.com/grpc/grpc-java/blob/master/examples/src/main/java/io/grpc/examples/customloadbalance/CustomLoadBalanceClient.java
But I don't think it's dynamically updating the server endpoints based on scale up/down.
I'm also trying to follow https://github.com/sercand/kuberesolver, which uses the Kubernetes API to watch for IPs and update the endpoints for round-robin load balancing, with the server exposed as a headless service.
I found other blogs with different approaches, e.g. https://medium.com/jamf-engineering/how-three-lines-of-configuration-solved-our-grpc-scaling-issues-in-kubernetes-ca1ff13f7f06
But if I close the gRPC connection I do not benefit from the long-lived connection provided by gRPC.
Let me know your thoughts, what is the preferred approach for client-side load balancing?
For k8s, assuming you want L7 load balancing, it is normal to use a headless service with the round_robin load balancer. That can be done by calling channelBuilder.defaultLoadBalancingPolicy("round_robin").
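A minimal sketch of that client setup, assuming a headless Service reachable as "headless-service" on port 50051 (both are placeholders):
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// "dns:///" makes the DNS resolver return every pod IP behind the headless service.
ManagedChannel channel = ManagedChannelBuilder
    .forTarget("dns:///headless-service:50051")
    .defaultLoadBalancingPolicy("round_robin") // spread RPCs across all resolved addresses
    .usePlaintext()                            // assuming plaintext inside the cluster
    .build();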
When using DNS to resolve addresses, yes, you will want to configure serverBuilder.maxConnectionAge() on your server to occasionally cycle connections so that the client re-resolves DNS addresses. I'd suggest using an age of some number of minutes.
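For example, a sketch of the server side (the 5-minute age, 30-second grace, and MyServiceImpl are illustrative; the shaded package io.grpc.netty.shaded.io.grpc.netty works the same way):
import java.util.concurrent.TimeUnit;
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;

Server server = NettyServerBuilder.forPort(50051)
    .addService(new MyServiceImpl())             // placeholder service implementation
    .maxConnectionAge(5, TimeUnit.MINUTES)       // cycle connections so clients re-resolve DNS
    .maxConnectionAgeGrace(30, TimeUnit.SECONDS) // let in-flight RPCs finish after the GOAWAY
    .build()
    .start();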
(That approach would work well for L4 load balancing as well, but you'd use pick_first with shuffleAddressList enabled.)
As a slightly more advanced alternative, using the k8s watch API can work well, and with that approach you don't need to use max connection age. We don't have a built-in implementation of the k8s watch, but there are examples floating around. The NameResolver API is experimental and we know we will change it in some ways before marking it stable, but we work to make such changes easy to absorb, as we know people are using the API.
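For reference, a rough skeleton of what such a custom resolver looks like (this is not a working Kubernetes watcher; the watch wiring is only a comment, the class name is made up, and it targets the experimental NameResolver API mentioned above):
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.List;
import java.util.stream.Collectors;
import io.grpc.EquivalentAddressGroup;
import io.grpc.NameResolver;

class KubeWatchNameResolver extends NameResolver {
  private final String authority;
  private Listener2 listener;

  KubeWatchNameResolver(URI targetUri) {
    this.authority = targetUri.getAuthority();
  }

  @Override public String getServiceAuthority() { return authority; }

  @Override public void start(Listener2 listener) {
    this.listener = listener;
    // Start a Kubernetes Endpoints/EndpointSlice watch here and call
    // onAddresses(...) every time the set of pod IPs changes.
  }

  private void onAddresses(List<InetSocketAddress> podAddresses) {
    List<EquivalentAddressGroup> servers = podAddresses.stream()
        .map(EquivalentAddressGroup::new)
        .collect(Collectors.toList());
    listener.onResult(ResolutionResult.newBuilder().setAddresses(servers).build());
  }

  @Override public void refresh() { /* optionally re-list the endpoints */ }

  @Override public void shutdown() { /* stop the watch */ }
}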
But if I close the gRPC connection I do not benefit from the long-lived connection provided by gRPC.
This isn't really a problem in practice. The amortized cost of the connection is pretty low, as long as you don't get very aggressive on the max connection age.
Thanks for the recommendation; we are trying this out and will update here.
Seems like this is resolved. If you end up having trouble, comment, and it can be reopened.
@ejona86 one more thing I wanted to check with you: when we are using round robin and a headless service, the client will refer to the server as DNS:///headless-service:
@archit-harness, those are generally equivalent. gRPC detects headless-service: is incomplete and prefixes it with "dns:///" (the default for most systems). The "canonical" form is "dns:///headless-service" (with or without port). No scheme prefix is a short form.
(In the olden days we only supported host:port, but when we added name resolver support which used the scheme we tried to detect if it was old-form and convert it into new-form. But the old-form is also useful as a shorthand.)
@ejona86 as per this blog - https://itnext.io/grpc-name-resolution-load-balancing-everything-you-need-to-know-and-probably-a-bit-more-77fc0ae9cd6c - it says the default scheme used is passthrough. The definition: "Passthrough (default): Just returns the target provided by the ClientConn without any specific logic."
So I didn't get it, since DNS is specifically mentioned as a separate scheme; I wanted to confirm whether both will return the same results?
That is grpc-go-specific. Go (against my recommendations) didn't use the new form (there were some technical issues, but in my mind they had easy solutions). Although Go now does do what I mentioned if using grpc.NewClient() instead of grpc.Dial() (deprecated). That is a very recent development.
@ejona86 thanks for the clarification, so even without the DNS prefix it will resolve the same. 👍
Hi @ejona86, we implemented the changes and it looks like load balancing is working fine. But as mentioned here - https://medium.com/jamf-engineering/how-three-lines-of-configuration-solved-our-grpc-scaling-issues-in-kubernetes-ca1ff13f7f06
We are facing gRPC UNAVAILABLE errors during rolling updates.
But I don't get one thing - we have preStop hooks on our pods, which ensure a pod stays live for at least 60 sec, and as per the blog, DNS caches refresh after 30 sec. We are still seeing those errors. I'm not sure how minReadySeconds will help mitigate the issue, since as I understand it the issue happens if DNS returns the IP of an old pod that has died, which shouldn't happen in our case if the time period is 30 sec.
Is there any case that I am missing?
Also, the errors are of different types:
a) Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: pipeline-service-headless/<IP>:
b) Exception occurred: StatusRuntimeException: UNAVAILABLE: io exception; Cause: NativeIoException: recvAddress(..) failed: Connection timed out
I had given a few options. Are you using pick-first or round-robin?
"Connection timed out". This means you aren't likely doing a graceful shutdown. Normal processing for PreStop hook would be to do something like:
// Stop accepting new RPCs and new connections
server.shutdown();
// Wait for already-created RPCs to complete. Returns as soon as all RPCs complete
server.awaitTermination(30, TimeUnit.SECONDS);
// Now kill any outstanding RPCs to allow the server to shut down
server.shutdownNow();
// And give it a little bit of time to process. This should return pretty quickly
server.awaitTermination(5, TimeUnit.SECONDS);
The client will start reconnecting as soon as server.shutdown() is called.
In that medium post when it uses .spec.minReadySeconds = 30, the new pod comes up and clients will already start receiving the new DNS results when the old pod begins its shut down. There's a race between resolving new DNS results and reconnecting; reconnecting might first use the old DNS addresses (and then try again once the updated addresses are known). That medium post may be avoiding that issue since MAX_CONNECTION_AGE=30s matches minReadySeconds, so when an old pod begins shutting down all clients already have the new pod's IP address.
Hi @ejona86, I am using round robin.
So I will try shutting down the server gracefully, which I am not doing currently; that could be a potential reason for it.
Also, what do you think is the recommended approach? Do you think we should use minReadySeconds, or rely on correct, graceful shutdown of RPCs?
Also, should we look at anything else to add to handle other edge cases?
Wrote up some tests... right now, it seems like if retry is set, UNAVAILABLE responses retry against a different host when using load balancing. However, if we don't have retry set (and I use an intentional UNAVAILABLE status code on the server), the requests fail. If we set a default retry policy to retry on UNAVAILABLE, I'm seeing via tests that the load balancing correctly re-routes and retries on a different host. It's been hard to simulate a server disconnect in an integration test without some magic, but I've kinda tested the situation. I've got a server that sets
responseObserver.onError(Status.UNAVAILABLE.withDescription("We are a failing server").asRuntimeException());
When "told to fail" as part of the response processor to try to simulate the "UNAVAILABLE" state. It's not as good as an iptables rule to drop the request silently but... ;)
I'm wondering if there's something missing on the retry where a bad server stays in the provider list somehow. Logs where we've seen this in "real life" before the retry (and stack trace):
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
at io.harness.pms.contracts.service.OutcomeProtoServiceGrpc$OutcomeProtoServiceBlockingStub.resolveOptional(OutcomeProtoServiceGrpc.java)
....
Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: internal-service-headless/10.36.29.215:12011
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
at io.grpc.netty.shaded.io.netty.channel.unix.Errors.newConnectException0(Errors.java:166)
at io.grpc.netty.shaded.io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:131)
at io.grpc.netty.shaded.io.netty.channel.unix.Socket.finishConnect(Socket.java:359)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
This call is repeated 3 times using the net.jodah.failsafe retry handler, and all three requests land on the same "bad node", i.e. the one being shut down. I do know the host disappeared from the access logs shortly after we saw these, but we thought that the DNS provider, with gRPC's default cache TTL of 30 seconds, plus the default transparent retries in these kinds of situations, would handle these failures and reroute. It doesn't seem to have worked that way. I'm wondering if the health-checking load balancer would handle this differently/better, or if there's a config missing or something else going on. FYI, this looks like version 1.59 of the grpc-java libraries; I haven't seen much in the changelog or other notes.
Do you think we should use minReadySeconds, or rely on correct, graceful shutdown of RPCs?
Definitely have graceful shutdown. Anything else would be in addition to that. If you see "Connection refused" errors with round-robin, that means the client didn't receive the new pod IPs. We expect gRPC to see some connection refused errors, but we don't expect those to cause RPCs to fail; gRPC will use other addresses instead. If your RPCs are failing with connection refused, that means none of the addresses are working, which likely means all the addresses are old. minReadySeconds of 30 seconds could indeed help in that situation. How much you need minReadySeconds is a function of how many server pods you have, with fewer pods benefiting from it more.
Also, should we look at anything else to add to handle other edge cases?
Use MAX_CONNECTION_AGE if you aren't already. That will help with scale-up events.
So, did some more testing. I'll try to share the test code where we start up multiple gRPC servers and run some tests. Turns out there are a few quirks:
- LoadBalancerConfig vs. loadBalancerConfig on the map passed to the service connection is key :) that got missed. I finally copied the JSON directly and used the Gson converter to set a valid service config. That got the tests to pass.
- I may have identified the "culprit". If we pass what is a junk map because of that case difference, we see the failures. Specifically:
return NettyChannelBuilder.forTarget(getTarget())
    .overrideAuthority(computeAuthority(getAuthority()))
    .usePlaintext()
    .defaultServiceConfig(serviceConfig)
    .defaultLoadBalancingPolicy(loadBalancingConfig)
    .maxInboundMessageSize(GRPC_MAXIMUM_MESSAGE_SIZE);
Setting the load balancing policy to round_robin seems to turn on load balancing, but in combination with a blank service config we'd see the "UNAVAILABLE" set of exceptions on a server shutdown.
"methodConfig" : [{
"name" : [{}],
"waitForReady" : true,
"retryPolicy" : {
"maxAttempts" : 5,
"initialBackoff" : "0.1s",
"maxBackoff" : "1s",
"backoffMultiplier" : 2,
"retryableStatusCodes" : ["UNAVAILABLE"]
}
}]
FYI, tests also showed that something like this service config seems to be required to handle the server-unavailable state without failures.
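For reference, a sketch of how such a config can be supplied programmatically; as I understand the defaultServiceConfig() contract, the map mirrors parsed JSON (numbers as doubles), and channelBuilder here is just a placeholder:
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Equivalent of the JSON above, in the Map form that defaultServiceConfig() accepts.
Map<String, Object> retryPolicy = new HashMap<>();
retryPolicy.put("maxAttempts", 5.0);          // numeric values go in as doubles
retryPolicy.put("initialBackoff", "0.1s");
retryPolicy.put("maxBackoff", "1s");
retryPolicy.put("backoffMultiplier", 2.0);
retryPolicy.put("retryableStatusCodes", List.of("UNAVAILABLE"));

Map<String, Object> methodConfig = new HashMap<>();
methodConfig.put("name", List.of(Map.of()));  // an empty name entry matches every method
methodConfig.put("waitForReady", true);
methodConfig.put("retryPolicy", retryPolicy);

channelBuilder.defaultServiceConfig(Map.of("methodConfig", List.of(methodConfig)));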
OK, the retries help if the server explicitly returns that code but is still connected. I was trying to use the following to test a "bad server" state (an OOM kind of thing or something else):
private static class GrpcClientTestServer extends GrpcClientTestGrpc.GrpcClientTestImplBase {
  private final int serverNumber;
  boolean shouldFail = false;

  public GrpcClientTestServer(int serverNumber) {
    this.serverNumber = serverNumber;
  }

  @Override
  public void sayHello(GrpcClientRequest request, StreamObserver<GrpcClientResponse> responseObserver) {
    if (shouldFail) {
      responseObserver.onError(Status.UNAVAILABLE.withDescription("We are a failing server").asRuntimeException());
      return; // don't also call onNext()/onCompleted() after failing the call
    }
    responseObserver.onNext(GrpcClientResponse.newBuilder()
        .setMessage("hello '" + request.getName() + "' from the server " + serverNumber + "!")
        .build());
    responseObserver.onCompleted();
  }
}
I'm guessing this isn't the "right" way to signal a "bad" situation on the server side.
I pushed up the full code of some integration-type tests that exercise load-balancing behavior, failure states, etc., to see how things were handled: https://github.com/jasonmcintosh/bug-reports/blob/grpcJavaTests/grpc-java/load-balancing-tests/ I'm pretty sure I missed some basics on the failure handling here.
Looking around, I've not found a good way to hook in a listener/metric on resolver resolution to see if DNS is the situation. Internally I may replicate the DnsResolutionProvider to inject custom listeners on resolution, unless there's a cleaner way? Adding the concept of a "name resolver resolution listener" would help so we could inject log output or add metrics on a name-resolution change. I'm pretty sure there's something on the config or DNS side, but additional logs would help confirm this. Existing logs don't seem to log the addresses (which would help confirm "we got multiple addresses") nor when a refresh happens. I will look at a PR to add some additional "FINEST" level logging on address resolution if that would help?
Hi @ejona86, I am using MAX_CONNECTION_AGE = 30, as you shared at the start.
I confirmed that we are getting the UNAVAILABLE errors only during pod stop.
Also, we have a shutdown hook on the gRPC servers, but it is implemented like below:
Runtime.getRuntime().addShutdownHook(new Thread(() -> serviceManager.stopAsync().awaitStopped()));
So it might be happening asynchronously and thus not being honored? Do you recommend explicitly waiting for the shutdown, as you shared here - https://github.com/grpc/grpc-java/issues/11151#issuecomment-2204241930 ?
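For what it's worth, a sketch of a shutdown hook that waits for graceful gRPC termination in the order suggested earlier (here server is the io.grpc.Server instance and the timeouts are illustrative):
import java.util.concurrent.TimeUnit;
import io.grpc.Server;

Runtime.getRuntime().addShutdownHook(new Thread(() -> {
  try {
    server.shutdown();                                // stop accepting new RPCs and connections
    if (!server.awaitTermination(30, TimeUnit.SECONDS)) {
      server.shutdownNow();                           // cancel RPCs still running after the wait
      server.awaitTermination(5, TimeUnit.SECONDS);
    }
  } catch (InterruptedException e) {
    server.shutdownNow();
    Thread.currentThread().interrupt();
  }
}));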
One more observation I would like to add: this is happening during scale-down of pods as well, and the exception is as below.
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
at io.harness.gitsync.HarnessToGitPushInfoServiceGrpc$HarnessToGitPushInfoServiceBlockingStub.getFile(HarnessToGitPushInfoServiceGrpc.java:986)
at io.harness.gitsync.common.helper.GitSyncGrpcClientUtils.lambda$retryAndProcessExceptionV2$1(GitSyncGrpcClientUtils.java:44)
at net.jodah.failsafe.Functions.lambda$resultSupplierOf$11(Functions.java:284)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:63)
at net.jodah.failsafe.Execution.executeSync(Execution.java:129)
at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:379)
at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:70)
at io.harness.gitsync.common.helper.GitSyncGrpcClientUtils.retryAndProcessExceptionV2(GitSyncGrpcClientUtils.java:44)
at io.harness.gitsync.scm.SCMGitSyncHelper.getFileByBranch(SCMGitSyncHelper.java:150)
at io.harness.gitaware.helper.GitAwareEntityHelper.fetchEntityFromRemote(GitAwareEntityHelper.java:85)
at io.harness.gitaware.helper.GitAwareEntityHelper$$FastClassBySpringCGLIB$$d8da3501.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386)
at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:704)
at io.harness.gitaware.helper.GitAwareEntityHelper$$EnhancerBySpringCGLIB$$2e22fd54.fetchEntityFromRemote(<generated>)
at io.harness.repositories.pipeline.PMSPipelineRepositoryCustomImpl.fetchRemoteEntity(PMSPipelineRepositoryCustomImpl.java:388)
at io.harness.repositories.pipeline.PMSPipelineRepositoryCustomImpl.find(PMSPipelineRepositoryCustomImpl.java:321)
at jdk.internal.reflect.GeneratedMethodAccessor564.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.springframework.data.repository.core.support.RepositoryMethodInvoker$RepositoryFragmentMethodInvoker.lambda$new$0(RepositoryMethodInvoker.java:289)
at org.springframework.data.repository.core.support.RepositoryMethodInvoker.doInvoke(RepositoryMethodInvoker.java:137)
at org.springframework.data.repository.core.support.RepositoryMethodInvoker.invoke(RepositoryMethodInvoker.java:121)
at org.springframework.data.repository.core.support.RepositoryComposition$RepositoryFragments.invoke(RepositoryComposition.java:530)
at org.springframework.data.repository.core.support.RepositoryComposition.invoke(RepositoryComposition.java:286)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$ImplementationMethodExecutionInterceptor.invoke(RepositoryFactorySupport.java:640)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.doInvoke(QueryExecutorMethodInterceptor.java:164)
at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.invoke(QueryExecutorMethodInterceptor.java:139)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at io.opentelemetry.javaagent.instrumentation.spring.data.v1_8.SpringDataInstrumentationModule$RepositoryInterceptor.invoke(SpringDataInstrumentationModule.java:111)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:220)
at jdk.proxy3/jdk.proxy3.$Proxy275.find(Unknown Source)
at io.harness.pms.pipeline.service.PMSPipelineServiceImpl.getPipeline(PMSPipelineServiceImpl.java:536)
at io.harness.pms.pipeline.service.PMSPipelineServiceImpl.getAndValidatePipeline(PMSPipelineServiceImpl.java:446)
at io.harness.pms.pipeline.service.PMSPipelineServiceImpl.getAndValidatePipeline(PMSPipelineServiceImpl.java:413)
at io.harness.pms.pipeline.resource.PipelineResourceImpl.getPipelineByIdentifier(PipelineResourceImpl.java:303)
at io.harness.pms.pipeline.resource.PipelineResourceImpl$$EnhancerByGuice$$462297078.GUICE$TRAMPOLINE(<generated>)
at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:74)
at io.harness.accesscontrol.NGAccessControlCheckHandler.invoke(NGAccessControlCheckHandler.java:67)
at com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:75)
at com.google.inject.internal.InterceptorStackCallback.invoke(InterceptorStackCallback.java:55)
at io.harness.pms.pipeline.resource.PipelineResourceImpl$$EnhancerByGuice$$462297078.getPipelineByIdentifier(<generated>)
at jdk.internal.reflect.GeneratedMethodAccessor1063.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680)
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:366)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:319)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)
at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:47)
at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:41)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
at io.harness.filter.HttpServiceLoopDetectionFilter.doFilter(HttpServiceLoopDetectionFilter.java:52)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:319)
at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:273)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:313)
at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:772)
at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:181)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: internal-headless/x.x.x.x:<port>
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
at io.grpc.netty.shaded.io.netty.channel.unix.Errors.newConnectException0(Errors.java:166)
at io.grpc.netty.shaded.io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:131)
at io.grpc.netty.shaded.io.netty.channel.unix.Socket.finishConnect(Socket.java:359)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
At the same time, the IP appearing in the error belonged to a pod that was scaling down via HPA.
@ejona86 any recommendations for the above?
finishConnect(..) failed: Connection refused
Failure to connect to pods that are shutting down is expected; server.shutdown() will start refusing new connections. But that causing RPCs to fail is not what we want. We want the client to use a different address.
Are you using minReadySeconds? As I mentioned before:
If your RPCs are failing with connection refused, that means none of the addresses are working, which likely means all the addresses are old. minReadySeconds of 30 seconds could indeed help in that situation.
@ejona86 minReadySeconds will help during a rolling update, right? As I confirmed, the above issue is also occurring during HPA scale-down, so how will it help?
As I confirmed, the above issue is also occurring during HPA scale-down, so how will it help?
Scale-downs should be fine because there are no new addresses. The client will try to connect to the old pod and that will fail, but the client will use the still-running pods instead.
To get the "finishConnect(..) failed: Connection refused" error, the client must not be aware of any still-available pod's address. That's a scale-up issue more than a scale-down issue. Although in a rolling restart you're doing both at the same time.
Hi @ejona86, I think I missed your complete context; let me share all the details and my problem.
We are getting failures during both a) rolling updates and b) HPA scale-down of pods.
The errors we are getting are shared above. Our setup has a 30s sleep in our preStop hooks with a terminationGracePeriod of 180s. Thus, old connections should theoretically continue to work.
We also have MAX_CONNECTION_AGE = 30s on servers as recommended.
Apart from that, we tried using minReadySeconds = 30s; I think it's helping with rolling updates, and we are confirming this again.
But we are still facing the issue of old connections getting UNAVAILABLE, or
io.grpc.StatusRuntimeException: UNAVAILABLE: Connection closed after GOAWAY. HTTP/2 error code: NO_ERROR, debug data: max_age
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
at io.harness.delegate.DelegateServiceGrpc$DelegateServiceBlockingStub.submitTaskV2(DelegateServiceGrpc.java:905)
at io.harness.pms.utils.PmsGrpcClientUtils.lambda$retryAndProcessException$0(PmsGrpcClientUtils.java:39)
at net.jodah.failsafe.Functions.lambda$resultSupplierOf$11(Functions.java:284)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:63)
at net.jodah.failsafe.Execution.executeSync(Execution.java:129)
at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:379)
at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:70)
at io.harness.pms.utils.PmsGrpcClientUtils.retryAndProcessException(PmsGrpcClientUtils.java:39)
Or ConnectionRefused.
Is the solution to these to retry after at least 30s so that DNS refreshes and we get new IPs? Shouldn't the library automatically check and pick a new IP from DNS, rather than treating this as a retry?
Also, if yes, should we prefer application-level retry using jodah failsafe, or the retry configs we found in grpc-java?
Hi @ejona86, any help is appreciated here.
UNAVAILABLE: Connection closed after GOAWAY. HTTP/2 error code: NO_ERROR, debug data: max_age
This most likely means RPCs are living longer than the server's shutdown sequence. RPCs need to complete more quickly or your servers need to allow waiting longer for graceful termination.
Or ConnectionRefused
Is the rate of this at least better now? A complicating factor may be that MAX_CONNECTION_AGE has jitter of +/- 10%. The worst-case timing for a fresh DNS request is ~1 minute, because MAX_CONNECTION_AGE and the DNS cache could align in exactly the wrong way.
The problem is really the Java DNS cache. Let's set MAX_CONNECTION_AGE to 35s (so after the jitter is added/removed the value is guaranteed greater than 30s), and minReadySeconds = 40s to give MAX_CONNECTION_AGE some time to do the DNS query (36s would be fine, but 40s doesn't seem like it hurts anything and gives more time).
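In builder terms that would look something like the following sketch (the port and grace period are illustrative):
import java.util.concurrent.TimeUnit;
import io.grpc.netty.NettyServerBuilder;

NettyServerBuilder serverBuilder = NettyServerBuilder.forPort(port) // port is a placeholder
    .maxConnectionAge(35, TimeUnit.SECONDS)       // stays above 30s even after the +/-10% jitter
    .maxConnectionAgeGrace(60, TimeUnit.SECONDS); // give in-flight RPCs time to finish after the GOAWAY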
I'm hoping that helps. Without seeing more what is happening, I'm shooting in the dark a bit.
Hi @ejona86, we also noticed one thing: when we get an UNAVAILABLE status code, we retry from our application.
Our retry policy is:
return new RetryPolicy<>()
    .withBackoff(100, 2)
    .withMaxAttempts(3)
    .onFailedAttempt(event
        -> log.warn(
            String.format("Pms sdk grpc retry attempt: %d", event.getAttemptCount()), event.getLastFailure()))
    .onFailure(event
        -> log.error(String.format("Pms sdk grpc retry failed after attempts: %d", event.getAttemptCount()),
            event.getFailure()))
    .handleIf(throwable -> {
      if (!(throwable instanceof StatusRuntimeException)) {
        return false;
      }
      StatusRuntimeException statusRuntimeException = (StatusRuntimeException) throwable;
      return statusRuntimeException.getStatus().getCode() == Status.Code.UNAVAILABLE
          || statusRuntimeException.getStatus().getCode() == Status.Code.UNKNOWN
          || statusRuntimeException.getStatus().getCode() == Status.Code.DEADLINE_EXCEEDED
          || statusRuntimeException.getStatus().getCode() == Status.Code.RESOURCE_EXHAUSTED;
    });
And on each retry we can see that the IP in the error is always the same; why isn't the gRPC client doing round robin across IPs? We have seen this in multiple cases, confirming that only the same IP is getting used.
According to my understanding, each gRPC call would do round robin. Now, it could happen that other gRPC calls are happening at the same time and those are getting the new IPs, but this coincidence has been observed in every error I have monitored.
Any suggestions, or is my understanding wrong?
How are you configuring to use round_robin? Are you configuring that via defaultServiceConfig or defaultLoadBalancingPolicy?
@ejona86 I have currently set both fields:
defaultServiceConfig = Map.of("LoadBalancingConfig", Map.of("round_robin", new HashMap<>()));
and defaultLoadBalancingPolicy = "round_robin"
I read some threads today suggesting that maybe this is wrong.
Still figuring it out. Can you help me with this? I first implemented it as suggested here - https://github.com/grpc/grpc-java/issues/11151#issuecomment-2101903880 - but it wasn't working, and it seems like even after setting both of them, it's not working.
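For comparison, a sketch of the two configuration forms as I understand the service config schema (loadBalancingConfig is lower camel case and its value is a list; channelBuilder is a placeholder). This is not a confirmed fix for the behavior described above:
import java.util.List;
import java.util.Map;

// Form 1: policy name only, no service config involved.
channelBuilder.defaultLoadBalancingPolicy("round_robin");

// Form 2: via the service config; note the key spelling and that the value is a list.
channelBuilder.defaultServiceConfig(
    Map.of("loadBalancingConfig", List.of(Map.of("round_robin", Map.of()))));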