EventStoreDB-Client-Java

Should DNS discovery use all IPs in a multi-address DNS name as cluster seeds?

Open lbodor opened this issue 1 year ago • 11 comments

In my experience, DNS discovery fails when 1 node out of 3 is down: discovery spends all of maxDiscoverAttempts trying to get gossip from the node that is down, instead of also considering the other 2 nodes' IPs registered under the multi-address DNS name.

I was able to implement the behaviour I expect like this:

# ClusterDiscovery.java (the added lines also need imports for java.io.UncheckedIOException, java.net.InetAddress, java.net.UnknownHostException, java.util.Arrays, and java.util.stream.Collectors):

    void discover(ConnectionState state) {
-       List<InetSocketAddress> candidates = new ArrayList<>(this.seeds);
+       List<InetSocketAddress> candidates = new ArrayList<>();
+
+       if (state.getSettings().isDnsDiscover()) {
+           try {
+               InetSocketAddress dnsSeed = this.seeds.get(0);
+
+               // Resolve cluster DNS name
+               candidates = Arrays.stream(InetAddress.getAllByName(dnsSeed.getHostName()))
+                   .map(addr -> new InetSocketAddress(addr, dnsSeed.getPort()))
+                   .collect(Collectors.toList());
+                  
+           } catch (UnknownHostException e) {
+               throw new UncheckedIOException(e);
+           }
+       } else {
+           candidates = new ArrayList<>(this.seeds);
+       }

        if (candidates.size() > 1) {

I'm not sure, however, whether you'd prefer to somehow delegate this behaviour to the gRPC client, since it's the gRPC client that currently does the lookup.
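For context, delegating the multi-address handling to gRPC would look roughly like the sketch below (an illustration only, not the client's actual channel setup): gRPC-Java's dns name resolver returns all A records for the target, and the round_robin load-balancing policy spreads connection attempts across them. The host name and port are placeholders.

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class GrpcDnsChannel {
    public static ManagedChannel create(String host, int port) {
        // The "dns:///" target makes gRPC's own resolver fetch every A record for the host,
        // and the round_robin policy spreads connection attempts across all of them.
        return ManagedChannelBuilder
                .forTarget("dns:///" + host + ":" + port)
                .defaultLoadBalancingPolicy("round_robin")
                .useTransportSecurity()
                .build();
    }
}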

lbodor avatar May 28 '24 02:05 lbodor

Hey @lbodor,

If what you describe is true then that's a bug on our part. You shouldn't have to do all this. Let me get back to you after I conduct some investigation.

Thanks for taking the time to reach out.

YoEight avatar May 28 '24 02:05 YoEight

Hey @lbodor

I did my investigation on the matter and I'd like you to confirm a few things first. Did you set your connection string with esdb+discover://, or, if you use the builder configuration, did you set dnsDiscover(true) and submit more than one endpoint/seed as well?
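For reference, the two configuration paths being asked about look roughly like this (a sketch with placeholder host names; the connection-string form assumes the commonly documented EventStoreDBConnectionString.parseOrThrow helper and its tls query parameter):

import com.eventstore.dbclient.EventStoreDBClient;
import com.eventstore.dbclient.EventStoreDBClientSettings;
import com.eventstore.dbclient.EventStoreDBConnectionString;

public class DiscoveryConfigExamples {
    // Connection-string form: the esdb+discover:// scheme turns gossip-based discovery on.
    static EventStoreDBClient fromConnectionString() {
        return EventStoreDBClient.create(
                EventStoreDBConnectionString.parseOrThrow("esdb+discover://nodes.example.com:443?tls=true"));
    }

    // Builder form: dnsDiscover(true) plus one or more seed endpoints.
    static EventStoreDBClient fromBuilder() {
        return EventStoreDBClient.create(
                EventStoreDBClientSettings.builder()
                        .dnsDiscover(true)
                        .addHost("nodes.example.com", 443)
                        .tls(true)
                        .buildConnectionSettings());
    }
}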

We used to support DNS A-record queries a long time ago, when there were only TCP clients. We stopped doing it because not everybody can configure DNS properly. A suggestion in your case would be to register all your nodes in your DNS like you did, but to have your DNS pick a node randomly or round-robin when the main domain is queried.

YoEight avatar Jun 26 '24 20:06 YoEight

Thanks for getting back to me. Here is how I connect.

EventStoreDBClient.create(
    EventStoreDBClientSettings.builder()
        .dnsDiscover(true)
        .addHost(hostname, 443)
        .tls(true)
        .buildConnectionSettings()
);

Since DNS discovery is enabled and the hostname resolves to 3 IP addresses, I'd expect all 3 to be used as gossip seeds. This is documented for cluster-side node discovery (https://developers.eventstore.com/server/v24.2/cluster.html#cluster-with-dns), and it seems reasonable for it to also work in the client.

Thanks for suggesting round-robin, but I think it would be generally unreliable, since it requires a short TTL, which recursive DNS servers can ignore if they consider it too short. I have tried it in Route53, and for the same DNS configuration I get very different results between running the client at work vs at home, probably due to different caching behaviour of the recursive DNS servers sitting between the client and the authoritative DNS. It would work if users configured DNS resolution on the client to go straight to the authoritative DNS server.
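For what it's worth, a quick probe like the one below (plain JDK calls; the host name is a placeholder), run from different networks, shows how the first record returned can differ or stay pinned depending on resolver caching, while the full A-record set stays the same:

import java.net.InetAddress;
import java.util.Arrays;

public class DnsProbe {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "nodes.example.com"; // placeholder DNS name
        // The single-address lookup returns whichever record the resolver listed first...
        System.out.println("getByName:    " + InetAddress.getByName(host));
        // ...while getAllByName returns every A record registered for the name.
        System.out.println("getAllByName: " + Arrays.toString(InetAddress.getAllByName(host)));
    }
}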

What is the downside of applying something like the above patch to ClusterDiscovery.java? If multi-value records are meant to work during cluster-side node discovery, then that is for the benefit of users who have already managed to configure DNS correctly. It seems to demand less of users' DNS skills than getting round-robin, TTL, discovery timeouts, and the path to the authoritative DNS all to line up.

lbodor avatar Jun 27 '24 09:06 lbodor

Hey @lbodor

Since DNS discovery is true and hostname resolves to 3 IP addresses, I'd expect all 3 to be used as gossip seeds. This is documented for cluster-side node discovery (https://developers.eventstore.com/server/v24.2/cluster.html#cluster-with-dns), and it seems fair for it to also work in the client.

And that should be the case. Could you provide some logs showing the client not going for other members of the cluster when it fails to connect to the first seed? By logs, I mean those emitted by the Java client.

YoEight avatar Jul 06 '24 18:07 YoEight

$ host nodes.dev-xxx
nodes.dev-xxx has address 13.55.106.44
nodes.dev-xxx has address 3.104.208.149
nodes.dev-xxx has address 52.64.13.213

Node 13.55.106.44 is down, the others are up. The issue is present when DNS resolution returns 13.55.106.44 as the first item in the list.

2024-07-08 10:39:35,963 [DEBUG] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] Start connection attempt (1/3)
2024-07-08 10:39:35,963 [DEBUG] [] [] [] [ForkJoinPool.commonPool-worker-1] [com.eventstore.dbclient.ClusterDiscovery] Using seed node [nodes.dev-xxx/13.55.106.44:443] for cluster node discovery.
2024-07-08 10:39:38,054 [ERROR] [] [] [] [ForkJoinPool.commonPool-worker-1] [com.eventstore.dbclient.ClusterDiscovery] java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 1.941476417s. Name resolution delay 0.000000000 seconds. [closed=[], open=[[buffered_nanos=1944060611, waiting_for_connection]]]
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
	at com.eventstore.dbclient.ClusterDiscovery.discover(ClusterDiscovery.java:60)
	at com.eventstore.dbclient.ClusterDiscovery.lambda$run$2(ClusterDiscovery.java:42)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 1.941476417s. Name resolution delay 0.000000000 seconds. [closed=[], open=[[buffered_nanos=1944060611, waiting_for_connection]]]
	at io.grpc.Status.asRuntimeException(Status.java:533)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Exception during the node selection process
2024-07-08 10:39:38,059 [ERROR] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] java.util.concurrent.ExecutionException: com.eventstore.dbclient.NoClusterNodeFoundException
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
	at com.eventstore.dbclient.ConnectionService.createChannel(ConnectionService.java:130)
	at com.eventstore.dbclient.ConnectionService.process(ConnectionService.java:170)
	at com.eventstore.dbclient.RunWorkItem.accept(RunWorkItem.java:30)
	at com.eventstore.dbclient.ConnectionService.run(ConnectionService.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Caused by: com.eventstore.dbclient.NoClusterNodeFoundException: null
	at com.eventstore.dbclient.ClusterDiscovery.discover(ClusterDiscovery.java:79)
	at com.eventstore.dbclient.ClusterDiscovery.lambda$run$2(ClusterDiscovery.java:42)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Error when running discovery process
2024-07-08 10:39:38,060 [DEBUG] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] Start connection attempt (2/3)
2024-07-08 10:39:38,061 [DEBUG] [] [] [] [ForkJoinPool.commonPool-worker-1] [com.eventstore.dbclient.ClusterDiscovery] Using seed node [nodes.dev-xxx/13.55.106.44:443] for cluster node discovery.
2024-07-08 10:39:40,071 [ERROR] [] [] [] [ForkJoinPool.commonPool-worker-1] [com.eventstore.dbclient.ClusterDiscovery] java.util.concurrent.TimeoutException: null
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
	at com.eventstore.dbclient.ClusterDiscovery.discover(ClusterDiscovery.java:60)
	at com.eventstore.dbclient.ClusterDiscovery.lambda$run$2(ClusterDiscovery.java:42)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Exception during the node selection process
2024-07-08 10:39:40,072 [ERROR] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] java.util.concurrent.ExecutionException: com.eventstore.dbclient.NoClusterNodeFoundException
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
	at com.eventstore.dbclient.ConnectionService.createChannel(ConnectionService.java:130)
	at com.eventstore.dbclient.ConnectionService.process(ConnectionService.java:170)
	at com.eventstore.dbclient.RunWorkItem.accept(RunWorkItem.java:30)
	at com.eventstore.dbclient.ConnectionService.run(ConnectionService.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Caused by: com.eventstore.dbclient.NoClusterNodeFoundException: null
	at com.eventstore.dbclient.ClusterDiscovery.discover(ClusterDiscovery.java:79)
	at com.eventstore.dbclient.ClusterDiscovery.lambda$run$2(ClusterDiscovery.java:42)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Error when running discovery process
2024-07-08 10:39:40,074 [DEBUG] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] Start connection attempt (3/3)
2024-07-08 10:39:40,074 [DEBUG] [] [] [] [ForkJoinPool.commonPool-worker-1] [com.eventstore.dbclient.ClusterDiscovery] Using seed node [nodes.dev-xxx/13.55.106.44:443] for cluster node discovery.
2024-07-08 10:39:42,078 [ERROR] [] [] [] [ForkJoinPool.commonPool-worker-1] [com.eventstore.dbclient.ClusterDiscovery] java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 1.997705374s. Name resolution delay 0.000000000 seconds. [closed=[], open=[[buffered_nanos=1999385336, waiting_for_connection]]]
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
	at com.eventstore.dbclient.ClusterDiscovery.discover(ClusterDiscovery.java:60)
	at com.eventstore.dbclient.ClusterDiscovery.lambda$run$2(ClusterDiscovery.java:42)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 1.997705374s. Name resolution delay 0.000000000 seconds. [closed=[], open=[[buffered_nanos=1999385336, waiting_for_connection]]]
	at io.grpc.Status.asRuntimeException(Status.java:533)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Exception during the node selection process
2024-07-08 10:39:42,080 [ERROR] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] java.util.concurrent.ExecutionException: com.eventstore.dbclient.NoClusterNodeFoundException
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
	at com.eventstore.dbclient.ConnectionService.createChannel(ConnectionService.java:130)
	at com.eventstore.dbclient.ConnectionService.process(ConnectionService.java:170)
	at com.eventstore.dbclient.RunWorkItem.accept(RunWorkItem.java:30)
	at com.eventstore.dbclient.ConnectionService.run(ConnectionService.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Caused by: com.eventstore.dbclient.NoClusterNodeFoundException: null
	at com.eventstore.dbclient.ClusterDiscovery.discover(ClusterDiscovery.java:79)
	at com.eventstore.dbclient.ClusterDiscovery.lambda$run$2(ClusterDiscovery.java:42)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
Error when running discovery process
2024-07-08 10:39:42,081 [ERROR] [] [] [] [esdb-client-4ff4b4e0-38a4-4030-a0ef-be30fde11ae6] [com.eventstore.dbclient.ConnectionService] Maximum discovery attempt count reached: 3

lbodor avatar Jul 08 '24 00:07 lbodor

Hi @YoEight, does the above log show what you wanted to see? Are you able to reproduce the problem?

lbodor avatar Sep 18 '24 04:09 lbodor

Hey @lbodor, sorry for not getting back to you sooner. I totally forgot about this issue after I got back from vacation. Reading my notes, I wasn't able to reproduce it and concluded you might have a different DNS configuration from what our connection code expects.

Is your java application running with a security manager or a custom security policy that restricts DNS lookups?

YoEight avatar Sep 19 '24 01:09 YoEight

I think our DNS setup is pretty standard, provided by AWS Route53. Here it is as shown by aws route53 list-resource-record-sets:

...
{
    "Name": "nodes.dev-xxx.",
    "Type": "A",
    "SetIdentifier": "0",
    "MultiValueAnswer": true,
    "TTL": 300,
    "ResourceRecords": [
        {
            "Value": "3.105.10.xxx"
        }
    ]
},
{
    "Name": "nodes.dev-xxx.",
    "Type": "A",
    "SetIdentifier": "1",
    "MultiValueAnswer": true,
    "TTL": 300,
    "ResourceRecords": [
        {
            "Value": "3.107.137.xxx"
        }
    ]
},
{
    "Name": "nodes.dev-xxx.",
    "Type": "A",
    "SetIdentifier": "2",
    "MultiValueAnswer": true,
    "TTL": 300,
    "ResourceRecords": [
        {
            "Value": "3.24.251.xxx"
        }
    ]
},
...

Doesn't the fact that my proposed change to your ClusterDiscovery.java (shown above) fixes my issue demonstrate that there isn't anything restricting DNS lookups?

Our cluster is publicly accessible, so if you'd like I can share our DNS name with you (maybe via email), and you could send me your Java client's log output.

Alternatively, if possible, you could send me a DNS name of one of your public test clusters, and I could try from my end. I would use iptables locally to reject outbound traffic to one of your nodes.

lbodor avatar Sep 19 '24 02:09 lbodor

If your cluster is publicly accessible, please send me the details. I'll do my best to identify the root cause.

YoEight avatar Sep 19 '24 03:09 YoEight

OK, thanks. I'll send you the details shortly via DM on discord to keep the DNS name private.

lbodor avatar Sep 19 '24 04:09 lbodor

On second thought, I made a temporary multi-address DNS A record, which I can share here.

$ host es-tmp-1.geodesy.ga.gov.au
es-tmp-1.geodesy.ga.gov.au has address 3.24.251.198
es-tmp-1.geodesy.ga.gov.au has address 3.105.10.86
es-tmp-1.geodesy.ga.gov.au has address 3.107.137.137

The IP addresses are arbitrary; there is no cluster behind them, and one isn't needed to demonstrate the issue.

Without the above patch, cluster discovery resolves the DNS, but uses only the first IP returned, not all 3.

2024-09-23 06:41:11,119 [DEBUG] [c.e.dbclient.ConnectionService] Start connection attempt (1/3)
2024-09-23 06:41:11,119 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.24.251.198:443] for cluster node discovery.
2024-09-23 06:41:14,205 [ERROR] ...
2024-09-23 06:41:14,707 [DEBUG] [c.e.dbclient.ConnectionService] Start connection attempt (2/3)
2024-09-23 06:41:14,708 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.24.251.198:443] for cluster node discovery.
2024-09-23 06:41:17,712 [ERROR] ...
2024-09-23 06:41:18,213 [DEBUG] [c.e.dbclient.ConnectionService] Start connection attempt (3/3)
2024-09-23 06:41:18,213 [DEBUG] [c.e.dbclient.ConnectionService] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.24.251.198:443] for cluster node discovery.
2024-09-23 06:41:21,215 [ERROR] ...
2024-09-23 06:41:21,716 [ERROR] [c.e.dbclient.ConnectionService] Maximum discovery attempt count reached: 3

With the above patch, cluster discovery resolves the DNS name, grabs all 3 IPs, and tries each of them within each of the 3 connection attempts.

2024-09-23 07:41:51,561 [DEBUG] [c.e.dbclient.ConnectionService] Start connection attempt (1/3)
2024-09-23 07:41:51,562 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.24.251.198:443] for cluster node discovery.
2024-09-23 07:41:54,628 [ERROR] ...
2024-09-23 07:41:54,629 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.105.10.86:443] for cluster node discovery.
2024-09-23 07:41:54,743 [ERROR] ...
2024-09-23 07:41:54,744 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.107.137.137:443] for cluster node discovery.
2024-09-23 07:41:57,748 [ERROR] ...
2024-09-23 07:41:58,250 [DEBUG] [c.e.dbclient.ConnectionService] Start connection attempt (2/3)
2024-09-23 07:41:58,251 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.105.10.86:443] for cluster node discovery.
2024-09-23 07:41:58,343 [ERROR] ...
2024-09-23 07:41:58,344 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.107.137.137:443] for cluster node discovery.
2024-09-23 07:42:01,349 [ERROR] ...
2024-09-23 07:42:01,350 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.24.251.198:443] for cluster node discovery.
2024-09-23 07:42:04,355 [ERROR] ...
2024-09-23 07:42:04,856 [DEBUG] [c.e.dbclient.ConnectionService] Start connection attempt (3/3)
2024-09-23 07:42:04,857 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.107.137.137:443] for cluster node discovery.
2024-09-23 07:42:07,861 [ERROR] ...
2024-09-23 07:42:07,861 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.105.10.86:443] for cluster node discovery.
2024-09-23 07:42:07,938 [ERROR] ...
2024-09-23 07:42:07,938 [DEBUG] [c.e.dbclient.ClusterDiscovery] Using seed node [es-tmp-1.geodesy.ga.gov.au/3.24.251.198:443] for cluster node discovery.
2024-09-23 07:42:10,940 [ERROR] ...
2024-09-23 07:42:11,442 [ERROR] [c.e.dbclient.ConnectionService] Maximum discovery attempt count reached: 3

Please let me know if you can reproduce the issue using es-tmp-1.geodesy.ga.gov.au.

lbodor avatar Sep 22 '24 21:09 lbodor

Hey @lbodor ,

I wasn't able to reproduce your error because your server is always unavailable when I try to reach it.

host es-tmp-1.geodesy.ga.gov.au
es-tmp-1.geodesy.ga.gov.au has address 3.105.10.86
es-tmp-1.geodesy.ga.gov.au has address 3.107.137.137
es-tmp-1.geodesy.ga.gov.au has address 3.24.251.198
ping -c 3 es-tmp-1.geodesy.ga.gov.au
PING es-tmp-1.geodesy.ga.gov.au (3.105.10.86) 56(84) bytes of data.

--- es-tmp-1.geodesy.ga.gov.au ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2025ms
curl https://es-tmp-1.geodesy.ga.gov.au:443/gossip
curl: (60) SSL: no alternative certificate subject name matches target host name 'es-tmp-1.geodesy.ga.gov.au'
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

YoEight avatar Sep 27 '24 02:09 YoEight

You don't need a cluster to reproduce the issue. All you need is a DNS record that resolves to 3 IP addresses. The issue is that the Java client doesn't use all 3 IP addresses returned from DNS as cluster seeds, but just one.

All 3 IPs returned for es-tmp-1.geodesy.ga.gov.au are down, but the difference is between cluster discovery trying to get gossip from only one node and giving up, versus trying to get gossip from all 3 nodes and then giving up, as the logs above show.

If only the first node returned for es-tmp-1.geodesy.ga.gov.au were down, and the other 2 were up, how would the client ever know, if it never tries to get gossip from the 2 nodes that are up?

lbodor avatar Sep 27 '24 03:09 lbodor

I believe there is a misunderstanding about how node discovery works. Yes, in your case it will only resolve one IP. As I said earlier:

We used to support DNS A-record queries a long time ago, when there were only TCP clients. We stopped doing it because not everybody can configure DNS properly.

In your case, the DNS should at least use round-robin for IP selection. However, this doesn't change the fact that the Java client won't handle node resolution on its own. This is not the end of the node discovery process. Once an IP is selected, the client will call the /gossip endpoint to retrieve information about all nodes in the cluster (including their respective IPs). Node selection and shuffling will happen based on your preferences at that stage, not during the DNS resolution.

If your DNS uses round-robin for IP selection, it shouldn't be an issue since the Java client would eventually access all the IPs through multiple connection attempts.
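To make the two phases concrete, here is a minimal standalone probe (not the client's code; the seed address is a placeholder) that performs the second phase described above, asking a chosen seed's /gossip endpoint for the full member list:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GossipProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder seed: in the flow described above, this is the single address picked from DNS.
        String seed = args.length > 0 ? args[0] : "localhost:2113";
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://" + seed + "/gossip"))
                .GET()
                .build();
        // The response is a JSON document listing every cluster member, its state, and its
        // advertised address; node selection and shuffling happen on this list, not on DNS records.
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}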

YoEight avatar Sep 27 '24 04:09 YoEight

From https://developers.eventstore.com/server/v24.6/cluster.html#cluster-with-dns:

When you tell EventStoreDB to use DNS for its gossip, the server will resolve
the DNS name to a list of IP addresses and connect to each of those addresses
to find other nodes.

The Python client docs also say that gossip documents will be requested from all DNS A records.

From https://pypi.org/project/esdbclient:

The client will use the cluster domain name with the gRPC library's 'round robin' load balancing
strategy to call the Gossip APIs of addresses discovered from DNS 'A' records.

Here is a test that shows that the Python client does consider all 3 IPs:

GRPC_VERBOSITY=debug GRPC_TRACE=all python3 test.py 2>&1 | grep -e "CLIENT_CONNECT"
I0927 06:22:20.487756977    3330 tcp_client_posix.cc:391]              CLIENT_CONNECT: ipv4:3.24.251.198:2113: asynchronously connecting fd 0x7f93d8013180
I0927 06:22:20.487888677    3330 tcp_client_posix.cc:391]              CLIENT_CONNECT: ipv4:3.107.137.137:2113: asynchronously connecting fd 0x7f93d8020a20
I0927 06:22:20.488005372    3330 tcp_client_posix.cc:391]              CLIENT_CONNECT: ipv4:3.105.10.86:2113: asynchronously connecting fd 0x7f93d802c630

Round-robin DNS is unsuitable for client-server connection fail-over, as I've explained in a previous comment. This is well documented; see, for instance, https://en.wikipedia.org/wiki/Round-robin_DNS.

I take your point that once the client has obtained a gossip document from any one node, it will extract from it all the nodes' IPs as advertised by the cluster. But when round-robin is not configured, or is not rotating fast enough, discovery will fail.

Why not implement DNS resolution like you say was done in the old TCP client? It is only a few lines of code, and the Java gRPC client would be strictly better for it, because discovery would work reliably for more users. For us, using Route53, DNS records sometimes rotate every few seconds and sometimes not for as long as 5 minutes; we have no control over this.

Would you please be kind enough to revisit this issue together with your colleagues?

lbodor avatar Sep 27 '24 07:09 lbodor

@lbodor You’ve chosen the only gRPC client that explicitly performs DNS resolution through lookup queries. While the Python client is officially maintained by the company, it’s not directly managed by our organization, and its author has considerable flexibility in its implementation, provided it remains compatible with the gRPC server interface.

We’re unable to adopt your suggestion because, first, it would introduce a breaking change. Second, in nearly five years, you’re the only one who has encountered this issue.

What I propose instead is to offer an option to select this specific DNS resolution implementation via a feature flag. We could expose this through both the connection string and the connection builder. How does that sound?

YoEight avatar Sep 28 '24 21:09 YoEight

That would be great. Thanks very much.

lbodor avatar Sep 29 '24 21:09 lbodor

@lbodor as soon as that PR gets merged, I'll cut a new release right after.

YoEight avatar Oct 06 '24 04:10 YoEight

@YoEight, awesome, thanks. I've tested the branch in the PR, and it works for me as expected.

lbodor avatar Oct 07 '24 00:10 lbodor

@lbodor The feature is available in the 5.4.2 release.

YoEight avatar Oct 16 '24 21:10 YoEight

@YoEight, thanks, we've just updated to the new version.

lbodor avatar Oct 17 '24 00:10 lbodor

We’re unable to adopt your suggestion because, first, it would introduce a breaking change. Second, in nearly five years, you’re the only one who has encountered this issue.

I think the reason this is not a bigger issue comes down to two things.

  1. It is kind of an edge case: it is only an issue if the gossip node never recovers and the client has disconnected.
  2. It can be solved with different workarounds, such as not using discovery or restarting the client automatically.

TL;DR: we have also experienced this issue but used workarounds. I will test the feature flag.


While looking through the code for the other issue I had, I decided to look at why DNS discovery did not work as we expected, and discovered this change and discussion. I am sorry for necroing this issue; I didn't want to start a new issue just to give my two cents, given that a fix has been put in place, but as you mentioned you had not heard of anyone else who had this problem, I thought it was worth mentioning that we have also had issues with this.

I would expect DNS discovery to work as it does with the feature flag enabled. However, as it does not, we have mainly ignored that functionality and solved it by providing all hosts in the connection string.

We can use the feature flag to solve this, but I also want to point out that I don't think the "Deferred" solution works as expected. When you create the InetSocketAddress in the ConnectionSettingsBuilder, it resolves to the first IP among the addresses of the DNS record:

// java.net.InetSocketAddress#InetSocketAddress(java.lang.String, int)
public InetSocketAddress(String hostname, int port) {
    checkHost(hostname);
    InetAddress addr = null;
    String host = null;
    try {
        addr = InetAddress.getByName(hostname);
    } catch(UnknownHostException e) {
        host = hostname;
    }
    holder = new InetSocketAddressHolder(host, addr, checkPort(port));
}

// java.net.InetAddress#getByName(java.lang.String)
public static InetAddress getByName(String host)
    throws UnknownHostException {
    return InetAddress.getAllByName(host)[0];
}

My understanding is that from this point forward that IP will be the one used to try to resolve the cluster; it does not use the host name to get a fresh IP from the DNS record, so round-robin would not work either. If the selected node is dead, the client will never attempt to connect to any other node.
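As a side note, the JDK does provide a way to keep the host name unresolved, InetSocketAddress.createUnresolved, which is what a truly deferred strategy would rely on. A minimal demonstration of the difference (placeholder host name):

import java.net.InetSocketAddress;

public class ResolutionDemo {
    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "nodes.example.com"; // placeholder host name
        // The two-argument constructor resolves eagerly and pins a single IP (if resolution succeeds).
        InetSocketAddress eager = new InetSocketAddress(host, 443);
        // createUnresolved keeps the host name, so resolution can happen later, per connection attempt.
        InetSocketAddress deferred = InetSocketAddress.createUnresolved(host, 443);
        System.out.println("eager:    " + eager + " (unresolved=" + eager.isUnresolved() + ")");
        System.out.println("deferred: " + deferred + " (unresolved=" + deferred.isUnresolved() + ")");
    }
}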

This is a bit tricky to test, as you need a correctly set up DNS record and a clustered setup, but if you kill the leader node (as well as the "locked" node, if they are not the same) and it never recovers, the client will never reconnect. I have tested this in our staging environment and confirmed that this is the case.

I don't expect any change from this; as I mentioned above, I just wanted to highlight that @lbodor is not alone in experiencing this issue.

I do, however, suggest renaming DeprecatedNodeResolution to something else, as I believe it is a valid connection strategy and the name makes it sound like something to be avoided.

marcuslyth avatar Jan 13 '25 08:01 marcuslyth

Chiming in - we are also running into this issue. It seems the feature-flagged behavior should be enabled by default.

antonmos avatar Apr 30 '25 20:04 antonmos

DeferredNodeResolution is also a very misleading name, because the behavior is actually eager resolution. By the time it is created, the InetSocketAddress has already been instantiated and the name has already been resolved.

Note that InetSocketAddress does not throw an exception if the name cannot be resolved (see the public InetSocketAddress(String hostname, int port) constructor).

edit: just saw that @marcuslyth already said the same thing
edit2: in fact, DeprecatedNodeResolution is actually an implementation of the deferred strategy

antonmos avatar Apr 30 '25 22:04 antonmos

Chiming in — we’re also encountering this issue. It does seem like the feature-flagged behavior might make sense as the default going forward.

That said, enabling it by default would constitute a breaking change, and it’s worth noting that this behavior has been in place for over 5 years without much feedback — possibly because many users rely on fixed gossip seeds rather than DNS discovery. We’d likely need broader input before making changes to such a core behavior.

DeferredNodeResolution is also a somewhat misleading name, as the behavior is actually eager resolution.

I agree the naming could be clearer. In this case, “deferred” refers to the resolution being handled by the gRPC stack rather than the strategy implementation itself, which may not be immediately obvious.

YoEight avatar May 01 '25 00:05 YoEight

Apologies for missing your earlier comment, @marcuslyth — I just saw it now.

My recollection of why we went in that direction is a bit hazy, but I believe the decision was made during the early stages of our cloud offering, when we were heavily relying on DNS resolution for cluster setup. At the time, we chose to rely on DNS to handle record shuffling rather than implementing it ourselves. The assumption was that anyone opting for DNS-based discovery would be doing so in environments with dynamic node sets, where DNS is configured to reflect those changes.

If your DNS consistently returns the same set of node IPs, in the same order, it raises the question of why fixed gossip seeds wouldn’t be used instead — that was part of the original rationale as well.

YoEight avatar May 01 '25 00:05 YoEight

DeferredNodeResolution is also a somewhat misleading name, as the behavior is actually eager resolution.

I agree the naming could be clearer. In this case, “deferred” refers to the resolution being handled by the gRPC stack rather than the strategy implementation itself, which may not be immediately obvious.

I don't know the internals of the gRPC stack, but it doesn't appear to be working as you describe.

When using DNS discovery with the feature flag off, we see that the IP address that was initially resolved (when EventStoreDBClientSettings parsed the connection string into an InetSocketAddress) is used indefinitely. This happens even when a new instance of EventStoreDBClient is created. Looking at EventStoreDBClientBase, it appears that a new gRPC client is created, so one would think it would try to resolve the hostname in the InetSocketAddress again IF gRPC performed DNS resolution internally.

antonmos avatar May 01 '25 13:05 antonmos

If you’re referring to the fact that the connection string hostname is foobar:1234, but the InetSocketAddress we retain later only reflects the resolved IP, like 10.20.30.40:1234, then you’re correct — it doesn’t behave as initially described.

If that's the case then it needs to be addressed.

YoEight avatar May 01 '25 14:05 YoEight

@antonmos This PR addresses that issue: https://github.com/kurrent-io/KurrentDB-Client-Java/pull/328

Thank you and @marcuslyth for bringing that issue to light.

YoEight avatar Jun 05 '25 04:06 YoEight