
Should NSQd broadcast podIP or podName svc address?

Open wreis opened this issue 1 year ago • 1 comments

I noticed that the body of the nsqlookupd response contains broadcast_address set to the Pod IP of each nsqd instance, which becomes stale/invalid as soon as the nsqd pod restarts.

/ # curl "http://nsq-nsqlookupd.nsq.svc.cluster.local:4161/lookup?channel=chan1&format=json&topic=topic1"
{"channels":["chan1"],"producers":[{"remote_address":"10.15.119.136:48390","hostname":"nsq-nsqd-1","broadcast_address":"10.15.119.136","tcp_port":4150,"http_port":4151,"version":"1.3.0"},{"remote_address":"10.15.82.172:36240","hostname":"nsq-nsqd-0","broadcast_address":"10.15.82.172","tcp_port":4150,"http_port":4151,"version":"1.3.0"}]}

Is this the correct behavior in a k8s environment, where pods are ephemeral and might get restarted? App clients appear to connect on broadcast_address (confirmed with tcpdump, see below), so I am wondering whether that address should instead be based on the hostname, using the FQDN of a headless service. I also noticed that this helm chart sets serviceName to a value like nsq-nsqd-headless, but there is no definition for a k8s headless service.

17:35:26.661531 IP app-85ffdbcdd-kh9k7.34322 > 10-15-119-136.nsq-nsqd.nsq.svc.cluster.local.4151: Flags [P.], seq 3568384194:3568385754, ack 541490902, win 443, options [nop,nop,TS val 1232680058 ecr 4196735181], length 1560
.IN
.w....7..8. F~......2.....
Iy4z.%..GET /stats?channel=chan1&format=json&topic=topic1 HTTP/1.1
host: 10.15.119.136:4151
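To make the observation above concrete, here is a minimal sketch of how a client could turn the /lookup response into the addresses it dials. It assumes the client joins broadcast_address and tcp_port for each producer; the function name nsqd_tcp_endpoints is made up for illustration and is not part of any NSQ client library.

```python
import json

# Response body copied from the /lookup call earlier in this issue.
SAMPLE = '''{"channels":["chan1"],"producers":[
 {"remote_address":"10.15.119.136:48390","hostname":"nsq-nsqd-1",
  "broadcast_address":"10.15.119.136","tcp_port":4150,"http_port":4151,"version":"1.3.0"},
 {"remote_address":"10.15.82.172:36240","hostname":"nsq-nsqd-0",
  "broadcast_address":"10.15.82.172","tcp_port":4150,"http_port":4151,"version":"1.3.0"}]}'''


def nsqd_tcp_endpoints(lookup_body):
    """Return one (host, port) pair per producer listed in a /lookup response.

    Hypothetical helper: it only mirrors the assumption that consumers
    dial broadcast_address:tcp_port.
    """
    data = json.loads(lookup_body)
    return [(p["broadcast_address"], p["tcp_port"]) for p in data.get("producers", [])]


endpoints = nsqd_tcp_endpoints(SAMPLE)
for host, port in endpoints:
    print(f"{host}:{port}")
```

Whatever string nsqd registers as its broadcast address is what ends up in the host part here, which is why the choice between Pod IP and DNS name matters to consumers.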

wreis avatar Jun 13 '24 12:06 wreis

@ploxiln @mreiferson

wreis avatar Jun 28 '24 13:06 wreis

@ploxiln @mreiferson Is this project still active?

wreis avatar Aug 26 '24 10:08 wreis

nsqd is not supposed to be particularly ephemeral, so it is a StatefulSet in this chart. The nsqd instances should use their PodIP as broadcast-address, because consumers should connect to multiple instances of nsqd directly, they should not go through a service to connect to a random one.

In a traditional pre-kubernetes deployment, an nsqd would run on each server, and messages of a particular topic would be published on some consistent subset of them. Then, consumers would have to connect to each member of that subset directly, to receive all messages of that topic, from everywhere they are generated.

I suppose, in a kubernetes deployment, every message is published to a random nsqd in the StatefulSet, and every consumer needs to connect to all nsqd instances in the StatefulSet directly, in order to receive all the messages for a topic.

ploxiln avatar Aug 27 '24 01:08 ploxiln

> nsqd is not supposed to be particularly ephemeral, so it is a StatefulSet in this chart.

The ephemeral status originates from the Kubernetes Pod lifecycle, inherently tied to the concept of container images. Pods are frequently deleted or evicted for various reasons. When a Pod is recreated, it is assigned a new IP address, which means that applications must be designed to handle these changes dynamically. This is why applications in a Kubernetes environment should be resilient, able to manage interruptions seamlessly, and adapt to these fluctuations in order to maintain service continuity and reliability. Properly implementing service discovery mechanisms and DNS resolution within the cluster is also essential to ensure that applications can find and communicate with each other despite the changing IP addresses of Pods.

> The nsqd instances should use their PodIP as broadcast-address, because consumers should connect to multiple instances of nsqd directly, they should not go through a service to connect to a random one.

My proposal with this discussion is to consider the deployment environment (kubernetes) and build resilience to changes by utilizing a name lookup mechanism for locating nsqd IPs. The intention is not for consumers to connect randomly to any nsqd instance through a Service, but rather to ensure that they can dynamically discover and connect to the appropriate nsqd instances as their IPs change. This approach would allow for more robust and reliable consumer connections, even in the face of pod restarts or other changes within the cluster.

From the example I posted above, instead of opening connections directly to Pod IP 10.15.119.136, consumers would use a DNS lookup (through the headless Service not yet created in this helm chart) of the FQDN nsq-nsqd-1.nsq-nsqd-headless.nsq.svc.cluster.local, making applications resilient to the ephemeral nature of nsqd Pods. Hence my question: should broadcast_address return that headless-service address rather than the Pod IP?
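As a rough illustration of the proposal, the per-Pod DNS name follows the StatefulSet pattern pod.service.namespace.svc.cluster-domain. The sketch below only builds that string; the helper name pod_fqdn is hypothetical, the default cluster domain cluster.local is an assumption, and the name resolves only if a headless Service matching the StatefulSet's serviceName actually exists.

```python
def pod_fqdn(pod_name, service_name, namespace, cluster_domain="cluster.local"):
    """Build the stable DNS name of a StatefulSet Pod behind a headless Service.

    Hypothetical helper for illustration; cluster_domain is an assumed default.
    """
    return f"{pod_name}.{service_name}.{namespace}.svc.{cluster_domain}"


# Names taken from the example in this issue.
fqdn = pod_fqdn("nsq-nsqd-1", "nsq-nsqd-headless", "nsq")
print(fqdn)
```

A value like this could be passed to nsqd through its -broadcast-address flag instead of the Pod IP, which is the change being asked about here.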

wreis avatar Aug 27 '24 14:08 wreis

> In a traditional pre-kubernetes deployment, an nsqd would run on each server, and messages of a particular topic would be published on some consistent subset of them. I suppose, in a kubernetes deployment, every message is published to a random nsqd in the StatefulSet.

I think this is a great discussion topic, although orthogonal to my original post, it is very much appreciated!

In my case, for simplicity's sake and to achieve resilience and HA, the application engineers have set up the producers to publish to a random nsqd through the nsqd Service, whose Endpoints are the StatefulSet's Pods. On the other hand, this spreads topics across many different nsqd Pods, requiring every consumer to connect to all nsqd Pods in order to get all messages.

To make sure I understand correctly: are you saying that producers should instead ignore the nsqd Service and consistently connect to a specific nsqd? For instance, topic=A should always be published to nsqd-2?!

wreis avatar Aug 27 '24 14:08 wreis

> My proposal with this discussion is to consider ... utilizing a name lookup mechanism for locating nsqd IPs

That's what nsqlookupd does. Consumers poll nsqlookupd periodically (every 30s by default I think), and find new nsqd IP addresses, and tolerate when existing nsqd connections are closed.
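The polling behavior described above can be sketched roughly as follows. This is not code from any NSQ client; ConnectionTable and its methods are invented names, and the sketch assumes a consumer only dials addresses it is not already connected to and forgets an address when its connection closes.

```python
class ConnectionTable:
    """Toy model of a consumer's set of nsqd connections (illustrative only)."""

    def __init__(self):
        self.connected = set()

    def on_lookup(self, discovered):
        """Handle one poll result; return the addresses that need a new connection."""
        to_dial = set(discovered) - self.connected
        self.connected |= to_dial
        return to_dial

    def on_close(self, addr):
        """Forget an address once its connection has been closed."""
        self.connected.discard(addr)


table = ConnectionTable()
first = table.on_lookup({"10.15.119.136:4150", "10.15.82.172:4150"})

# nsq-nsqd-1 restarts: the old connection closes, and the next poll returns
# a new Pod IP (10.15.119.200 is a made-up address for this example).
table.on_close("10.15.119.136:4150")
second = table.on_lookup({"10.15.119.200:4150", "10.15.82.172:4150"})
print(sorted(second))
```

Under these assumptions, the gap between the old connection closing and the next poll is the window in which the consumer has no connection to the restarted nsqd.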

> are you stating the producers should ignore the nsqd Service and connect with consistency to specific nsqd?

Producers and consumers should have different behavior. It may be quite appropriate for producers to send a message to any member of the nsqd cluster, via the Service. However the consumers need to use nsqlookupd to find all nsqd that have a topic, and connect directly to each nsqd. The nsqd broadcast-address is for consumers.

In the years before k8s, we'd typically deploy nsqd on each server that also ran applications that might generate messages, and apps would publish messages to localhost. This is the most efficient, and should always work if that particular server is working and receiving work. Because different apps will typically run on a different subset of servers, consumers don't generally have to connect to all nsqd, they only have to connect to the nsqd which are on the servers that run the apps that generate the messages on the desired topic.

NSQ is not a kubernetes-first design. It is generally flexible enough to be adapted, but the NSQ maintainers have not put together a "production-grade" "plug-and-play" "solution", this is just a community maintained thing. It is also true that NSQ development is in rather slow maintenance mode now. Some companies still use it extensively, and it works for them.

ploxiln avatar Aug 27 '24 17:08 ploxiln

> That's what nsqlookupd does. Consumers poll nsqlookupd periodically (every 30s by default I think), and find new nsqd IP addresses, and tolerate when existing nsqd connections are closed.

Right, and that eventual consistency is what I am pondering: the window between the time nsqlookupd learns the IP, caches it for the 30s TTL, and the moment the consumer(s) issue a /lookup?topic request.

Actually, after reading it again, it looks like it is the consumers that might cache it for 30s, and not the nsqlookupd?

wreis avatar Aug 27 '24 18:08 wreis

> Producers and consumers should have different behavior. It may be quite appropriate for producers to send a message to any member of the nsqd cluster, via the Service. However the consumers need to use nsqlookupd to find all nsqd that have a topic, and connect directly to each nsqd. The nsqd broadcast-address is for consumers.

That is exactly how I have it set up, for both producers and consumers. I was pondering the idea of having the broadcast-address return a DNS name based on the StatefulSet's headless Service rather than the Pod IP directly, which would allow the consumers to connect through that name instead of the Pod IP and so be resilient to nsqd Pods being recreated with new IPs.

It looks like the issue might actually be on the consumer side, which caches this Pod IP for a longer-ish time instead of relying on nsqlookupd as the source of truth for an up-to-date value?! We are using the community clients for Python and Golang there.

wreis avatar Aug 27 '24 18:08 wreis

> Actually, after reading it again, it looks like it is the consumers that might cache it for 30s, and not the nsqlookupd?

What do you mean by cache? @ploxiln is just referring to the interval at which consumers poll nsqlookupd, they don't cache anything (they do continue to stay connected to nsqd they've already successfully connected to).

mreiferson avatar Sep 01 '24 18:09 mreiferson

> Actually, after reading it again, it looks like it is the consumers that might cache it for 30s, and not the nsqlookupd?

> What do you mean by cache? @ploxiln is just referring to the interval at which consumers poll nsqlookupd, they don't cache anything (they do continue to stay connected to nsqd they've already successfully connected to).

Since the consumers poll nsqlookupd periodically, each response is considered valid for a time-to-live duration (which appears to be 30s by default) before it is refreshed. However, I noticed that in ephemeral environments like kubernetes this info can go stale within that interval, leading consumers to try to reach an nsqd whose Pod IP has changed since they last polled nsqlookupd.

Therefore, I see two alternative solutions:

  • have broadcast_address return an FQDN based on an nsqd headless Service
  • reduce the interval at which consumers poll nsqlookupd to at most 2-3s; what would be the implications here?

wreis avatar Sep 03 '24 13:09 wreis

> reduce the interval which consumers poll nsqlookupd

You can generally set this, see for example the LookupdPollInterval in https://pkg.go.dev/github.com/nsqio/go-nsq?utm_source=godoc#Config

You may also want to reduce the nsqlookupd -inactive-producer-timeout https://nsq.io/components/nsqlookupd.html#command-line-options just a bit.

The downsides are increased overhead, lots of extra requests polling for updated nsqd. For relatively small numbers of nsqd and consumers, this may be fine.
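For a rough sense of that overhead, here is a back-of-the-envelope sketch. It assumes each consumer sends one /lookup request per poll interval for a single topic and ignores any jitter a client may add; the consumer count of 50 and the function name are made up for illustration.

```python
def lookup_requests_per_second(consumers, poll_interval_s):
    """Estimate aggregate /lookup request rate under the stated assumptions."""
    return consumers / poll_interval_s


# 30s is the interval mentioned earlier in this thread, 2s the proposed one.
baseline = lookup_requests_per_second(50, 30)
proposed = lookup_requests_per_second(50, 2)
print(f"30s interval: {baseline:.2f} req/s")
print(f"2s interval:  {proposed:.2f} req/s")
print(f"increase:     {proposed / baseline:.0f}x")
```

With these numbers the request rate scales inversely with the interval, so the cost stays small for a handful of consumers and grows linearly with their count.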

ploxiln avatar Sep 04 '24 15:09 ploxiln