
Do I have to enable proxy to access the broker from outside the Pulsar cluster?

Open trynocoding opened this issue 8 months ago • 22 comments

Describe the bug

I would like to access Pulsar brokers from outside the Kubernetes cluster without using pulsar-proxy.

Version

4.0.3

To Reproduce

Steps to reproduce the behavior:

  1. Deploy a Pulsar cluster (without proxy) using the Pulsar Operator
  2. Modify broker config to include advertisedListeners for external access
[root@master ~]# for i in `seq 0 2`; do kubectl -n pulsar-operator-system exec -it pulsarcluster-sample-broker-$i -- cat /pulsar/conf/broker.conf |egrep 'advertisedListeners|advertisedAddress|internalListenerName'|grep -v "#";done
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
advertisedAddress=pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.100:6650,external:pulsar://192.66.111.120:30650
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
advertisedAddress=pulsarcluster-sample-broker-1.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.1.132:6650,external:pulsar://192.66.111.148:30650
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
advertisedAddress=pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.2.209:6650,external:pulsar://192.66.111.166:30650
internalListenerName=internal
[root@master ~]# 
  3. Run a Pulsar Go client on a host outside the cluster with:
pulsar.NewClient(pulsar.ClientOptions{
  URL: "pulsar://<NodeIP>:30650",
  ListenerName: "external",
})
  4. Observe producer behavior
[root@crazy producer]# go run producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:29170" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:29170" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connecting to broker                          remote_addr="pulsar://192.66.111.148:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:46658" remote_addr="pulsar://192.66.111.148:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:46658" remote_addr="pulsar://192.66.111.148:30650"
INFO[0000] Connected producer                            cnx="192.66.111.72:46658 -> 192.66.111.148:30650" epoch=0 topic="persistent://public/default/test-topic"
INFO[0000] Created producer                              cnx="192.66.111.72:46658 -> 192.66.111.148:30650" producerID=1 producer_name=pulsarcluster-sample-1-3 topic="persistent://public/default/test-topic"
2025/04/15 16:55:55 Published message:  9:12:0
INFO[0000] Closing producer                              producerID=1 producer_name=pulsarcluster-sample-1-3 topic="persistent://public/default/test-topic"
INFO[0000] Closed producer                               producerID=1 producer_name=pulsarcluster-sample-1-3 topic="persistent://public/default/test-topic"
[root@crazy producer]# go run producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:29182" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:29182" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connecting to broker                          remote_addr="pulsar://192.66.111.148:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:46662" remote_addr="pulsar://192.66.111.148:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:46662" remote_addr="pulsar://192.66.111.148:30650"
ERRO[0000] Failed to create producer at send PRODUCER request  error="server error: ServiceNotReady: Namespace bundle for topic (persistent://public/default/test-topic) not served by this instance:pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local:8080. Please redo the lookup. Request is denied: namespace=public/default" topic="persistent://public/default/test-topic"
ERRO[0000] Failed to create producer at newPartitionProducer  error="server error: ServiceNotReady: Namespace bundle for topic (persistent://public/default/test-topic) not served by this instance:pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local:8080. Please redo the lookup. Request is denied: namespace=public/default" topic="persistent://public/default/test-topic"
2025/04/15 16:55:57 server error: ServiceNotReady: Namespace bundle for topic (persistent://public/default/test-topic) not served by this instance:pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local:8080. Please redo the lookup. Request is denied: namespace=public/default
exit status 1

Expected behavior

We expect that by exposing the broker pulsar:// port via NodePort and configuring advertisedListeners, Pulsar clients can work without the proxy.

Desktop (please complete the following information):

[root@master producer]# kubectl get no -owide
NAME      STATUS   ROLES           AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE          KERNEL-VERSION          CONTAINER-RUNTIME
master    Ready    control-plane   18d   v1.27.7   192.66.111.120   <none>        CentOS Stream 9   5.14.0-533.el9.x86_64   containerd://1.7.26
worker1   Ready    <none>          18d   v1.27.7   192.66.111.148   <none>        CentOS Stream 9   5.14.0-410.el9.x86_64   containerd://1.7.15
worker2   Ready    <none>          18d   v1.27.7   192.66.111.166   <none>        CentOS Stream 9   5.14.0-410.el9.x86_64   containerd://1.7.15
[root@master producer]# 

Additional context

Broker can be accessed normally from within the cluster

[root@master ~]# kubectl -n pulsar-operator-system get po
NAME                                                  READY   STATUS      RESTARTS   AGE
pulsar-operator-controller-manager-6855dffd4d-9pcgt   1/1     Running     0          14m
pulsarcluster-sample-bookie-0                         1/1     Running     0          12m
pulsarcluster-sample-bookie-1                         1/1     Running     0          11m
pulsarcluster-sample-bookie-2                         1/1     Running     0          11m
pulsarcluster-sample-broker-0                         1/1     Running     0          12m
pulsarcluster-sample-broker-1                         1/1     Running     0          11m
pulsarcluster-sample-broker-2                         1/1     Running     0          10m
pulsarcluster-sample-init-cluster-metadata-94nhk      0/1     Completed   0          12m
pulsarcluster-sample-toolset-0                        1/1     Running     0          12m
pulsarcluster-sample-zookeeper-0                      1/1     Running     0          13m
pulsarcluster-sample-zookeeper-1                      1/1     Running     0          13m
pulsarcluster-sample-zookeeper-2                      1/1     Running     0          13m

[root@master ~]# kubectl -n pulsar-operator-system get svc
NAME                                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
pulsar-operator-webhook-service        ClusterIP   10.96.2.214   <none>        443/TCP                         20m
pulsarcluster-sample-bookie            ClusterIP   None          <none>        3181/TCP                        18m
pulsarcluster-sample-broker            ClusterIP   None          <none>        8080/TCP,6650/TCP               18m
pulsarcluster-sample-broker-nodeport   NodePort    10.96.3.24    <none>        8080:32105/TCP,6650:30650/TCP   18m
pulsarcluster-sample-zookeeper         ClusterIP   None          <none>        2888/TCP,3888/TCP,2181/TCP      19m

trynocoding avatar Apr 15 '25 09:04 trynocoding

@lhotari Hello, I'm not very familiar with Pulsar. I followed the official documentation at https://pulsar.apache.org/docs/4.0.x/concepts-multiple-advertised-listeners/ to configure the broker. I'd like to ask if it's possible to access the broker from outside the cluster without enabling pulsar-proxy. Thank you for your help in advance.

trynocoding avatar Apr 15 '25 09:04 trynocoding

@trynocoding Unfortunately the pulsar-helm-chart issues aren't suitable for support questions since there's hardly any audience. Your question more or less duplicates #423 which is also a question in the pulsar-helm-chart issue tracker, without a solution. The correct place for questions is https://github.com/apache/pulsar/discussions/categories/q-a .

I'd like to ask if it's possible to access the broker from outside the cluster without enabling pulsar-proxy. Thank you for your help in advance.

It is possible, but there's no proof-of-concept in any documentation. It will most likely require changes to Apache Pulsar Helm chart as well.

One part missing from your configuration is the use of bindAddresses configuration. https://github.com/apache/pulsar/blob/066a20c33fe28ed0bb5ec9b3846ed67560877302/conf/broker.conf#L68-L69

Each advertised listener should have a unique bindAddress & port in bindAddresses so that the solution could work.

let's say something like

bindAddresses=internal:pulsar://0.0.0.0:6650,external:pulsar://0.0.0.0:16650

The node port would have to map to port 16650 in the above example.
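The rule above (every advertised listener needs its own entry and its own port in bindAddresses) can be checked mechanically before restarting a broker. The following is a minimal sketch in Go using only the standard library. It is not part of Pulsar; the names parseListeners and checkBindAddresses are hypothetical helpers introduced here, and the sample values are taken from this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// listener is one "name:scheme://host:port" entry from broker.conf.
// Hypothetical helper type; it is not part of any Pulsar API.
type listener struct {
	Name string
	Port string
}

// parseListeners splits a comma-separated advertisedListeners or
// bindAddresses value into its entries. An empty value yields no entries.
func parseListeners(value string) ([]listener, error) {
	if strings.TrimSpace(value) == "" {
		return nil, nil
	}
	var out []listener
	for _, entry := range strings.Split(value, ",") {
		name, rawURL, ok := strings.Cut(strings.TrimSpace(entry), ":")
		if !ok {
			return nil, fmt.Errorf("malformed entry %q", entry)
		}
		u, err := url.Parse(rawURL)
		if err != nil {
			return nil, fmt.Errorf("listener %q: %w", name, err)
		}
		if u.Port() == "" {
			return nil, fmt.Errorf("listener %q has no port", name)
		}
		out = append(out, listener{Name: name, Port: u.Port()})
	}
	return out, nil
}

// checkBindAddresses reports advertised listeners that have no bind entry
// and bind ports that are shared by two different listener names.
func checkBindAddresses(advertised, bind string) ([]string, error) {
	adv, err := parseListeners(advertised)
	if err != nil {
		return nil, err
	}
	bound, err := parseListeners(bind)
	if err != nil {
		return nil, err
	}
	var problems []string
	portOwner := make(map[string]string) // bind port -> listener name
	hasBind := make(map[string]bool)     // listener name -> has a bind entry
	for _, b := range bound {
		hasBind[b.Name] = true
		if owner, taken := portOwner[b.Port]; taken && owner != b.Name {
			problems = append(problems,
				fmt.Sprintf("port %s is bound by both %q and %q", b.Port, owner, b.Name))
		}
		portOwner[b.Port] = b.Name
	}
	reported := make(map[string]bool)
	for _, a := range adv {
		if !hasBind[a.Name] && !reported[a.Name] {
			problems = append(problems,
				fmt.Sprintf("listener %q has no entry in bindAddresses", a.Name))
			reported[a.Name] = true
		}
	}
	return problems, nil
}

func main() {
	advertised := "internal:pulsar://10.0.0.100:6650,external:pulsar://192.66.111.120:30650"
	bind := "internal:pulsar://0.0.0.0:6650,external:pulsar://0.0.0.0:16650"
	problems, err := checkBindAddresses(advertised, bind)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if len(problems) == 0 {
		fmt.Println("ok: every advertised listener has its own bind port")
	}
	for _, p := range problems {
		fmt.Println("problem:", p)
	}
}
```

The check only looks at listener names and ports. It does not verify that the NodePort service actually forwards to the external bind port, which still has to be confirmed on the Kubernetes side.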

lhotari avatar Apr 15 '25 10:04 lhotari

It looks like https://pulsar.apache.org/docs/4.0.x/concepts-multiple-advertised-listeners/ docs are missing the important piece that is about configuring the bindAddresses.

lhotari avatar Apr 15 '25 10:04 lhotari

@lhotari Sorry for raising this issue in the pulsar-helm-chart, and thank you for your answer. I really appreciate it. I will verify it later

trynocoding avatar Apr 15 '25 11:04 trynocoding

@trynocoding Unfortunately, the pulsar-helm-chart issue tracker is not suitable for support questions as it has virtually no audience. Your question more or less duplicates #423, which is also an unanswered question in the pulsar-helm-chart issue tracker. The correct place to ask questions is https://github.com/apache/pulsar/discussions/categories/q-a.

I'd like to inquire whether it's possible to access the broker from outside the cluster without enabling pulsar-proxy. Thank you in advance for your assistance.

It is possible, but there's no proof-of-concept in any documentation. It will most likely require changes to the Apache Pulsar Helm chart as well.

One part missing from your configuration is the use of the bindAddresses setting. https://github.com/apache/pulsar/blob/066a20c33fe28ed0bb5ec9b3846ed67560877302/conf/broker.conf#L68-L69

Each advertised listener should have a unique bindAddress & port in bindAddresses so that the solution could work.

let's say something like

bindAddresses=internal:pulsar://0.0.0.0:6650,external:pulsar://0.0.0.0:16650

The node port would have to map to port 16650 in the above example.

It doesn't seem to work. Did I configure something incorrectly? One question: why is it necessary to set port 16650 (the port on the host)?

[root@master ~]# for i in `seq 0 2`; do kubectl -n pulsar-operator-system exec -it pulsarcluster-sample-broker-$i -- cat /pulsar/conf/broker.conf |egrep 'advertisedListeners|advertisedAddress|internalListenerName|bindAddresses'|grep -v "#";done
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=internal:pulsar://0.0.0.0:6655,external:pulsar://0.0.0.0:30650
advertisedAddress=pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.34:6655,external:pulsar://192.66.111.120:30650
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=internal:pulsar://0.0.0.0:6655,external:pulsar://0.0.0.0:30650
advertisedAddress=pulsarcluster-sample-broker-1.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.1.208:6655,external:pulsar://192.66.111.148:30650
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=internal:pulsar://0.0.0.0:6655,external:pulsar://0.0.0.0:30650
advertisedAddress=pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.2.186:6655,external:pulsar://192.66.111.166:30650
internalListenerName=internal
[root@master ~]# 
[root@crazy producer]# go run producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:60006" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:60006" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connecting to broker                          remote_addr="pulsar://192.66.111.166:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:5498" remote_addr="pulsar://192.66.111.166:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:5498" remote_addr="pulsar://192.66.111.166:30650"
INFO[0000] Connected producer                            cnx="192.66.111.72:5498 -> 192.66.111.166:30650" epoch=0 topic="persistent://public/default/test-topic"
INFO[0000] Created producer                              cnx="192.66.111.72:5498 -> 192.66.111.166:30650" producerID=1 producer_name=pulsarcluster-sample-61-6 topic="persistent://public/default/test-topic"
2025/04/15 20:27:36 Published message:  33:6:0
INFO[0000] Closing producer                              producerID=1 producer_name=pulsarcluster-sample-61-6 topic="persistent://public/default/test-topic"
INFO[0000] Closed producer                               producerID=1 producer_name=pulsarcluster-sample-61-6 topic="persistent://public/default/test-topic"
[root@crazy producer]# go run producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:60014" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:60014" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connecting to broker                          remote_addr="pulsar://192.66.111.166:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:5512" remote_addr="pulsar://192.66.111.166:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:5512" remote_addr="pulsar://192.66.111.166:30650"
ERRO[0000] Failed to create producer at send PRODUCER request  error="server error: ServiceNotReady: Namespace bundle for topic (persistent://public/default/test-topic) not served by this instance:pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local:8080. Please redo the lookup. Request is denied: namespace=public/default" topic="persistent://public/default/test-topic"
ERRO[0000] Failed to create producer at newPartitionProducer  error="server error: ServiceNotReady: Namespace bundle for topic (persistent://public/default/test-topic) not served by this instance:pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local:8080. Please redo the lookup. Request is denied: namespace=public/default" topic="persistent://public/default/test-topic"
2025/04/15 20:27:38 server error: ServiceNotReady: Namespace bundle for topic (persistent://public/default/test-topic) not served by this instance:pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local:8080. Please redo the lookup. Request is denied: namespace=public/default
exit status 1
[root@crazy producer]# 

trynocoding avatar Apr 15 '25 12:04 trynocoding

It doesn't seem to work. Did I configure something incorrectly? One question: why is it necessary to set port 16650 (the port on the host)?

It can be any available port on the pod. Each advertised listener in advertisedListeners should have a unique bind address in the bindAddresses. Each listening port maps to a single "listener". This makes the Pulsar lookup return the advertised address of the active "listener" for a particular bind address port.
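The mapping described here can be pictured as two table lookups: the port a client connected on selects a listener name, and the listener name selects the advertised address returned by the lookup. The Go sketch below is only a simplified model of that idea, not Pulsar's actual implementation. The type brokerListeners and the method lookupAddress are hypothetical, and the addresses are sample values from this thread.

```go
package main

import "fmt"

// brokerListeners is a simplified, hypothetical model of one broker's
// listener configuration. It is not a Pulsar type.
type brokerListeners struct {
	bindPorts  map[int]string    // listening port -> listener name (bindAddresses)
	advertised map[string]string // listener name -> advertised URL (advertisedListeners)
}

// lookupAddress returns the advertised URL for the listener that owns the
// port the client connected on.
func (b brokerListeners) lookupAddress(connectedPort int) (string, error) {
	name, ok := b.bindPorts[connectedPort]
	if !ok {
		return "", fmt.Errorf("no listener is bound to port %d", connectedPort)
	}
	addr, ok := b.advertised[name]
	if !ok {
		return "", fmt.Errorf("listener %q has no advertised address", name)
	}
	return addr, nil
}

func main() {
	broker := brokerListeners{
		bindPorts: map[int]string{
			6650:  "internal",
			16650: "external",
		},
		advertised: map[string]string{
			"internal": "pulsar://10.0.0.100:6650",
			"external": "pulsar://192.66.111.120:30650",
		},
	}
	for _, port := range []int{6650, 16650} {
		addr, err := broker.lookupAddress(port)
		if err != nil {
			fmt.Println("lookup failed:", err)
			continue
		}
		fmt.Printf("client connected on %d is told to use %s\n", port, addr)
	}
}
```

In this model, a client arriving through the external port never sees the pod IP, which is why the client would not need to name a listener itself.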

The docs at https://pulsar.apache.org/docs/4.0.x/concepts-multiple-advertised-listeners/ are misleading since there shouldn't be a need to set any listener name in the client itself. That's something that shouldn't even be exposed to clients and it seems that that's how it was initially implemented, but revisited later. The docs haven't been updated to reflect the correct usage.

lhotari avatar Apr 15 '25 13:04 lhotari

When configuring the nodeport, you'd have to ensure that each broker pod can be individually and uniquely addressed. That might be the problem in your configuration. The current Helm chart doesn't have ways to do this and you'd have to modify the Helm chart to generate a service for each broker pod.

lhotari avatar Apr 15 '25 14:04 lhotari

It would also be useful to have a separate nodeport service that has all available brokers in it. That's what should be used for the service URL in clients.

In summary:

  • For each broker pod there has to be a separate nodeport service mapping to the port where the "external" listener is bound.
  • A shared nodeport service which contains all available broker pods, mapping to the "external" listener port. This should be used in clients.

lhotari avatar Apr 15 '25 14:04 lhotari

@lhotari Thank you. Following the method above, I can now access the brokers from outside the cluster.

trynocoding avatar Apr 16 '25 03:04 trynocoding

@lhotari Thank you, following above method, I can now access it outside the cluster

@trynocoding would you be interested in contributing the changes to pulsar-helm-chart or sharing how you have achieved this? That could help others wishing to configure their Pulsar cluster in a similar way.

lhotari avatar Apr 16 '25 07:04 lhotari

@lhotari Thank you, following above method, I can now access it outside the cluster

@trynocoding would you be interested in contributing the changes to pulsar-helm-chart or sharing how you have achieved this? That could help others wishing to configure their Pulsar cluster in a similar way.

@lhotari At first, I only exposed the Pulsar protocol externally, which allowed normal access. The configuration is as follows:

[root@master samples]# for i in `seq 0 2`; do kubectl -n pulsar-operator-system exec -it pulsarcluster-sample-broker-$i -- cat /pulsar/conf/broker.conf |egrep 'advertisedListeners|advertisedAddress|internalListenerName|bindAddresses|brokerServiceURL'|grep -v "#";done
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=
advertisedAddress=pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.15:6650,external:pulsar://192.66.111.120:32566
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=
advertisedAddress=pulsarcluster-sample-broker-1.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.98:6650,external:pulsar://192.66.111.120:32386
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=
advertisedAddress=pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.148:6650,external:pulsar://192.66.111.120:31984
internalListenerName=internal
[root@master samples]# 

When both the Pulsar and HTTP protocols are exposed externally, accessing the broker via the HTTP protocol from outside the cluster results in an error. The configuration is as follows:

[root@master samples]# for i in `seq 0 2`; do kubectl -n pulsar-operator-system exec -it pulsarcluster-sample-broker-$i -- cat /pulsar/conf/broker.conf |egrep 'advertisedListeners|advertisedAddress|internalListenerName|bindAddresses|brokerServiceURL'|grep -v "#";done
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=
advertisedAddress=pulsarcluster-sample-broker-0.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.15:6650,external:pulsar://192.66.111.120:32566,internal:http://10.0.0.15:8080,external:http://192.66.111.120:31896
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=
advertisedAddress=pulsarcluster-sample-broker-1.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.98:6650,external:pulsar://192.66.111.120:32386,internal:http://10.0.0.98:8080,external:http://192.66.111.120:31598
internalListenerName=internal
Defaulted container "broker" out of: broker, wait-bookkeeper-ready (init)
bindAddresses=
advertisedAddress=pulsarcluster-sample-broker-2.pulsarcluster-sample-broker.pulsar-operator-system.svc.cluster.local
advertisedListeners=internal:pulsar://10.0.0.148:6650,external:pulsar://192.66.111.120:31984,internal:http://10.0.0.148:8080,external:http://192.66.111.120:31741
internalListenerName=internal
[root@master samples]# 

[root@master samples]# kubectl -n pulsar-operator-system get svc
NAME                                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
pulsar-operator-webhook-service          ClusterIP   10.96.1.9     <none>        443/TCP                         8m
pulsarcluster-sample-bookie              ClusterIP   None          <none>        3181/TCP                        6m22s
pulsarcluster-sample-broker              ClusterIP   None          <none>        8080/TCP,6650/TCP               6m22s
pulsarcluster-sample-broker-0-external   NodePort    10.96.2.144   <none>        8080:31896/TCP,6650:32566/TCP   6m22s
pulsarcluster-sample-broker-1-external   NodePort    10.96.3.114   <none>        8080:31598/TCP,6650:32386/TCP   6m22s
pulsarcluster-sample-broker-2-external   NodePort    10.96.3.19    <none>        8080:31741/TCP,6650:31984/TCP   6m22s
pulsarcluster-sample-broker-nodeport     NodePort    10.96.0.178   <none>        8080:30370/TCP,6650:30650/TCP   6m22s
pulsarcluster-sample-zookeeper           ClusterIP   None          <none>        2888/TCP,3888/TCP,2181/TCP      7m45s

Using the http protocol, the client error message is as follows

        client, err := pulsar.NewClient(pulsar.ClientOptions{
                URL:            "http://100.100.3.198:30370",
                ListenerName:   "external",
        })


[root@crazy producer]# go run http_producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://10.0.0.148:6650"
WARN[0004] Failed to connect to broker.                  error="dial tcp 10.0.0.148:6650: connect: connection timed out" remote_addr="pulsar://10.0.0.148:6650"
ERRO[0004] Failed to get connection                      topic="persistent://public/default/test-topic"
ERRO[0004] Failed to create producer at newPartitionProducer  error="connection error" topic="persistent://public/default/test-topic"
2025/04/16 16:10:47 connection error
exit status 1

I don't know why, but when accessed via HTTP protocol, it returns the pod's IP and the Pulsar protocol's port

Using the pulsar protocol, the client is fine

        client, err := pulsar.NewClient(pulsar.ClientOptions{
                URL:            "pulsar://100.100.3.198:30650",
                ListenerName:   "external",
        })

[root@crazy producer]# go run producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:32768" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:32768" remote_addr="pulsar://100.100.3.198:30650"
INFO[0000] Connecting to broker                          remote_addr="pulsar://192.66.111.120:31984"
INFO[0000] TCP connection established                    local_addr="192.66.111.72:5568" remote_addr="pulsar://192.66.111.120:31984"
INFO[0000] Connection is ready                           local_addr="192.66.111.72:5568" remote_addr="pulsar://192.66.111.120:31984"
INFO[0000] Connected producer                            cnx="192.66.111.72:5568 -> 192.66.111.120:31984" epoch=0 topic="persistent://public/default/test-topic"
INFO[0000] Created producer                              cnx="192.66.111.72:5568 -> 192.66.111.120:31984" producerID=1 producer_name=pulsarcluster-sample-2-307 topic="persistent://public/default/test-topic"
2025/04/16 16:38:01 Published message:  6:307:0
INFO[0000] Closing producer                              producerID=1 producer_name=pulsarcluster-sample-2-307 topic="persistent://public/default/test-topic"
INFO[0000] Closed producer                               producerID=1 producer_name=pulsarcluster-sample-2-307 topic="persistent://public/default/test-topic"
[root@crazy producer]# 

trynocoding avatar Apr 16 '25 08:04 trynocoding

At first, I only exposed the Pulsar protocol externally, which allowed normal access,the configuration is as follows

@trynocoding Just curious, did you modify the Helm chart to configure the cluster in this way? If not, how did you handle the configuration?

lhotari avatar Apr 16 '25 08:04 lhotari

At first, I only exposed the Pulsar protocol externally, which allowed normal access. The configuration is as follows:

@trynocoding Just curious, did you modify the Helm chart to configure the cluster in this way? If not, how did you handle the configuration?

I deployed the cluster using pulsar-operator (currently under development) and did not use the helm chart

trynocoding avatar Apr 16 '25 09:04 trynocoding

I deployed the cluster using pulsar-operator (currently under development) and did not use the helm chart

@trynocoding Ok, I see. Is it going to be opensource or is it specific to your company?

lhotari avatar Apr 16 '25 09:04 lhotari

@lhotari Currently, it is being developed for the company, and whether it will be open-sourced may depend on the company's decision

trynocoding avatar Apr 16 '25 09:04 trynocoding

@lhotari

Using the http protocol, the client error message is as follows

        client, err := pulsar.NewClient(pulsar.ClientOptions{
                URL:            "http://100.100.3.198:30370",
                ListenerName:   "external",
        })

[root@crazy producer]# go run http_producer.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://10.0.0.148:6650"
WARN[0004] Failed to connect to broker.                  error="dial tcp 10.0.0.148:6650: connect: connection timed out" remote_addr="pulsar://10.0.0.148:6650"
ERRO[0004] Failed to get connection                      topic="persistent://public/default/test-topic"
ERRO[0004] Failed to create producer at newPartitionProducer  error="connection error" topic="persistent://public/default/test-topic"
2025/04/16 16:10:47 connection error
exit status 1

Here I made a mistake—the HTTP protocol cannot be used for data streams (producing and consuming messages); it is only intended for management streams (creating topics, checking cluster status, etc.). Therefore, to verify whether HTTP is exposed outside the cluster, simply access it using the curl command.

[root@crazy producer]# curl http://100.100.3.198:30370/admin/v2/clusters
["pulsarcluster-sample"][root@crazy producer]#
[root@crazy producer]#
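The same check as the curl command can be done from Go with the standard library only. This is a minimal sketch: listClusters and newStubBroker are hypothetical names introduced here. When run without arguments, the program talks to a local stub server that stands in for the broker's admin endpoint (an assumption made so the example runs without a cluster); pass the NodePort URL, for example http://100.100.3.198:30370, as the first argument to query a real broker.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"time"
)

// listClusters calls GET <baseURL>/admin/v2/clusters and decodes the JSON
// array of cluster names, like the curl command above.
func listClusters(baseURL string) ([]string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(baseURL + "/admin/v2/clusters")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	var clusters []string
	if err := json.NewDecoder(resp.Body).Decode(&clusters); err != nil {
		return nil, err
	}
	return clusters, nil
}

// newStubBroker starts a local server that answers like the admin endpoint
// did in this thread. It is a stand-in only, not a real broker.
func newStubBroker() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/admin/v2/clusters" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `["pulsarcluster-sample"]`)
	}))
}

func main() {
	var baseURL string
	if len(os.Args) > 1 {
		baseURL = os.Args[1] // e.g. the NodePort URL of the broker's HTTP port
	} else {
		stub := newStubBroker()
		defer stub.Close()
		baseURL = stub.URL
	}
	clusters, err := listClusters(baseURL)
	if err != nil {
		fmt.Println("admin API not reachable:", err)
		return
	}
	fmt.Println("clusters:", clusters)
}
```

This only confirms that the admin HTTP port is reachable. It says nothing about message traffic, which uses the pulsar:// listener.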

As shown below, when the client runs inside the cluster, the broker returns the internal access address.

        client, err := pulsar.NewClient(pulsar.ClientOptions{
                URL:            "http://100.100.3.198:30370",
        })

[root@master producer]# go run http.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://10.0.0.148:6650"
INFO[0000] TCP connection established                    local_addr="10.0.0.202:36224" remote_addr="pulsar://10.0.0.148:6650"
INFO[0000] Connection is ready                           local_addr="10.0.0.202:36224" remote_addr="pulsar://10.0.0.148:6650"
INFO[0000] Connected producer                            cnx="10.0.0.202:36224 -> 10.0.0.148:6650" epoch=0 topic="persistent://public/default/test-topic"
INFO[0000] Created producer                              cnx="10.0.0.202:36224 -> 10.0.0.148:6650" producerID=1 producer_name=pulsarcluster-sample-2-315 topic="persistent://public/default/test-topic"
2025/04/16 19:17:04 Published message:  6:315:0
INFO[0000] Closing producer                              producerID=1 producer_name=pulsarcluster-sample-2-315 topic="persistent://public/default/test-topic"
INFO[0000] Closed producer                               producerID=1 producer_name=pulsarcluster-sample-2-315 topic="persistent://public/default/test-topic"

As shown below, when the client runs outside the cluster, the broker still returns the internal access address. I would expect the broker to return an externally accessible address for the pulsar protocol, not an internal address that is unreachable from outside the cluster. I don't know if I've misconfigured it, or if that's how Pulsar is designed.

        client, err := pulsar.NewClient(pulsar.ClientOptions{
                URL:            "http://100.100.3.198:30370",
                ListenerName:   "external",
        })

[root@crazy producer]# go run http.go 
INFO[0000] Connecting to broker                          remote_addr="pulsar://10.0.0.148:6650"
WARN[0004] Failed to connect to broker.                  error="dial tcp 10.0.0.148:6650: connect: connection timed out" remote_addr="pulsar://10.0.0.148:6650"
ERRO[0004] Failed to get connection                      topic="persistent://public/default/test-topic"
ERRO[0004] Failed to create producer at newPartitionProducer  error="connection error" topic="persistent://public/default/test-topic"
2025/04/16 19:24:34 connection error
exit status 1

trynocoding avatar Apr 16 '25 09:04 trynocoding

As follows, outside the cluster,the broker still returns the internal access address. I would expect the broker to return an externally accessible address for the pulsar protocol, not an invalid internal address for the cluster. I don't know if I've misconfigured it, or if that's how pulsar is designed.

There are some bugs/gaps with the http protocol and external listeners. I'd just recommend using Pulsar Proxy (or a generic reverse proxy like nginx) for admin API and not using http/https as the service URL at all so that it doesn't get used for lookups. For a generic reverse proxy, there would have to be some logic to follow redirects for internal addresses before returning responses to clients. That logic exists in the Pulsar Proxy already so it might be the easiest choice.

lhotari avatar Apr 16 '25 11:04 lhotari

One of the gaps: https://github.com/apache/pulsar/pull/22062

lhotari avatar Apr 16 '25 11:04 lhotari

@lhotari extremely grateful

trynocoding avatar Apr 16 '25 12:04 trynocoding

docs have been updated some time ago: https://pulsar.apache.org/docs/4.0.x/concepts-multiple-advertised-listeners

lhotari avatar May 30 '25 15:05 lhotari

You can also use an NLB if I am not mistaken. We are about to start using Pulsar through an internally exposed NLB so that other services can reach it from outside k8s. You would just need to bind the right service.

oren-cohen avatar Sep 01 '25 00:09 oren-cohen

You can also use an NLB if I am not mistaken. We are about to start using it using an NLB exposed internally so that other services can reach it from outside k8s. You would just need to bind the right service.

@oren-cohen Yes you can and that requires the Pulsar Proxy with apache-pulsar-helmchart, however the context of this issue is "I would like to access Pulsar brokers from outside the Kubernetes cluster without using pulsar-proxy". Apache Pulsar Helmchart doesn't currently provide other options than using the Pulsar Proxy.

lhotari avatar Sep 16 '25 14:09 lhotari