kubeblocks
[BUG] Pulsar cluster: pulsar-proxy crashes and bookies-recovery stays in Init when created with serviceRefs to a ZooKeeper cluster
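Describe the bug: When a Pulsar cluster is created with serviceRefs pointing to a separate ZooKeeper cluster, pulsar-proxy goes into CrashLoopBackOff and bookies-recovery stays in Init:0/1. Both are configured with the ZooKeeper endpoint "pulsar-cluster-zookeeper.default.svc:2181", which cannot be resolved, so serviceRefs appears not to take effect for these two components.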
To Reproduce Steps to reproduce the behavior:
- create zk cluster
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: zookeeperp-cluster
  namespace: default
spec:
  clusterDefinitionRef: pulsar-zookeeper
  clusterVersionRef: pulsar-3.0.2
  terminationPolicy: Delete
  affinity:
    podAntiAffinity: Preferred
    topologyKeys:
      - kubernetes.io/hostname
    tenancy: SharedNode
  tolerations:
    - key: kb-data
      operator: Equal
      value: "true"
      effect: NoSchedule
  componentSpecs:
    - name: zookeeper
      componentDefRef: zookeeper
      monitor: false
      replicas: 3
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
- create pulsar cluster
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  labels:
    clusterdefinition.kubeblocks.io/name: pulsar
    clusterversion.kubeblocks.io/name: pulsar-3.0.2
  name: pulsar-cluster
  namespace: default
spec:
  clusterDefinitionRef: pulsar
  clusterVersionRef: pulsar-3.0.2
  componentSpecs:
    - componentDefRef: pulsar-broker
      monitor: false
      name: pulsar-broker
      replicas: 3
      resources:
        limits:
          cpu: "0.5"
          memory: 0.5Gi
        requests:
          cpu: "0.5"
          memory: 0.5Gi
      serviceAccountName: kb-pulsar-cluster
      serviceRefs:
        - cluster: zookeeperp-cluster
          name: pulsarZookeeper
          namespace: default
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
    - componentDefRef: pulsar-proxy
      monitor: true
      name: pulsar-proxy
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: 0.5Gi
        requests:
          cpu: "0.5"
          memory: 0.5Gi
      serviceRefs:
        - cluster: zookeeperp-cluster
          name: pulsarZookeeper
          namespace: default
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
    - componentDefRef: bookies
      monitor: true
      name: bookies
      replicas: 3
      resources:
        limits:
          cpu: "0.5"
          memory: 0.5Gi
        requests:
          cpu: "0.5"
          memory: 0.5Gi
      serviceRefs:
        - cluster: zookeeperp-cluster
          name: pulsarZookeeper
          namespace: default
      volumeClaimTemplates:
        - name: journal
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
        - name: ledgers
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
    - componentDefRef: bookies-recovery
      monitor: true
      name: bookies-recovery
      replicas: 1
      resources:
        limits:
          cpu: "0.5"
          memory: 0.5Gi
        requests:
          cpu: "0.5"
          memory: 0.5Gi
      serviceRefs:
        - cluster: zookeeperp-cluster
          name: pulsarZookeeper
          namespace: default
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
  services:
    - componentSelector: proxy
      name: proxy
      serviceName: proxy
      spec:
        ports:
          - name: pulsar
            port: 6650
            protocol: TCP
            targetPort: 6650
          - name: http
            port: 80
            protocol: TCP
            targetPort: 8080
        type: ClusterIP
    - componentSelector: broker
      name: broker-bootstrap
      serviceName: broker-bootstrap
      spec:
        ports:
          - name: pulsar
            port: 6650
            protocol: TCP
            targetPort: 6650
          - name: http
            port: 80
            protocol: TCP
            targetPort: 8080
          - name: kafka-client
            port: 9092
            protocol: TCP
            targetPort: 9092
        type: ClusterIP
  terminationPolicy: Delete
  tolerations:
    - effect: NoSchedule
      key: kb-data
      operator: Equal
      value: "true"
- See error
kubectl get pod
NAME                                READY   STATUS             RESTARTS       AGE
pulsar-cluster-bookies-0            2/2     Running            0              9m47s
pulsar-cluster-bookies-1            2/2     Running            0              9m47s
pulsar-cluster-bookies-2            2/2     Running            0              9m47s
pulsar-cluster-bookies-recovery-0   0/2     Init:0/1           0              9m51s
pulsar-cluster-pulsar-broker-0      3/3     Running            0              9m46s
pulsar-cluster-pulsar-broker-1      3/3     Running            0              9m46s
pulsar-cluster-pulsar-broker-2      3/3     Running            0              9m46s
pulsar-cluster-pulsar-proxy-0       1/2     CrashLoopBackOff   5 (108s ago)   9m50s
zookeeperp-cluster-zookeeper-0      2/2     Running            0              9m52s
zookeeperp-cluster-zookeeper-1      2/2     Running            0              9m52s
zookeeperp-cluster-zookeeper-2      2/2     Running            0              9m52s
Logs of the CrashLoopBackOff pod pulsar-proxy: serviceRefs does not take effect, and the ZooKeeper endpoint is still "pulsar-cluster-zookeeper.default.svc:2181"
kubectl logs pulsar-cluster-pulsar-proxy-0 proxy --tail 30
[conf/proxy.conf] Updating config statusFilePath=/pulsar/status
[conf/proxy.conf] Adding config: maxMessageSize=5242880
[conf/proxy.conf] Applying config brokerServiceURL = pulsar://pulsar-cluster-pulsar-broker:6650
[conf/proxy.conf] Applying config brokerWebServiceURL = http://pulsar-cluster-pulsar-broker:80
[conf/proxy.conf] Applying config clusterName = default-pulsar-cluster-pulsar-proxy
[conf/proxy.conf] Applying config metadataStoreUrl = pulsar-cluster-zookeeper.default.svc:2181
[conf/proxy.conf] Applying config webServicePort = 8080
VM settings:
Max. Heap Size (Estimated): 154.00M
Using VM: OpenJDK 64-Bit Server VM
2024-04-12T07:25:10,862+0000 [main] INFO org.apache.pulsar.broker.authentication.AuthenticationService - Authentication is disabled
2024-04-12T07:25:11,360+0000 [main] INFO org.apache.pulsar.proxy.extensions.ProxyExtensionsUtils - Searching for extensions in /pulsar/./proxyextensions
2024-04-12T07:25:11,360+0000 [main] WARN org.apache.pulsar.proxy.extensions.ProxyExtensionsUtils - extension directory not found
2024-04-12T07:25:11,456+0000 [main] INFO org.eclipse.jetty.util.log - Logging initialized @4392ms to org.eclipse.jetty.util.log.Slf4jLog
2024-04-12T07:25:11,761+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.8.3-6ad6d364c7c0bcf0de452d54ebefa3058098ab56, built on 2023-10-05 10:34 UTC
2024-04-12T07:25:11,761+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=pulsar-cluster-pulsar-proxy-0.pulsar-cluster-pulsar-proxy-headless.default.svc.cluster.local
2024-04-12T07:25:11,761+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=17.0.7
2024-04-12T07:25:11,761+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Debian
2024-04-12T07:25:11,761+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-17-openjdk-arm64
...
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
2024-04-12T07:22:13,061+0000 [main-SendThread(pulsar-cluster-zookeeper.default.svc:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server pulsar-cluster-zookeeper.default.svc/<unresolved>:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.lang.IllegalArgumentException: Unable to canonicalize address pulsar-cluster-zookeeper.default.svc/<unresolved>:2181 because it's not resolvable
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:78) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1157) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1207) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
2024-04-12T07:22:14,163+0000 [main-SendThread(pulsar-cluster-zookeeper.default.svc:2181)] ERROR org.apache.zookeeper.client.StaticHostProvider - Unable to resolve address: pulsar-cluster-zookeeper.default.svc/<unresolved>:2181
java.net.UnknownHostException: pulsar-cluster-zookeeper.default.svc
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
2024-04-12T07:22:14,163+0000 [main-SendThread(pulsar-cluster-zookeeper.default.svc:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server pulsar-cluster-zookeeper.default.svc/<unresolved>:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.lang.IllegalArgumentException: Unable to canonicalize address pulsar-cluster-zookeeper.default.svc/<unresolved>:2181 because it's not resolvable
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:78) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1157) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1207) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
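To make the misconfigured endpoint easier to spot in long logs, the following is a minimal sketch that pulls only the ZooKeeper endpoint out of the "Applying config" lines shown above. The function name `extract_zk_endpoint` is hypothetical (not part of KubeBlocks or Pulsar), and the pattern only assumes the log line format that appears in this report (`metadataStoreUrl` for the proxy, `zkServers` for BookKeeper).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read container logs on stdin and print the ZooKeeper
# endpoint(s) written by apply-config-from-env.py, one per line.
# Matches lines such as:
#   [conf/proxy.conf] Applying config metadataStoreUrl = host:2181
#   [conf/bookkeeper.conf] Applying config zkServers = host:2181
extract_zk_endpoint() {
  sed -n -E 's/^\[conf\/[^]]*\] Applying config (metadataStoreUrl|zkServers) = (.*)$/\2/p' | sort -u
}

# Usage (requires access to the cluster, so it is left commented out here):
#   kubectl logs pulsar-cluster-pulsar-proxy-0 proxy | extract_zk_endpoint
#   kubectl logs pulsar-cluster-bookies-recovery-0 check-bookies | extract_zk_endpoint
```

If serviceRefs were applied, the printed endpoint would be expected to point at the referenced ZooKeeper cluster rather than at a `pulsar-cluster-zookeeper` Service.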
Logs of bookies-recovery (check-bookies container): the same unresolvable ZooKeeper endpoint is used
kubectl logs pulsar-cluster-bookies-recovery-0 check-bookies --tail 30
+ bin/apply-config-from-env.py conf/bookkeeper.conf
[conf/bookkeeper.conf] Applying config httpServerEnabled = true
[conf/bookkeeper.conf] Applying config httpServerPort = 8000
[conf/bookkeeper.conf] Applying config lostBookieRecoveryDelay = 300
[conf/bookkeeper.conf] Applying config prometheusStatsHttpPort = 8000
[conf/bookkeeper.conf] Applying config useHostNameAsBookieID = true
[conf/bookkeeper.conf] Applying config zkServers = pulsar-cluster-zookeeper.default.svc:2181
+ bin/bookkeeper shell whatisinstanceid
JAVA_HOME not set, using java from PATH. (/usr/bin/java)
[0.004s][trace][gc,heap] Maximum heap size 7326386176
[0.004s][trace][gc,heap] Initial heap size 228949568
[0.004s][trace][gc,heap] Minimum heap size 6815736
[0.005s][debug][gc,heap] Minimum heap 8388608 Initial heap 230686720 Maximum heap 7327449088
[0.005s][info ][gc ] Using G1
[0.010s][info ][gc,init] Version: 17.0.7+7-Debian-1deb11u1 (release)
[0.010s][info ][gc,init] CPUs: 7 total, 7 available
[0.010s][info ][gc,init] Memory: 13973M
[0.010s][info ][gc,init] Large Page Support: Disabled
[0.010s][info ][gc,init] NUMA Support: Disabled
[0.010s][info ][gc,init] Compressed Oops: Enabled (Zero based)
[0.010s][info ][gc,init] Heap Region Size: 4M
[0.010s][info ][gc,init] Heap Min Capacity: 8M
[0.010s][info ][gc,init] Heap Initial Capacity: 220M
[0.010s][info ][gc,init] Heap Max Capacity: 6988M
[0.010s][info ][gc,init] Pre-touch: Disabled
[0.010s][info ][gc,init] Parallel Workers: 4
[0.010s][info ][gc,init] Concurrent Workers: 4
[0.010s][info ][gc,init] Concurrent Refinement Workers: 4
[0.010s][info ][gc,init] Periodic GC: Disabled
[0.010s][info ][gc,metaspace] CDS archive(s) mapped at: [0x0000000800000000-0x0000000800be2000-0x0000000800be2000), size 12460032, SharedBaseAddress: 0x0000000800000000, ArchiveRelocationMode: 0.
[0.010s][info ][gc,metaspace] Compressed class space mapped at: 0x0000000801000000-0x0000000841000000, reserved size: 1073741824
[0.010s][info ][gc,metaspace] Narrow klass base: 0x0000000800000000, Narrow klass shift: 0, Narrow klass range: 0x100000000
[0.296s][info ][safepoint ] Safepoint "ICBufferFull", Time since last: 277307959 ns, Reaching safepoint: 500625 ns, At safepoint: 16000 ns, Total: 516625 ns
[0.554s][info ][safepoint ] Safepoint "ICBufferFull", Time since last: 257106792 ns, Reaching safepoint: 305875 ns, At safepoint: 18291 ns, Total: 324166 ns
[0.823s][info ][safepoint ] Safepoint "ICBufferFull", Time since last: 268665625 ns, Reaching safepoint: 221542 ns, At safepoint: 4375 ns, Total: 225917 ns
[0.978s][info ][safepoint ] Safepoint "ICBufferFull", Time since last: 155170958 ns, Reaching safepoint: 212792 ns, At safepoint: 8792 ns, Total: 221584 ns
2024-04-12T07:14:39,428+0000 [main] INFO org.apache.bookkeeper.meta.MetadataDrivers - BookKeeper metadata driver manager initialized
2024-04-12T07:14:39,457+0000 [main] INFO org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase - Initialize zookeeper metadata driver at metadata service uri zk+null://pulsar-cluster-zookeeper.default.svc:2181/ledgers : zkServers = pulsar-cluster-zookeeper.default.svc:2181, ledgersRootPath = /ledgers.
2024-04-12T07:14:39,476+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.8.3-6ad6d364c7c0bcf0de452d54ebefa3058098ab56, built on 2023-10-05 10:34 UTC
2024-04-12T07:14:39,477+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=pulsar-cluster-bookies-recovery-0.pulsar-cluster-bookies-recovery-headless.default.svc.cluster.local
2024-04-12T07:14:39,478+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=17.0.7
2024-04-12T07:14:39,478+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Debian
2024-04-12T07:14:39,478+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/usr/lib/jvm/java-17-openjdk-arm64
...
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
2024-04-12T07:26:34,013+0000 [main-SendThread(pulsar-cluster-zookeeper.default.svc:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server pulsar-cluster-zookeeper.default.svc/<unresolved>:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.lang.IllegalArgumentException: Unable to canonicalize address pulsar-cluster-zookeeper.default.svc/<unresolved>:2181 because it's not resolvable
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:78) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1157) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1207) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
2024-04-12T07:26:35,114+0000 [main-SendThread(pulsar-cluster-zookeeper.default.svc:2181)] ERROR org.apache.zookeeper.client.StaticHostProvider - Unable to resolve address: pulsar-cluster-zookeeper.default.svc/<unresolved>:2181
java.net.UnknownHostException: pulsar-cluster-zookeeper.default.svc
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
at java.net.InetAddress.getAllByName0(InetAddress.java:1533) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1385) ~[?:?]
at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
at org.apache.zookeeper.client.StaticHostProvider$1.getAllByName(StaticHostProvider.java:88) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.resolve(StaticHostProvider.java:141) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:368) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1204) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
2024-04-12T07:26:35,114+0000 [main-SendThread(pulsar-cluster-zookeeper.default.svc:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server pulsar-cluster-zookeeper.default.svc/<unresolved>:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.lang.IllegalArgumentException: Unable to canonicalize address pulsar-cluster-zookeeper.default.svc/<unresolved>:2181 because it's not resolvable
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:78) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1157) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1207) ~[org.apache.zookeeper-zookeeper-3.8.3.jar:3.8.3]
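The UnknownHostException above suggests that no Service named by the configured endpoint exists in the namespace. A rough sketch for confirming this is shown below. The function name `service_exists_for_endpoint` is hypothetical, and it assumes the endpoint has the usual `<service>.<namespace>.svc:<port>` form (as in the logs), so the first DNS label is taken as the Service name. It reads the output of `kubectl get svc -o name` on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given an endpoint "host:port" as $1 and the output of
# `kubectl get svc -o name` on stdin (lines like "service/<name>"), report
# whether a Service matching the first DNS label of the host exists.
service_exists_for_endpoint() {
  local endpoint="$1"
  local host="${endpoint%%:*}"   # drop the ":port" suffix
  local svc="${host%%.*}"        # first DNS label, assumed to be the Service name
  if grep -qxF "service/${svc}"; then
    echo "found: ${svc}"
  else
    echo "missing: ${svc}"
    return 1
  fi
}

# Usage (requires access to the cluster, so it is left commented out here):
#   kubectl get svc -n default -o name \
#     | service_exists_for_endpoint "pulsar-cluster-zookeeper.default.svc:2181"
```

A "missing" result here would be consistent with the logs: the ZooKeeper pods belong to zookeeperp-cluster, so a Service prefixed with pulsar-cluster is not expected to exist.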
This issue has been marked as stale because it has been open for 30 days with no activity