How to troubleshoot when a cluster connected to LogiKM cannot collect topic and other data
I use LogiKM to access a cluster. The cluster is displayed in the cluster list under operation and maintenance control, but when I view the cluster details, no cluster metrics are displayed on any page.
At the same time, clicking the controller information button pops up a "zookeeper connect failed" error, and the console output is as follows:
GET ...../api/v1/rd/clusters/3/controller-preferred-candidates
response:
{"data":null,"message":"zookeeper connect failed","tips":null,"code":8020}
The relevant part of the LogiKM error log file log_error_2022-07-18.0.log is as follows:
2022-07-18 23:58:45.025 [pool-12-thread-9] ERROR c.x.k.m.t.s.metadata.FlushBKConsumerGroupMetadata - collect consumerGroup failed, clusterId:3.
java.lang.RuntimeException: Request METADATA failed on brokers List(xx.xxx.xx.xx:9092 (id: -2 rack: null), xx.xxx.xx.xxx:9092 (id: -1 rack: null))
at kafka.admin.AdminClient.sendAnyNode(AdminClient.scala:66)
at kafka.admin.AdminClient.findAllBrokers(AdminClient.scala:90)
at kafka.admin.AdminClient.listAllGroups(AdminClient.scala:98)
at com.xiaojukeji.kafka.manager.task.schedule.metadata.FlushBKConsumerGroupMetadata.collectAndSaveConsumerGroup(FlushBKConsumerGroupMetadata.java:80)
at com.xiaojukeji.kafka.manager.task.schedule.metadata.FlushBKConsumerGroupMetadata.flush(FlushBKConsumerGroupMetadata.java:55)
at com.xiaojukeji.kafka.manager.task.schedule.metadata.FlushBKConsumerGroupMetadata.schedule(FlushBKConsumerGroupMetadata.java:43)
at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:93)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2022-07-18 23:59:25.349 [pool-12-thread-12] ERROR c.x.k.m.t.schedule.metadata.FlushTopicProperties - flush topic properties, get zk config failed, clusterId:8.
2022-07-18 23:59:35.002 [TaskThreadPool-1-179] ERROR c.x.k.m.t.s.metadata.FlushZKConsumerGroupMetadata - collect topicName and consumerGroup failed, clusterId:1 consumerGroup:monitor.metric.analyze.
com.xiaojukeji.kafka.manager.common.exception.ConfigException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/monitor.metric.analyze/offsets
at com.xiaojukeji.kafka.manager.common.zookeeper.ZkConfigImpl.getChildren(ZkConfigImpl.java:362)
at com.xiaojukeji.kafka.manager.task.schedule.metadata.FlushZKConsumerGroupMetadata$1.call(FlushZKConsumerGroupMetadata.java:95)
at com.xiaojukeji.kafka.manager.task.schedule.metadata.FlushZKConsumerGroupMetadata$1.call(FlushZKConsumerGroupMetadata.java:91)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/monitor.metric.analyze/offsets
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:214)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:203)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:200)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:191)
at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:38)
at com.xiaojukeji.kafka.manager.common.zookeeper.ZkConfigImpl.getChildren(ZkConfigImpl.java:360)
... 8 common frames omitted
I would like to ask why this exception occurs, and I look forward to your reply.
LogiKM version: v2.6.0. The exception has persisted for a few days, and the Kafka cluster remains accessible by other means.
{"data":null,"message":"zookeeper connect failed","tips":null,"code":8020}
- LogiKM reads most Kafka metadata from ZooKeeper. Given this error message, first check that LogiKM can reach the Kafka cluster's ZooKeeper.
- Regarding the "Request METADATA failed" error log: LogiKM only supports Kafka versions >= 0.10.2, so verify the Kafka version next.
- Finally, regarding the "KeeperErrorCode = NoNode for /consumers/monitor.metric.analyze/offsets" error log: check whether the ZooKeeper node used to record the consumer client's consumption progress actually exists.
Without further feedback, closing the issue.