
logstash kafka input consumer rebalance error

Open jaywayjayway opened this issue 8 years ago • 6 comments

log4j, [2016-06-15T09:39:07.530] ERROR: kafka.consumer.ZookeeperConsumerConnector: [logstash_node1-1465954729843-dcbfad63], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: logstash_node1-1465954729843-dcbfad63 can't rebalance after 8 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:633)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:551)
log4j, [2016-06-15T09:39:23.632] ERROR: kafka.consumer.ZookeeperConsumerConnector: [logstash_node1-1465954729843-dcbfad63], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: logstash_node1-1465954729843-dcbfad63 can't rebalance after 8 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:633)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:551)

I have a 3-node Kafka cluster and 3 Logstash Kafka input instances.

input {
  kafka {
    zk_connect            => "10.10.0.11:2181,10.10.0.12:2181,10.10.0.13:2181"
    topic_id              => "nginx"
    reset_beginning       => true
    rebalance_max_retries => 8
    rebalance_backoff_ms  => 2000
  }
}

filter {
  geoip {
    source => "remote_addr"
  }

  grok {
    match => {
      "request" => "\w (?<request>.*) .*"
    }
    overwrite => ["request"]
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["localhost:9200", "192.168.122.12:9200", "192.168.122.13:9200"]
    index => "logstash-nginx-%{+YYYY.MM.dd}"
  }
}

The topic has 20 partitions. Whenever I add a new consumer or remove one, all consumers go down.

It then shows me:

Group      Topic   Pid  Offset   logSize  Lag  Owner
logstash   nginx   0    602461   602462   1    none
logstash   nginx   1    375755   375756   1    none
logstash   nginx   2    602215   602215   0    none
logstash   nginx   3    1007     1007     0    none
logstash   nginx   4    1007     1007     0    none
.....
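
Output in this shape comes from Kafka's old ZooKeeper-based offset checker; presumably it was produced by something like the following (group name "logstash" taken from the table above):

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zookeeper 10.10.0.11:2181 --group logstash --topic nginx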

I googled it, and the advice I found was:

rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms

I set those values accordingly, but the problem is still not fixed.
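
As a quick sanity check against that inequality (assuming the old high-level consumer's default zookeeper.session.timeout.ms of 6000 ms was not overridden), the settings above already satisfy it:

rebalance_max_retries * rebalance_backoff_ms = 8 * 2000 ms = 16000 ms
16000 ms > 6000 ms  (default zookeeper.session.timeout.ms)

So the inequality holds, yet the rebalance still fails.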

Can anyone help? ~_~

jaywayjayway avatar Jun 15 '16 02:06 jaywayjayway

Have you checked zk for errors? The Kafka broker logs might also say something. Are you still running Logstash 1.4.X?

joekiller avatar Jun 15 '16 13:06 joekiller

This issue cannot be solved by increasing rebalance.max.retries. After checking the consumer source, I found that the Logstash client triggers rebalances via ZKTopicPartitionChangeListener and ZKRebalancerListener.

I am using Logstash 2.1. My question is: why CAN'T Logstash restart the consumer successfully? I checked kafka.rb in the Logstash code and found an option, consumer_restart_on_error, which defaults to true. It does not seem to work.
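
For reference, a minimal sketch of the relevant input settings. consumer_restart_on_error is the option mentioned above; consumer_restart_sleep_ms is assumed here as its companion backoff option, so check whether your plugin version supports it:

input {
  kafka {
    zk_connect                => "10.10.0.11:2181,10.10.0.12:2181,10.10.0.13:2181"
    topic_id                  => "nginx"
    # Restart the consumer thread after an error (plugin default is true)
    consumer_restart_on_error => true
    # Assumed option: back off before restarting so the old ZK session
    # has a chance to expire first
    consumer_restart_sleep_ms => 5000
  }
}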

yzhang226 avatar Aug 22 '16 03:08 yzhang226

Attaching my error:

log4j, [2016-08-21T23:04:27.460] ERROR: kafka.consumer.ZookeeperConsumerConnector: [logstash_dada-es01-1471791305675-5932a34b], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: logstash_dada-es01-1471791305675-5932a34b can't rebalance after 50 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:633)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:551)
log4j, [2016-08-21T23:06:07.788] ERROR: kafka.consumer.ZookeeperConsumerConnector: [logstash_dada-es01-1471791305675-5932a34b], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: logstash_dada-es01-1471791305675-5932a34b can't rebalance after 50 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:633)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:551)

yzhang226 avatar Aug 22 '16 03:08 yzhang226

I'm facing the same error. Any news on this?

Logstash: 2.4.0
Kafka: 0.10.0.0
Zookeeper: 3.4.6

Regards.

chilcano avatar Oct 14 '16 11:10 chilcano

change "reset_beginning => true" to "reset_beginning => false" can solve it !

xiangyu123 avatar Nov 01 '16 10:11 xiangyu123

Just a message for posterity. Ran into the exact same exception while downgrading from Logstash 5.x to 2.x. In 5.x, topic config ("topics") is an array but in 2.x it's a single topic ("topic_id") represented by a string.

Because of this, my config mistakenly had an array (e.g. ["logstash-agent"]) for the topic_id config when it should have just been the plain string "logstash-agent".
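
In config terms, the difference looks like this (a sketch; "logstash-agent" is my topic name):

# Logstash 5.x: the new-consumer input takes an array of topics
kafka { topics => ["logstash-agent"] }

# Logstash 2.x: the ZooKeeper-based input takes a single topic string
kafka { topic_id => "logstash-agent" }

# Wrong on 2.x -- an array where a string is expected
kafka { topic_id => ["logstash-agent"] }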

Correcting this immediately resolved my issue. Hope that helps save someone pain in the future :)

cjchand avatar Dec 02 '16 16:12 cjchand