
Incorrect cfg for clustering?

Open · ken-tune opened this issue · 2 comments

First - thanks for putting together this great resource.

I've found that if I try and 'cluster' by doing something like

```yaml
- name: "Kafka"
  hosts: "{{ kafka_tag }}"
  remote_user: "{{ os_config['remote_user'] }}"
  become: yes
  vars:
    kafka_version: "2.8.1"
    zookeeper_servers: "{{ groups[kafka_tag] }}"
    zookeeper_id: "{{ groups[kafka_tag].index(inventory_hostname) + 1 }}"
  vars_files:
    - vars/cluster-config.yml
    - vars/constants.yml
    - vars/os-level-config.yml
  roles:
    - sleighzy.kafka
    - sleighzy.zookeeper
```

the zoo.cfg ends up as

```
server.1=ip-10-0-0-65.ec2.internal:2888:3888
server.1=ip-10-0-1-234.ec2.internal:2888:3888
server.1=ip-10-0-2-207.ec2.internal:2888:3888
```

whereas it should be something like

```
server.1=ip-10-0-0-65.ec2.internal:2888:3888
server.2=ip-10-0-1-234.ec2.internal:2888:3888
server.3=ip-10-0-2-207.ec2.internal:2888:3888
```

otherwise all the servers end up in standalone mode.

See https://mail-archives.apache.org/mod_mbox/zookeeper-user/202111.mbox/browser

I don't know how easily fixable this is. I can try and look into it if you like.

ken-tune avatar Nov 04 '21 17:11 ken-tune

Hi @ken-tune .

Presumably based on your usage of kafka_tag you're using AWS and dynamic inventory. What script/tooling are you using for this?

At first glance it looks like you may be defining this at the playbook level, so it may not be evaluated per host, as opposed to setting it in the host vars...? For example, if this evaluates to 1 initially and is then passed to the role as the default value, then because there is no individual zookeeper_id on each host it will fall back on the default for every entry. See https://github.com/sleighzy/ansible-zookeeper/blob/master/templates/zoo.cfg.j2#L48

```
server.{{ hostvars[host].zookeeper_id | default(zookeeper_id) }}.............
```
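i.e. presumably the template loops over all hosts and falls back to the play-level value whenever a host has no zookeeper_id of its own. Roughly something like this (a sketch, not the exact template; the loop variable names are guesses):

```
{% for host in zookeeper_hosts %}
server.{{ hostvars[host].zookeeper_id | default(zookeeper_id) }}={{ host }}:2888:3888
{% endfor %}
```

If `hostvars[host].zookeeper_id` is undefined for every host, each line falls back to the current host's own `zookeeper_id`, which would produce exactly the repeated `server.1` / `server.1` / `server.1` output you're seeing.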

Presumably the /var/lib/zookeeper/myid file on all your nodes contains this same 1 value as well?

I don't know how you currently tag your EC2 instances, but you could specify this value in a tag there, e.g. one named zookeeper_id, and then use that in the host vars, although I don't know how much additional overhead that adds for you.
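For illustration, with a static inventory that per-host value would just live in host_vars, something like this (file name and value are hypothetical examples):

```yaml
# host_vars/ip-10-0-0-65.ec2.internal.yml (hypothetical file name)
zookeeper_id: 1
```

Each host then carries its own id, so the template's `hostvars[host].zookeeper_id` lookup resolves per host rather than hitting the default.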

I'll need to set aside some time to test a similar config to yours locally to see if I can replicate this and provide something more helpful.

sleighzy avatar Nov 05 '21 05:11 sleighzy

Hi @sleighzy

If it helps, what's seen on the other two hosts is

```
server.2=ip-10-0-0-65.ec2.internal:2888:3888
server.2=ip-10-0-1-234.ec2.internal:2888:3888
server.2=ip-10-0-2-207.ec2.internal:2888:3888
```

&

```
server.3=ip-10-0-0-65.ec2.internal:2888:3888
server.3=ip-10-0-1-234.ec2.internal:2888:3888
server.3=ip-10-0-2-207.ec2.internal:2888:3888
```

with the content of /var/lib/zookeeper/myid matching the X in server.X in each case. So zookeeper_id is being evaluated at the host level.

I can figure out a workaround (run a subsequent process to correct the files), so maybe I'll just leave this with you, and if I have some free cycles I'll work on a patch if you don't beat me to it.
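For what it's worth, one possible workaround (a sketch, untested): since facts set via `set_fact` are visible through `hostvars` for other hosts, computing the id in a `pre_tasks` step instead of in play `vars` might let the template's `hostvars[host].zookeeper_id` lookup succeed for every host:

```yaml
# Sketch only, untested: set the id as a host fact so it is visible
# via hostvars[host] when zoo.cfg is templated for each node.
- name: "Kafka"
  hosts: "{{ kafka_tag }}"
  pre_tasks:
    - name: Compute a per-host zookeeper id
      set_fact:
        zookeeper_id: "{{ groups[kafka_tag].index(inventory_hostname) + 1 }}"
  roles:
    - sleighzy.zookeeper
```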

ken-tune avatar Nov 05 '21 10:11 ken-tune