Are the new service names absolutely necessary?
Name and Version
bitnami/kafka 32.2.13
What is the problem this feature will solve?
Service names have changed from a few versions ago, going from...
<release name>-kafka-controller-<instance number>-external
...to...
<release name>-kafka-controller-<instance number>.<release name>-kafka-controller-headless.<release name>.svc.cluster.local
...with essentially the same configuration. Have I done something wrong? Is there some new mapping I need to provide to avoid this punishment? This took a while to find because I was not expecting any change, only to discover that this monster had been chosen. It seems unnecessary.
What is the feature you are proposing to solve the problem?
Go back to the old service name, or allow an alias to be mapped.
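For illustration, an alias like this can already be approximated outside the chart with an ExternalName Service (a rough, untested sketch; the names below are placeholders, and it only helps for bootstrap addresses, since brokers will still advertise the long FQDN):

kubectl apply -n prod1 -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  # hypothetical alias name; pick anything that does not clash with the chart's own Services
  name: kafka-controller-0-alias
spec:
  type: ExternalName
  # CNAME to the per-pod record behind the chart's headless Service
  externalName: prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local
EOF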
Hi @zenbones
Could you please share the values you're using? The "external" names shouldn't be affected by any change we introduced to the internal names. Thanks in advance.
Here's the entire section. Changes from the prior version are...
- externalAccess.service.broker and externalAccess.service.controller -> externalAccess.broker.service and externalAccess.controller.service
- added defaultInitContainers.autoDiscovery.enabled
- added broker.automountServiceAccountToken and controller.automountServiceAccountToken
- changed extraConfiguration: [] -> overrideConfiguration: {}
...and I think that's about it.
kafka:
  global:
    storageClass: gp2
  externalAccess:
    enabled: true
    autoDiscovery:
      enabled: true
    broker:
      service:
        type: LoadBalancer
        ports:
          external: 9095
    controller:
      service:
        type: LoadBalancer
        ports:
          external: 9095
        annotations:
          external-dns.alpha.kubernetes.io/hostname: "{{ .targetPod }}.forio.internal"
  serviceAccount:
    create: true
  rbac:
    create: true
  defaultInitContainers:
    autoDiscovery:
      enabled: true
  tls:
    sslClientAuth: none
  listeners:
    client:
      protocol: PLAINTEXT
    controller:
      protocol: PLAINTEXT
    interbroker:
      protocol: PLAINTEXT
    external:
      protocol: PLAINTEXT
  broker:
    automountServiceAccountToken: true
  controller:
    replicaCount: 2
    automountServiceAccountToken: true
    nodeSelector:
      kubernetes.io/os: linux
      silo: production-gxl
    podAntiAffinityPreset: soft
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: "topology.kubernetes.io/zone"
        whenUnsatisfiable: "ScheduleAnyway"
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: "{{ .Release.Name }}"
            app.kubernetes.io/name: "kafka"
    pdb:
      create: true
      maxUnavailable: 1
    extraEnvVars:
      - name: "JMX_PORT"
        value: "9101"
    overrideConfiguration:
      # For Acks=0
      # as few as 1 might be fine, consumer threads * broker nodes, at a guess (24 is still small)
      num.partitions: 24
      # default for number of replicas
      default.replication.factor: 1
      # min.insync.replicas=1, acks=0, replication.factor=1 for fast and un-replicated
      min.insync.replicas: 1
      # The Best Of The Rest
      # must have
      auto.create.topics.enable: true
      # allow re-balances
      auto.leader.rebalance.enable: true
      # none, gzip, lz4, snappy, and zstd (prefer lz4 as fastest if not smallest)
      compression.type: lz4
      # seems like we should
      delete.topic.enable: true
      # default 3000 (wait for consumers to join before first re-balance)
      group.initial.rebalance.delay.ms: 3000
      # how often to run the re-balance check
      leader.imbalance.check.interval.seconds: 300
      # The allowed percentage of partitions for which the broker is not the preferred leader before a re-balance occurs
      leader.imbalance.per.broker.percentage: 10
      # general consensus
      log.cleaner.backoff.ms: 15000
      # (24 hours) the retention time for deleted tombstone markers (as we do not use keys this should make no difference)
      log.cleaner.delete.retention.ms: 86400000
      # we want this
      log.cleaner.enable: true
      # if we were using keys we might use 'compact,delete'
      log.cleanup.policy: delete
      # (5 minutes) should be lower than log.retention.ms
      log.retention.check.interval.ms: 300000
      # ttl for messages (28 minutes)
      log.retention.ms: 1680000
      # maybe not necessary but not harmful
      log.segment.delete.delay.ms: 60000
      # default 1048588 (1mb, but slightly more than replica.fetch.max.bytes)
      message.max.bytes: 1048588
      # default 8
      num.io.threads: 8
      # default 3
      num.network.threads: 3
      # general consensus
      num.recovery.threads.per.data.dir: 2
      # we currently have just 2 servers
      offsets.topic.replication.factor: 2
      # default 500 (limit the size of the request queue before the network thread is blocked)
      queued.max.requests: 500
      # default (1mb, but slightly less than message.max.bytes)
      replica.fetch.max.bytes: 1048576
      # default 30000 (upper limit on how long a producer must wait for acknowledgement, but also how quickly slow replicas are removed)
      replica.lag.time.max.ms: 10000
      # we currently have just 2 servers
      transaction.state.log.min.isr: 2
      # we currently have just 2 servers
      transaction.state.log.replication.factor: 2
      # defaults to false, but if there's no in-sync follower when a leader fails, then no leader can be elected
      unclean.leader.election.enable: true
    podAnnotations:
      cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
      ad.datadoghq.com/kafka.check_names: '["kafka"]'
      ad.datadoghq.com/kafka.init_configs: '[{"is_jmx": true, "collect_default_metrics": true}]'
      ad.datadoghq.com/kafka.instances: |
        [
          {
            "host": "%%host%%",
            "port": "9101"
          }
        ]
    heapOpts: "-Xmx1024m -Xms1024m"
    resources:
      limits:
        cpu: 1000m
        memory: 1792Mi
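As a quick sanity check that the renamed paths listed above exist in the 32.x chart defaults, something like this works (assuming yq v4 is available):

helm show values oci://registry-1.docker.io/bitnamicharts/kafka --version 32.2.13 \
  | yq '.externalAccess.broker.service, .externalAccess.controller.service, .defaultInitContainers.autoDiscovery'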
Hi @zenbones
I've tried to reproduce your issue by running the command below (with the parameters that I think are relevant to reproduce it):
helm template kafka oci://registry-1.docker.io/bitnamicharts/kafka \
--set externalAccess.enabled=true \
--set defaultInitContainers.autoDiscovery.enabled=true \
--set externalAccess.controller.service.type=LoadBalancer,externalAccess.broker.service.type=LoadBalancer \
--set rbac.create=true,controller.automountServiceAccountToken=true,broker.automountServiceAccountToken=true
It's true that the server.properties configuration file includes the property below:
controller.quorum.bootstrap.servers=kafka-controller-0.kafka-controller-headless.default.svc.cluster.local:9093,kafka-controller-1.kafka-controller-headless.default.svc.cluster.local:9093,kafka-controller-2.kafka-controller-headless.default.svc.cluster.local:9093
However, that shouldn't affect the "external" listener, and the logic used by the "auto-discovery" and "prepare-config" init containers (which update advertised.listeners) shouldn't append the <release name>-kafka-controller-headless.<release name>.svc.cluster.local suffix to external listeners either.
With this in mind, could you exec into one of your Kafka pods and share the value of advertised.listeners in server.properties? Are you observing the suffix somewhere else?
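For example, something along these lines should show it (adjust the namespace and pod name; the path is where the Bitnami image normally renders the configuration):

kubectl exec -n <namespace> <release name>-kafka-controller-0 -- \
  grep -E '^(advertised\.)?listeners=' /opt/bitnami/kafka/config/server.properties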
Just in case, I grabbed the whole thing...
# Listeners configuration
listeners=CLIENT://:9092,INTERNAL://:9094,EXTERNAL://:9095,CONTROLLER://:9093
advertised.listeners=CLIENT://prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local:9092,INTERNAL://prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local:9094,EXTERNAL://ad35bef8b64fd434ab71abbcb0cfa770-816616358.us-east-1.elb.amazonaws.com:9094
listener.security.protocol.map=CLIENT:PLAINTEXT,INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT
# KRaft process roles
process.roles=controller,broker
node.id=0
controller.listener.names=CONTROLLER
controller.quorum.voters=0@prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local:9093,1@prod1-kafka-controller-1.prod1-kafka-controller-headless.prod1.svc.cluster.local:9093
# Kafka data logs directory
log.dir=/bitnami/kafka/data
# Kafka application logs directory
logs.dir=/opt/bitnami/kafka/logs
# Common Kafka Configuration
# Interbroker configuration
inter.broker.listener.name=INTERNAL
# Custom Kafka Configuration
# For Acks=0
# as few as 1 might be fine, consumer threads * broker nodes, at a guess (24 is still small)
num.partitions=24
# default for number of replicas
default.replication.factor=1
# min.insync.replicas=1, acks=0, replication.factor=1 for fast and un-replicated
min.insync.replicas=1
# The Best Of The Rest
# must have
auto.create.topics.enable=true
# allow re-balances
auto.leader.rebalance.enable=true
# none, gzip, lz4, snappy, and zstd (prefer lz4 as fastest if not smallest)
compression.type=lz4
# seems like we should
delete.topic.enable=true
# default 3000 (wait for consumers to join before first re-balance)
group.initial.rebalance.delay.ms=3000
# how often to run the re-balance check
leader.imbalance.check.interval.seconds=300
# The allowed percentage of partitions for which the broker is not the preferred leader before a re-balance occurs
leader.imbalance.per.broker.percentage=10
# general consensus
log.cleaner.backoff.ms=15000
# (24 hours) the retention time for deleted tombstone markers (as we do not use keys this should make no difference)
log.cleaner.delete.retention.ms=86400000
# we want this
log.cleaner.enable=true
# if we were using keys we might use 'compact,delete'
log.cleanup.policy=delete
# (5 minutes) should be lower than log.retention.ms
log.retention.check.interval.ms=300000
# ttl for messages (28 minutes)
log.retention.ms=1680000
# maybe not necessary but not harmful
log.segment.delete.delay.ms=60000
# default 1048588 (1mb, but slightly more than replica.fetch.max.bytes)
message.max.bytes=1048588
# default 8
num.io.threads=8
# default 3
num.network.threads=3
# general consensus
num.recovery.threads.per.data.dir=1
# we currently have just 2 servers
offsets.topic.replication.factor=2
# default 500 (limit the size of the request queue before the network thread is blocked)
queued.max.requests=500
# default (1mb, but slightly less than message.max.bytes)
replica.fetch.max.bytes=1048576
# default 30000 (upper limit on how long a producer must wait for acknowledgement, but also how quickly slow queues are removed)
replica.lag.time.max.ms=10000
# we currently have just 2 servers
transaction.state.log.min.isr=2
# we currently have just 2 servers
transaction.state.log.replication.factor=2
# defaults to false, but if there's no in-sync follower when a leader fails, then no leader can be elected
unclean.leader.election.enable=true
Apologies for the long delay; I was away on a family matter.
I can see the following advertised listener:
EXTERNAL://ad35bef8b64fd434ab71abbcb0cfa770-816616358.us-east-1.elb.amazonaws.com:9094
That's not following the <release name>-kafka-controller-<instance number>-external pattern you mentioned.
Let's look at the whole list, which might help me understand as well...
- CLIENT://prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local:9092
- INTERNAL://prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local:9094
- EXTERNAL://ad35bef8b64fd434ab71abbcb0cfa770-816616358.us-east-1.elb.amazonaws.com:9094
The EXTERNAL entry is the load balancer from the controller service type. That actual address is useless to me because it will change with every helm startup, and I can't add it to any external dependencies because I don't know what it will be and have no good method of discovery. Fortunately, I'm not really interested in external clients at the moment and could change the service type, but that may change in the future, and I would like the option to get this working with a clean external DNS name.
The current clients are INTERNAL, and that is definitely the DNS name that has changed. Previously, it was prod1-kafka-controller-0-external, which I believe is the service name. There may be some values entry that changes that service name, but I'm not using it, and prod1-kafka-controller-0-external does kind of make sense. However...
prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local
...which also makes some sense, seems gratuitously long and unnecessary, is undocumented as far as I can tell, and is not what one would expect from 'kubectl get services -n <namespace>'.
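For reference, the Service objects themselves can be listed directly (the label selector below is the chart's standard one, assuming it hasn't been overridden):

kubectl get services -n prod1 -l app.kubernetes.io/name=kafka,app.kubernetes.io/instance=prod1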
I know this is not a showstopper. I was just wondering why the plain service name, which used to be the default in the advertised listeners, was changed, and why this longer name is not documented.
Hi @zenbones
I was just wondering why the plain service name, which used to be the default in the advertised listeners, was changed
That's how A/AAAA records are assigned on K8s, see:
- https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#a-aaaa-records
You can use either the FQDN or the short name, since CoreDNS in K8s also resolves short names automatically via the search path. The DNS search path typically looks like this (check with cat /etc/resolv.conf in a Pod):
search my-namespace.svc.cluster.local svc.cluster.local cluster.local
So when you try to access the service foo, Kubernetes tries these in order:
- foo.my-namespace.svc.cluster.local ✅ (match found)
- foo.svc.cluster.local
- foo.cluster.local
That's why using a short name works within the same namespace. However, if you try to access foo from a different namespace, it will fail. With the FQDN you won't experience this, since you're including the namespace in the record.
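A quick way to see this from inside any Pod (assuming the image ships nslookup; foo and my-namespace are placeholders):

# Search domains appended by the resolver
cat /etc/resolv.conf

# Short name: resolved via the search path, so it only works as-is from the same namespace
nslookup foo

# FQDN: unambiguous from any namespace
nslookup foo.my-namespace.svc.cluster.local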
So why did prod1-kafka-controller-0-external work just fine in the earlier version of this chart, while now I need to use prod1-kafka-controller-0.prod1-kafka-controller-headless.prod1.svc.cluster.local? There has been no change in my configuration (the values.yaml), so I'm thinking it must be a change in the chart. Am I wrong?
What I'm asking is, what in the chart has changed? Did you mean it to change? Why is this change not documented?
I'm not upset about the suddenly longer name, although it's a little annoying; I am upset about an unannounced, undocumented change that, maybe, you did not intend.
It's very likely the change was introduced by me at https://github.com/bitnami/charts/pull/32516 as part of the changes to remove support for ZooKeeper, see:
- https://github.com/bitnami/charts/tree/main/bitnami/kafka#to-3200
In that version, we switched from a static quorum to a dynamic quorum in KRaft, and that change implies some changes in the DNS names we set in the controller.quorum.bootstrap.servers property.
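For anyone comparing the two modes, the difference is visible in the rendered server.properties (both forms appear earlier in this thread; the path is the usual one in the Bitnami image, and the hostnames are placeholders):

# Static quorum pins every voter by node id:
#   controller.quorum.voters=0@<host-0>:9093,1@<host-1>:9093
# Dynamic quorum only lists bootstrap endpoints:
#   controller.quorum.bootstrap.servers=<host-0>:9093,<host-1>:9093
grep -E '^controller\.quorum\.' /opt/bitnami/kafka/config/server.properties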
That's a very reasonable answer. Maybe add a note to the docs or the changelog? Thanks for clarifying this for me.