service-dns substitutes ${serviceName} with empty string
Hello,
This is the first time I am using the hazelcast helm chart, and I bumped into a problem right away when trying to use the dns lookup method in kubernetes. To showcase the problem, I cloned the repository first:
git clone [email protected]:hazelcast/charts.git
cd stable/hazelcast/
Then, I created a values-file values-bug.yaml from this and the yaml property from the values.yaml file:
hazelcast:
yaml:
hazelcast:
network:
join:
kubernetes:
service-dns: "${serviceName}.${namespace}.svc.cluster.local"
rest-api:
enabled: true
jet:
enabled: ${hz.jet.enabled}
rbac:
create: false
serviceAccount:
create: false
mancenter:
enabled: false
Then, I ran helm upgrade --install hz-bug -n default -f values-bug.yaml ., which led to the creation of three pods (as expected). Unfortunately, these pods don't find each other:
serviceNameMissing.log
It seems, that the file change causing this problem was done in #384 (later changed to its current state in #414). There, hazelcast.serviceNameConfig was added, and (as far as I can tell) it was deliberately set to an empty string (if kubernetes is true and one of the three properties is set, default "" is applied). And this empty string is then set as Java-Startup-Property -DserviceName=. This is reflected in the log. As we can see, the property is indeed an empty string (... -DserviceName= -Dnamespace=default ...=. So, as this property is the empty string, hazelcast substitutes the variable from the config.yaml with an empty string, leading to an incomplete DNS lookup domain name.
To check my theory, I hardcoded the generated hazelcast service name by changing values-bug.yamlinto this:
hazelcast:
yaml:
hazelcast:
network:
join:
kubernetes:
service-dns: "hz-bug-hazelcast.${namespace}.svc.cluster.local"
rest-api:
enabled: true
jet:
enabled: ${hz.jet.enabled}
rbac:
create: false
serviceAccount:
create: false
mancenter:
enabled: false
After another helm upgrade --install hz-bug -n default -f values-bug.yaml ., the pods were replaced and after a few minutes, they found all three cluster members, as you can see here:
serviceNamePresent.log
Unfortunately, I don't know the intention of the changes made to the chart, so I am unable to propose a fix for the problem. I can only say, that the proposed configuration from the documentation (and from multiple github-issue-discussions) does not lead to a working hazelcast installation.
Hi @focdanisch
Thank you for reporting the bug. After #384, we lost the reference to ${serviceName}. I have created an internal task to address this issue, and it will be fixed soon.
Hi @semihbkgr ,
Are there any updates on this issue?
It is currently in the active sprint, will be fixed soon.