service-dns substitutes ${serviceName} with empty string

Open focdanisch opened this issue 1 year ago • 3 comments

Hello,

This is the first time I am using the hazelcast helm chart, and I bumped into a problem right away when trying to use the dns lookup method in kubernetes. To showcase the problem, I cloned the repository first:

git clone git@github.com:hazelcast/charts.git
cd charts/stable/hazelcast/

Then I created a values file, values-bug.yaml, from this and the yaml property of the values.yaml file:

hazelcast:
  yaml:
    hazelcast:
      network:
        join:
          kubernetes:
            service-dns: "${serviceName}.${namespace}.svc.cluster.local"
        rest-api:
          enabled: true
      jet:
        enabled: ${hz.jet.enabled}
rbac:
  create: false
serviceAccount:
  create: false
mancenter:
  enabled: false

Then, I ran helm upgrade --install hz-bug -n default -f values-bug.yaml ., which led to the creation of three pods (as expected). Unfortunately, these pods don't find each other: serviceNameMissing.log

It seems that the change causing this problem was made in #384 (and later brought to its current state in #414). There, hazelcast.serviceNameConfig was added and, as far as I can tell, deliberately set to an empty string: if kubernetes is true and one of the three properties is set, default "" is applied. This empty string is then passed as the Java startup property -DserviceName=, which is reflected in the log (... -DserviceName= -Dnamespace=default ...). Because the property is empty, Hazelcast substitutes the ${serviceName} variable from the config.yaml with an empty string, which yields an incomplete DNS lookup domain name.
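
To make the effect concrete, here is a minimal sketch of what happens to the service-dns value when serviceName is empty. It uses Python's string.Template purely as a stand-in for the ${...} placeholder replacement; it is not Hazelcast's actual implementation, and the resolve helper is a hypothetical name chosen for illustration.

```python
from string import Template

# Value of service-dns as written in values-bug.yaml
SERVICE_DNS = "${serviceName}.${namespace}.svc.cluster.local"


def resolve(pattern: str, properties: dict) -> str:
    """Replace ${name} placeholders with the given property values.

    Hypothetical stand-in for the config variable replacement; it only
    mimics the observable behaviour described in this issue.
    """
    return Template(pattern).substitute(properties)


# Mirrors the startup properties from the log: -DserviceName= -Dnamespace=default
broken = resolve(SERVICE_DNS, {"serviceName": "", "namespace": "default"})
print(broken)  # .default.svc.cluster.local
```

The resulting name starts with a dot, i.e. the service part of the domain is missing, which would match the incomplete lookup name described above.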

To check my theory, I hardcoded the generated hazelcast service name by changing values-bug.yaml into this:

hazelcast:
  yaml:
    hazelcast:
      network:
        join:
          kubernetes:
            service-dns: "hz-bug-hazelcast.${namespace}.svc.cluster.local"
        rest-api:
          enabled: true
      jet:
        enabled: ${hz.jet.enabled}
rbac:
  create: false
serviceAccount:
  create: false
mancenter:
  enabled: false

After another helm upgrade --install hz-bug -n default -f values-bug.yaml ., the pods were replaced and after a few minutes, they found all three cluster members, as you can see here: serviceNamePresent.log
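
As a quick sanity check for the two variants, the following sketch flags empty labels in a resolved DNS name. It assumes the simple rule that every dot-separated label must be non-empty (so a trailing dot would be flagged as well); empty_labels is a hypothetical helper, not part of the chart or of Hazelcast.

```python
def empty_labels(dns_name: str) -> list:
    """Return the positions of empty dot-separated labels in dns_name."""
    return [i for i, label in enumerate(dns_name.split(".")) if label == ""]


# Name produced when serviceName is empty (namespace "default")
print(empty_labels(".default.svc.cluster.local"))                  # [0]
# Name with the hardcoded service name
print(empty_labels("hz-bug-hazelcast.default.svc.cluster.local"))  # []
```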

Unfortunately, I don't know the intention behind the changes made to the chart, so I am unable to propose a fix. I can only say that the configuration proposed in the documentation (and in multiple GitHub issue discussions) does not lead to a working Hazelcast installation.

focdanisch, Jun 27 '24 13:06

Hi @focdanisch

Thank you for reporting the bug. After #384, we lost the reference to ${serviceName}. I have created an internal task to address this issue, and it will be fixed soon.

semihbkgr, Jun 28 '24 14:06

Hi @semihbkgr,

Are there any updates on this issue?

focdanisch, Jul 25 '24 14:07

It is currently in the active sprint and will be fixed soon.

semihbkgr, Jul 29 '24 11:07