
[error] ** System NOT running to use fully qualified hostnames ** Kubernetes DNSSRV Strategy

Open paltaa opened this issue 4 years ago • 32 comments

So I've followed this tutorial:

https://tech.xing.com/creating-an-erlang-elixir-cluster-on-kubernetes-d53ef89758f6

The logs show these errors:

14:02:03.126 [error] ** System NOT running to use fully qualified hostnames **
** Hostname 192-168-4-107.ueuropea-excalibur-headless-service.default.svc.cluster.local is illegal **
14:02:03.325 [warn] [libcluster:k8s_excalibur] unable to connect to :"excalibur@192-168-4-107.ueuropea-excalibur-headless-service.default.svc.cluster.local"

`root@ueuropea-excalibur-74df5dddbc-kjfql:/excalibur# nslookup -q=srv ueuropea-excalibur-headless-service.default.svc.cluster.local
Server: 10.100.0.10
Address: 10.100.0.10#53

Non-authoritative answer:
ueuropea-excalibur-headless-service.default.svc.cluster.local service = 0 33 4000 192-168-4-107.ueuropea-excalibur-headless-service.default.svc.cluster.local.
ueuropea-excalibur-headless-service.default.svc.cluster.local service = 0 33 4000 192-168-53-92.ueuropea-excalibur-headless-service.default.svc.cluster.local.
ueuropea-excalibur-headless-service.default.svc.cluster.local service = 0 33 4000 192-168-65-161.ueuropea-excalibur-headless-service.default.svc.cluster.local.

Authoritative answers can be found from:
192-168-65-161.ueuropea-excalibur-headless-service.default.svc.cluster.local internet address = 192.168.65.161
192-168-53-92.ueuropea-excalibur-headless-service.default.svc.cluster.local internet address = 192.168.53.92
192-168-4-107.ueuropea-excalibur-headless-service.default.svc.cluster.local internet address = 192.168.4.107`

It seems it's missing a . at the end of the returned DNS name, right?
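For comparing the SRV targets with the host in the error message, the targets can be pulled out of the nslookup answer. This is a minimal sketch, assuming the `service = <priority> <weight> <port> <target>` line layout shown above; `srv_targets` is a hypothetical helper name, not part of libcluster or nslookup. The trailing dot in the nslookup answer is the DNS root label, so it is expected there and absent from the node name in the log.

```shell
#!/bin/sh
# srv_targets: read nslookup SRV output on stdin and print one target
# hostname per line, with the trailing root dot stripped.
srv_targets() {
  awk '/service = / { t = $NF; sub(/\.$/, "", t); print t }'
}

# Sample input copied from the answer above (first two records only).
printf '%s\n' \
  'ueuropea-excalibur-headless-service.default.svc.cluster.local service = 0 33 4000 192-168-4-107.ueuropea-excalibur-headless-service.default.svc.cluster.local.' \
  'ueuropea-excalibur-headless-service.default.svc.cluster.local service = 0 33 4000 192-168-53-92.ueuropea-excalibur-headless-service.default.svc.cluster.local.' \
  | srv_targets
```

The printed names match the host part of the node in the warning, which suggests the lookup itself is returning what the strategy expects.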

Is there something wrong with the tutorial? Could something be missing?

Entrypoint:

`mix release
#elixir -S mix phx.server --name excalibur@${MY_POD_IP} --cookie "secret"

_build/prod-kubernetes/rel/excalibur/bin/excalibur start`

Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    io.kompose.service: excalibur
  name: ueuropea-excalibur
spec:
  progressDeadlineSeconds: 90
  replicas: 3
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        io.kompose.service: excalibur
    spec:
      containers:
      - image: 975847796244.dkr.ecr.us-west-2.amazonaws.com/excalibur:dnsrv
        name: excalibur
        imagePullPolicy: Always
        ports:
        - containerPort: 4000
        - containerPort: 4369
        env:
        - name: DATABASE_URL
          value: *****
        - name: MIX_ENV
          value: prod-kubernetes
        - name: SECRET_KEY_BASE
          value: *****
        - name: ERLANG_COOKIE
          value: ******
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: ERLANG_CLUSTER_SERVICE_NAME
          value: ueuropea-excalibur-headless-service
        resources: {}
      restartPolicy: Always
status: {}

Headless service:

apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: excalibur
  name: ueuropea-excalibur-headless-service
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: "http"
    port: 80
    targetPort: 4000
  publishNotReadyAddresses: true
  selector:
    io.kompose.service: excalibur

Let me know if any more information is needed

paltaa avatar Mar 03 '20 14:03 paltaa

Hi @paltaa - I'm the author of the Kubernetes DNSSRV Strategy - and I've just discovered the same problem in some training material I'm supposed to be teaching tomorrow 😂!!!!

This is highly annoying - looks like an arbitrary change in statefulset DNS names - I'll investigate - if only for my own selfish reasons and report back.

Thanks for the bug report

bryanhuntesl avatar Mar 03 '20 22:03 bryanhuntesl

@bitwalker - assign this to me if you like

bryanhuntesl avatar Mar 03 '20 22:03 bryanhuntesl

Reason for breakage:

Google Cloud have stopped using CoreDNS as DNS resolver as of GKE 1.1.3 - they are instead using kube-dns which doesn't provide service discovery via SRV record resolution. (obviously they want everyone using k8s API for everything).

Lame.

Release notes (1.1.3) - they switched : https://cloud.google.com/kubernetes-engine/docs/release-notes#new_features_17

Someone trying to debug the issue : https://github.com/kubernetes/kubernetes/issues/85759

Stackoverflow thread : https://stackoverflow.com/questions/55122234/installing-coredns-on-gke

The kubernetes/docker ecosystem is just like that - stuff arbitrarily breaks all the time - recommend you try using Paul's strategy/kubernetes instead .

I'm away but will set a reminder to create a docs PR - maybe changing the name to strategy/k8s-coredns-srv or something that makes it clear you need coredns.

Man - that Google - always causing problems ! Sorry !

bryanhuntesl avatar Mar 03 '20 22:03 bryanhuntesl

Thanks a lot for the reply!! Been trying to make this work for a couple of days, followed about 4 different tutorials hahaha, glad to know I helped by posting this issue! let me know if you manage to fix this, good luck tomorrow

paltaa avatar Mar 03 '20 23:03 paltaa

@bryanhuntesl Also, this is currently happening on AWS EKS, which still uses CoreDNS. After some tweaks, the illegal hostname error and the "System NOT running to use fully qualified hostnames" error went away too, but it still won't connect.

paltaa avatar Mar 04 '20 13:03 paltaa

I have the same problem.

My environment is AKS (Azure Kubernetes) v1.15.7. I had the same problem using the Kubernetes.DNS strategy.

mrchypark avatar Mar 20 '20 05:03 mrchypark

@mrchypark Hey, what worked for me is Elixir.Cluster.Strategy.Kubernetes

My example:

topologies = [
  k8s_excalibur: [
    strategy: Elixir.Cluster.Strategy.Kubernetes,
    config: [
      service: System.get_env("ERLANG_CLUSTER_SERVICE_NAME"),
      application_name: "excalibur",
      kubernetes_node_basename: "excalibur",
      kubernetes_namespace: "default",
      kubernetes_selector: "io.kompose.service=excalibur"
    ]
  ]
]

paltaa avatar Mar 20 '20 12:03 paltaa

@paltaa Thank you for your reply! I have a question.

What does service mean? Is it the Service resource on Kubernetes?

mrchypark avatar Mar 20 '20 13:03 mrchypark

Yes, the service for kubernetes deployments

paltaa avatar Mar 20 '20 13:03 paltaa

@paltaa Thank you! I'll try this :)

mrchypark avatar Mar 20 '20 13:03 mrchypark

@paltaa I have another question T.T

What is your node name now?

Mine is like hydra@hydraapp-799457d75f-948qc.

I get no error, but the node list is empty too.

mrchypark avatar Mar 20 '20 13:03 mrchypark

You need to use Distillery: set up a pre hook that runs before the Erlang VM is up and sets the env var ERLANG_NAME to the pod's local IP.

paltaa avatar Mar 20 '20 13:03 paltaa

Does local IP mean the pod IP?

mrchypark avatar Mar 20 '20 13:03 mrchypark

For example.

Deployment part:

        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP

Then the pre hook:

#!/bin/sh

echo 'Setting ERLANG_NAME...'
export ERLANG_NAME=$MY_POD_IP
echo $ERLANG_NAME
export ERLANG_COOKIE=****
echo $ERLANG_COOKIE

vm.args:

-name excalibur@${ERLANG_NAME}

-setcookie ${ERLANG_COOKIE}

paltaa avatar Mar 20 '20 13:03 paltaa

Maybe I skipped setting the cookie.

I'll try this.

mrchypark avatar Mar 20 '20 13:03 mrchypark

And yes, by local IP I meant the pod IP.

paltaa avatar Mar 20 '20 13:03 paltaa

I recommend using KubernetesDNS but it requires a headless service.

seivan avatar Apr 24 '20 22:04 seivan

@bryanhuntesl I've assigned this to you as requested, but my question to everyone participating in this thread is whether or not the KubernetesDNS strategy is suitable as a replacement, allowing us to deprecate the DNSSRV strategy if there are issues that make it difficult to use.

I'm fine with not deprecating it, but someone from the community has to speak up and take the lead on updating the strategy as appropriate so that it works out of the box. I'm also fine with deprecating it here in libcluster, but handing off the implementation to someone to maintain on their own as a third-party plugin, just let me know. Suffice to say, I won't have time to maintain it myself in the immediate future, and I don't like to keep things around that are broken either, so I'll have to make the call soon.

bitwalker avatar Apr 25 '20 16:04 bitwalker

Just to clarify, is this only an issue where CoreDNS isn't available? Afaik CoreDNS is now on EKS since .12, not sure about GKE.

seivan avatar Apr 25 '20 17:04 seivan

Just to clarify, is this only an issue where CoreDNS isn't available? Afaik CoreDNS is now on EKS since .12, not sure about GKE.

This is an issue because Google Kubernetes Engine removed CoreDNS - the

https://github.com/bitwalker/libcluster/issues/121#issuecomment-594207646

bryanhuntesl avatar Apr 27 '20 08:04 bryanhuntesl

@bryanhuntesl Right, so when I implemented the first k8 service discovery strategy (KubernetesDNS) I assumed it was just a "standardized" implementation based on some K8 spec from the first paragraph https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services

Which makes it extra scary when it's documented under K8 but is implementation specific!

However I wonder if this is related https://github.com/kubernetes/dns/issues/339#issuecomment-594798682

Which means it's an issue when using hostnames to pods, and not the actual address. So that's probably why endpoint_pod_names doesn't exist on kube-dns but just guessing here.

I don't have access to GKE/kube-dns, could you test to see if KubernetesDNS works since it just returns addresses.

That's an alternative (and one I use) unless IPs aren't good enough or you're using shared hostnames on the pods, which isn't necessary if your intention is to just setup an Erlang cluster.

@bitwalker I don't suggest removing it, but renaming it to something like CoreDNSSRV or whatever @bryanhuntesl suggested, but definitely not merging them now.

But in general, if KubernetesDNS works on kube-dns, I really suggest that it should be the default or recommended approach if the intention is to make an Erlang cluster. There is no need to complicate it with SRV lookups and shared hostnames if the intention is just to join nodes together into a cluster.

seivan avatar Apr 27 '20 09:04 seivan

I don't have access to GKE/kube-dns, could you test to see if KubernetesDNS works since it just returns addresses.

@seivan sorry I just don't have bandwidth right now, I'm assigned to client work.

@bitwalker I don't suggest removing it, but renaming it to something like CoreDNSSRV or whatever @bryanhuntesl suggested, but definitely not merging them now.

@bitwalker if renaming to Cluster.Strategy.CoreDNSSRV is acceptable - I can create a PR in the evening and update the documentation.

bryanhuntesl avatar Apr 27 '20 09:04 bryanhuntesl

@bryanhuntesl No worries, thanks for all the input. I recall testing locally with Minikube a year or so ago, and that in turn runs kube-dns. Unless anything has changed, I think it's actually fine.

Besides, if it didn't work with kube-dns, then headless services would really serve no purpose at all, not to mention it would be a pretty blatant flaw in the k8s docs.

seivan avatar Apr 27 '20 11:04 seivan

@bryanhuntesl Sorry for the delay, haven't had a chance to get back to this in a while. I'm good with renaming the strategy, and documenting the caveats. I'll have to bump the major version for the release, but that's fine, we're probably due for that.

bitwalker avatar May 18 '20 14:05 bitwalker

I am getting this same error using this strategy. I have tried both :ip and :dns modes.

** System NOT running to use fully qualified hostnames **
** Hostname 10.32.9.22 is illegal **

config :libcluster,
  topologies: [
    k8s: [
      strategy: Elixir.Cluster.Strategy.Kubernetes,
      config: [
        mode: :ip,
        kubernetes_node_basename: "server",
        kubernetes_selector: "app.kubernetes.io/instance=server",
        kubernetes_namespace: "backend"
      ]
    ]
  ]

michaelst avatar Aug 10 '20 23:08 michaelst

I'm getting similar errors as @michaelst

09:08:47.379 [warn] [libcluster:k8s] unable to connect to :"community_service@10.0.2.165"
09:08:47.379 [error] ** System NOT running to use fully qualified hostnames **
** Hostname 10.0.2.165 is illegal **

Elixir version: 1.11.1

config :libcluster,
  topologies: [
    k8s: [
      strategy: Cluster.Strategy.Kubernetes,
      config: [
        kubernetes_node_basename: "community_service",
        kubernetes_selector: "app=community-service,role=api",
        kubernetes_namespace: System.get_env("NAMESPACE", "community-service"),
        polling_interval: 15_000
      ]
    ]
  ]

~All the tutorials I find use Distillery. Is Distillery some kind of implicit hard requirement for using libcluster? I'm currently just using the built-in mix release, and following the documentation doesn't get me a working example regardless of which Kubernetes strategy I use. They all emit the same errors.~

~What am I missing here?~

Issue resolved

Found out that I can get the templates of env.sh.eex and others with mix release.init and after tweaking the files, it seems to be working. Not sure how to verify though.
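For releases built with `mix release`, the tweak usually goes into the `rel/env.sh.eex` file generated by `mix release.init`. Below is a minimal sketch, assuming the pod IP is exposed to the container as `POD_IP` (a hypothetical variable name, injected through `status.podIP` as in the earlier deployment examples) and that the node basename matches `kubernetes_node_basename`. `RELEASE_DISTRIBUTION` and `RELEASE_NODE` are the variables the release boot script reads; by default the distribution mode is short names.

```shell
#!/bin/sh
# Sketch of the lines one might add to rel/env.sh.eex.
# POD_IP is an assumed variable name; the fallback value only exists so
# this snippet also runs outside a pod.
POD_IP="${POD_IP:-127.0.0.1}"

# Start the node with long names ("name") instead of short names ("sname").
export RELEASE_DISTRIBUTION=name
# basename@ip; "community_service" is the basename from the config above.
export RELEASE_NODE="community_service@${POD_IP}"

echo "node will start as ${RELEASE_NODE} (${RELEASE_DISTRIBUTION})"
```

One way to verify is to evaluate `Node.self()` and `Node.list()` through the release's `rpc` command on a running pod; the other pods should show up in the list once the cluster has formed.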

darwin67 avatar Dec 04 '20 09:12 darwin67

I am facing a similar problem. I'm using Kind to test my service and I get the error below

2021-04-02 01:52:34.708 [massa_proxy@massa-proxy-c9885df8-ffbhz]:[pid=<0.2378.0> ]:[error]:** System NOT running to use fully qualified hostnames **
** Hostname 10.244.0.27 is illegal **

kind version 0.7.0

Headless Service:

apiVersion: v1
kind: Service
metadata:
  name: proxy-headless-svc
  namespace: default
spec:
  selector:
    app: massa-proxy
  clusterIP: None

Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: massa-proxy
  name: massa-proxy
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: massa-proxy
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io/port: "9001"
        prometheus.io/scrape: "true"
      labels:
        app: massa-proxy
    spec:
      containers:
      - name: massa-proxy
        image: docker.io/eigr/massa-proxy:0.1.0
        ports:
        - containerPort: 9001
        imagePullPolicy: Always
        env:
        - name: PROXY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /health
            port: 9001
            scheme: HTTP
          initialDelaySeconds: 300
          periodSeconds: 3600
          successThreshold: 1
          timeoutSeconds: 1200
        resources:
          limits:
            memory: 1024Mi
          requests:
            memory: 70Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        envFrom:
        - configMapRef:
            name: proxy-cm
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

My topology:

[proxy: [strategy: Cluster.Strategy.Kubernetes.DNS, config: [service: "proxy-headless-svc", application_name: "massa-proxy", polling_interval: 3000]]]

Dockerfile:

FROM elixir:1.10-alpine as builder

ENV MIX_ENV=prod

RUN mkdir -p /app/massa_proxy
WORKDIR /app/massa_proxy

RUN apk add --no-cache --update git build-base zstd

COPY . /app/massa_proxy

RUN rm -rf /app/massa_proxy/apps/massa_proxy/mix.exs \
    && mv /app/massa_proxy/apps/massa_proxy/mix-bakeware.exs \
          /app/massa_proxy/apps/massa_proxy/mix.exs

RUN mix local.rebar --force \
    && mix local.hex --force \
    && mix deps.get 

RUN echo "-name massa_proxy@${PROXY_POD_IP}" >> /app/massa_proxy/apps/massa_proxy/rel/vm.args.eex \
      && echo "-setcookie ${NODE_COOKIE}" >> /app/massa_proxy/apps/massa_proxy/rel/vm.args.eex

RUN rm -fr /app/massa_proxy/_build \
    && cd /app/massa_proxy/apps/massa_proxy \
    && mix deps.get \
    && mix release.init \
    && mix release

# ---- Application Stage ----
FROM alpine:3
RUN apk add --no-cache --update bash openssl

WORKDIR /home/app
COPY --from=builder /app/massa_proxy/_build/prod/rel/bakeware/ .
COPY apps/massa_proxy/priv /home/app/

RUN adduser app --disabled-password --home app

RUN mkdir -p /home/app/cache
RUN chown -R app: .

USER app

ENV MIX_ENV=prod
ENV REPLACE_OS_VARS=true
ENV BAKEWARE_CACHE=/home/app/cache
ENV PROXY_TEMPLATES_PATH=/home/app/templates

ENTRYPOINT ["./massa_proxy"]

I am already losing hope that I can resolve this. Does anyone know what it could be?

sleipnir avatar Apr 02 '21 02:04 sleipnir

@adriano

RUN echo "-name massa_proxy@${PROXY_POD_IP}" >> /app/massa_proxy/apps/massa_proxy/rel/vm.args.eex \

have you tried ?

RUN echo "-sname massa_proxy@${PROXY_POD_IP}" >> /app/massa_proxy/apps/massa_proxy/rel/vm.args.eex \

An IP address is not a 'name' as far as Erlang is concerned, a 'name' is a FQDN (fully qualified domain name) such as node-0...cluster.local.
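The error text in this thread comes from a host check that depends on how the local node was started. Here is a rough shell rendering of that check, under the assumption that short-name mode rejects any host containing a dot (FQDN or IP alike) and long-name mode rejects a host without one. `check_host` is a hypothetical helper written for illustration, not an Erlang or libcluster API.

```shell
#!/bin/sh
# check_host MODE HOST
# MODE is "shortnames" or "longnames" (how the local node was started).
# Prints "ok" or "illegal" for the host part of a remote node name.
check_host() {
  mode=$1
  host=$2
  case "$host" in
    *.*) has_dot=yes ;;
    *)   has_dot=no ;;
  esac
  if [ "$mode" = "shortnames" ] && [ "$has_dot" = "yes" ]; then
    echo "illegal"   # the "System NOT running to use fully qualified hostnames" case
  elif [ "$mode" = "longnames" ] && [ "$has_dot" = "no" ]; then
    echo "illegal"   # the opposite case: long names expect a dotted host
  else
    echo "ok"
  fi
}

check_host shortnames 10.244.0.27
check_host longnames 10.244.0.27
check_host shortnames massa-proxy-c9885df8-ffbhz
```

Under that assumption, a node that reports itself as `massa_proxy@massa-proxy-c9885df8-ffbhz` is running with short names, so a peer address such as `10.244.0.27` is rejected before any connection is attempted.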


bryanhuntesl avatar Apr 02 '21 10:04 bryanhuntesl

Hi @bryanhuntesl, thanks for the quick response. Yes, I tried -sname too, to no avail. In my logs I print Node.self() and see that the name is different from the one configured. Look:

2021-04-02 12:50:53.342 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1759.0> ]:[info]: Starting HTTP Server on port 9001
2021-04-02 12:50:53.342 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1759.0> ]:[info]: Cluster Strategy kubernetes-dns
2021-04-02 12:50:53.342 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1759.0> ]:[debug]:Cluster topology [proxy: [strategy: Cluster.Strategy.Kubernetes.DNS, config: [service: "proxy-headless-svc", application_name: "massa-proxy", polling_interval: 3000]]]
2021-04-02 12:50:53.364 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1867.0> ]:[error]:** System NOT running to use fully qualified hostnames **
** Hostname 10.244.0.36 is illegal **

2021-04-02 12:50:53.364 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1864.0> ]:[warn]: [libcluster:proxy] unable to connect to :"massa-proxy@10.244.0.36"
2021-04-02 12:50:53.364 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1872.0> ]:[error]:** System NOT running to use fully qualified hostnames **
** Hostname 10.244.0.36 is illegal **

2021-04-02 12:50:53.364 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1864.0> ]:[warn]: [libcluster:proxy] unable to connect to :"massa-proxy@10.244.0.36"
2021-04-02 12:50:53.364 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1873.0> ]:[info]: Starting Horde.RegistryImpl with name MassaProxy.GlobalRegistry
2021-04-02 12:50:53.365 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1876.0> ]:[info]: Starting Horde.DynamicSupervisorImpl with name MassaProxy.GlobalSupervisor
2021-04-02 12:50:53.365 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1880.0> ]:[info]: Starting Proxy Cluster...
2021-04-02 12:50:53.365 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1880.0> ]:[info]: [massa proxy on :"massa_proxy@massa-proxy-7fbc5b4889-6p5hk"]: Connecting Horde to :"massa_proxy@massa-proxy-7fbc5b4889-6p5hk"
2021-04-02 12:50:53.365 [massa_proxy@massa-proxy-7fbc5b4889-6p5hk]:[pid=<0.1880.0> ]:[info]: [massa proxy on :"massa_proxy@massa-proxy-7fbc5b4889-6p5hk"]: Connecting Horde to :"massa_proxy@massa-proxy-7fbc5b4889

In this test I used: RUN echo "-sname massa-proxy@${PROXY_POD_IP}" >> /app/massa_proxy/apps/massa_proxy/rel/vm.args.eex

sleipnir avatar Apr 02 '21 13:04 sleipnir

It seems to me that the strategy is managing to resolve the addresses correctly, however, the names of the nodes are like massa-proxy@hostname instead of massa-proxy@ip and this seems to me to be the cause of the problem.

root @ sleipnir deployments wip/action-entity-protocol 
└─ # (k8s: kind-kind) 🚀 ▶ k get po
NAME                           READY   STATUS        RESTARTS   AGE
massa-proxy-7b495fbd94-zhrpd   1/1     Terminating   0          7m40s
massa-proxy-fb99dd779-86wlr    1/1     Running       0          34s
massa-proxy-fb99dd779-jbxqd    1/1     Running       0          34s
root @ sleipnir deployments wip/action-entity-protocol 
└─ # (k8s: kind-kind) 🚀 ▶ k exec -it massa-proxy-fb99dd779-86wlr sh
/home/app $ uname -a
Linux massa-proxy-fb99dd779-86wlr 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 Linux
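
One possible explanation, offered as a guess rather than a diagnosis: the `RUN echo "... ${PROXY_POD_IP}"` line in the Dockerfile is expanded by the shell while the image is being built, when `PROXY_POD_IP` is not set, so the pod IP never reaches `vm.args.eex`. The sketch below reproduces that expansion behaviour in plain shell. The temporary file stands in for `vm.args.eex`, and `write_name_arg` is a hypothetical helper.

```shell
#!/bin/sh
# write_name_arg FILE: mimic the Dockerfile's RUN echo "-name ...${PROXY_POD_IP}".
# Double quotes let the shell expand the variable at the moment echo runs.
write_name_arg() {
  echo "-name massa_proxy@${PROXY_POD_IP}" >> "$1"
}

tmp=$(mktemp)

# "Build time": the variable is not set yet, so the host part comes out empty.
unset PROXY_POD_IP
write_name_arg "$tmp"

# "Run time": the pod IP exists now, but the first line was already written.
PROXY_POD_IP=10.244.0.36
write_name_arg "$tmp"

cat "$tmp"
rm -f "$tmp"
```

If that is what happens here, the node name has to be set when the container starts (for example from a script that runs at boot) rather than baked in during `docker build`.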

sleipnir avatar Apr 02 '21 13:04 sleipnir