libcluster
Node is not connecting with Kubernetes.DNSSRV
Hi,
I'm using Elixir.Cluster.Strategy.Kubernetes.DNSSRV for libcluster on AWS EKS, with Istio also enabled.
Here is my configuration:
strategy: Elixir.Cluster.Strategy.Kubernetes.DNSSRV,
config: [
  service: "settings-v3-service",
  namespace: "kandula-dev",
  application_name: "settings",
  polling_interval: 10_000
],
connect: {:net_kernel, :connect_node, []},
disconnect: {:erlang, :disconnect_node, []},
list_nodes: {:erlang, :nodes, [:connected]}
]
I'm using StatefulSets.
Running hostname -f gives: "settings-0.settings-v3-service.kandula-dev.svc.cluster.local"
My node name looks like this: "settings@settings-0.settings-v3-service.kandula-dev.svc.cluster.local"
If I use Node.connect manually, I am able to connect.
But libcluster does not connect, and the log shows this warning:
`2020-11-26T10:00:02.191644371Z 10:00:02.191 [warn] [libcluster:kandula_settings] unable to connect to :"[email protected]"`
If I run dig SRV settings-v3-service.kandula-dev.svc.cluster.local, it does not return the pods of my StatefulSet.
Output of the dig SRV command:
root@settings-0:/app# dig SRV settings-v3-service.kandula-dev.svc.cluster.local

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> SRV settings-v3-service.kandula-dev.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50050
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 28b65516a3c20ebb (echoed)
;; QUESTION SECTION:
;settings-v3-service.kandula-dev.svc.cluster.local. IN SRV

;; ANSWER SECTION:
settings-v3-service.kandula-dev.svc.cluster.local. 5 IN SRV 0 100 80 settings-v3-service.kandula-dev.svc.cluster.local.

;; ADDITIONAL SECTION:
settings-v3-service.kandula-dev.svc.cluster.local. 5 IN A 10.100.3.101

;; Query time: 2 msec
;; SERVER: 10.100.0.10#53(10.100.0.10)
;; WHEN: Thu Nov 26 09:59:11 UTC 2020
;; MSG SIZE rcvd: 273

root@settings-0:/app#
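To relate this dig result to the warning above, here is a minimal Elixir sketch of how node names could be derived from an SRV answer. It assumes the strategy joins application_name and each SRV target with "@"; the module SrvSketch and its function are hypothetical and are not libcluster's API. The sample tuple mirrors the single SRV record shown above, in the shape returned by :inet_res.getbyname/2.

```elixir
defmodule SrvSketch do
  # Hypothetical helper for illustration, not part of libcluster.
  # Turns SRV records {priority, weight, port, target} into node name atoms,
  # assuming names are built as "<application_name>@<target>".
  def node_names({:ok, {:hostent, _name, _aliases, :srv, _len, records}}, app_name) do
    for {_priority, _weight, _port, target} <- records do
      host = target |> to_string() |> String.trim_trailing(".")
      String.to_atom("#{app_name}@#{host}")
    end
  end

  # A failed lookup yields no candidate nodes.
  def node_names({:error, _reason}, _app_name), do: []
end

# Sample answer mirroring the dig output above: one record whose
# target is the service itself, not an individual pod.
answer =
  {:ok,
   {:hostent, ~c"settings-v3-service.kandula-dev.svc.cluster.local", [], :srv, 1,
    [{0, 100, 80, ~c"settings-v3-service.kandula-dev.svc.cluster.local"}]}}

nodes = SrvSketch.node_names(answer, "settings")
IO.inspect(nodes)

# On a live pod, the real lookup would be:
# :inet_res.getbyname(~c"settings-v3-service.kandula-dev.svc.cluster.local", :srv)
```

Under that assumption, the answer above yields a single name built from the service host rather than from settings-0.settings-v3-service..., so it would not match any pod's node name. If the service were headless (clusterIP: None) with a named port, Kubernetes DNS would normally answer with one SRV record per ready pod instead.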
Thanks in advance.
@mohandass-pat did you ever get anywhere with this?
I am having the same issue: no matter which strategy I use alongside Istio on my cluster, my pods cannot seem to connect to one another.
I was facing a similar issue until I set the following environment variables for a mix release to modify the Erlang node options:
RELEASE_DISTRIBUTION=name
RELEASE_NODE=my-app
config :libcluster,
  topologies: [
    k8s_example: [
      strategy: Elixir.Cluster.Strategy.Kubernetes.DNSSRV,
      config: [
        ...
        application_name: "my-app",
        polling_interval: 10_000
      ]
    ]
  ]
I believe :application_name has to match the name portion of the Erlang node name (the part before the @ in the -name or RELEASE_NODE value).
I believe Elixir.Cluster.Strategy.Kubernetes.DNSSRV requires that we use -name or RELEASE_DISTRIBUTION=name so that the node's host portion is fully qualified and looks something like this:
iex1> Node.self()
:"[email protected]"
If it is set to -sname or RELEASE_DISTRIBUTION=sname, then the host portion of the node name is not fully qualified (:"my-app@my-app-0") and it will not work.
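The two conditions described above (matching name portion, fully qualified host portion) can be checked from a remote IEx session. The following is a minimal sketch; NodeNameCheck and its functions are hypothetical helpers, and the host names in the example are placeholders rather than values taken from this thread.

```elixir
defmodule NodeNameCheck do
  # Hypothetical helper for illustration, not part of libcluster.
  # Splits a node name such as :"my-app@host" into {name, host}.
  def split(node) when is_atom(node) do
    case node |> Atom.to_string() |> String.split("@", parts: 2) do
      [name, host] -> {name, host}
      [name] -> {name, ""}
    end
  end

  # True when the host portion contains a dot, i.e. looks like a long name.
  def fully_qualified?(node) do
    {_name, host} = split(node)
    String.contains?(host, ".")
  end

  # True when the part before "@" equals the configured application_name.
  def matches_app_name?(node, application_name) do
    {name, _host} = split(node)
    name == application_name
  end
end

# Placeholder node names for illustration only.
long = :"my-app@my-app-0.my-service.my-namespace.svc.cluster.local"
short = :"my-app@my-app-0"

IO.inspect(NodeNameCheck.fully_qualified?(long))
IO.inspect(NodeNameCheck.fully_qualified?(short))
IO.inspect(NodeNameCheck.matches_app_name?(long, "my-app"))
```

On a running pod, passing Node.self() instead of the placeholders shows whether the release was started with long names and whether the name portion agrees with :application_name.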