Cluster fails to form on AWS EKS IPv6 deployment
Hi,
I'm trying to deploy this helm chart to an AWS EKS IPv6 deployment, but am running into a weird issue.
I run the following command:
KUBECONFIG=/home/DOMAIN/user/.kube/kubeconfig helm upgrade --install redpanda redpanda -n redpanda-ns --create-namespace --set resources.cpu.cores=3 --set resources.enable_memory_locking=true --set resources.memory.container.max="28Gi" --set storage.hostPath="/mnt/data" --set statefulset.podAntiAffinity.type="hard" --set 'statefulset.nodeSelector.company\.com/usage=redpanda' --set storage.persistentVolume.enabled=false --set logging.logLevel="trace"
All 3 Redpanda pods come up, but the cluster never forms. With trace logging enabled, I see the following error:
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Poll sockets
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - ares_fds: 2
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - fd 1 r/w
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Send 1(2)
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Send 1 unavailable.
TRACE 2022-09-27 14:54:49,542 [shard 0] exception - Throw exception at:
0x4cdca94 0x49c0c8d /opt/redpanda/lib/libc++abi.so.1+0x2cff7 0x4ac62ba 0x4ac63a4 0x4a8d96f 0x4a91647 0x4a8ea19 0x49aec11 0x49acd2f 0x1881bb4 0x4d8ea8c /opt/redpanda/lib/libc.so.6+0x27b74 0x187ea2d
--------
seastar::continuation<seastar::internal::promise_base_with_type<unsigned long>, seastar::pollable_fd_state::sendmsg(msghdr*)::$_39, seastar::future<unsigned long> seastar::future<void>::then_impl_nrvo<seastar::pollable_fd_state::sendmsg(msghdr*)::$_39, seastar::future<unsigned long> >(seastar::pollable_fd_state::sendmsg(msghdr*)::$_39&&)::'lambda'(seastar::internal::promise_base_with_type<unsigned long>&&, seastar::pollable_fd_state::sendmsg(msghdr*)::$_39&, seastar::future_state<seastar::internal::monostate>&&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4, seastar::future<void> seastar::future<unsigned long>::then_impl_nrvo<seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4, seastar::future<void> >(seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4&, seastar::future_state<unsigned long>&&), unsigned long>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>), seastar::futurize<void>::type seastar::future<void>::then_wrapped_nrvo<void, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)>(seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&, seastar::future_state<seastar::internal::monostate>&&), void>
TRACE 2022-09-27 14:54:49,542 [shard 0] exception - Throw exception at:
0x4cdca94 0x49c0c8d /opt/redpanda/lib/libc++abi.so.1+0x2d392 /opt/redpanda/lib/libc++.so.1+0x504e8 0x49d25a1 0x4bee757 0x4bee608 0x4a8d96f 0x4a91647 0x4a8ea19 0x49aec11 0x49acd2f 0x1881bb4 0x4d8ea8c /opt/redpanda/lib/libc.so.6+0x27b74 0x187ea2d
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>), seastar::futurize<void>::type seastar::future<void>::then_wrapped_nrvo<void, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)>(seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&, seastar::future_state<seastar::internal::monostate>&&), void>
DEBUG 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Send 1 failed: std::__1::system_error (error system:22, sendmsg: Invalid argument)
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - Poll sockets
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - ares_fds: 2
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - fd 1 r/w
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - Release socket 1 -> 0
Setup:
- AWS EKS 1.23
- EKS Nodes:
  - i3en.xlarge with local NVMe provisioned as XFS
- Chart version: pulled from main branch
Any ideas on what could be causing this issue?
JIRA Link: K8S-14
This appears to be two separate problems. One is in core and will be fixed with https://github.com/redpanda-data/redpanda/issues/5842; the other is a chart change to listen on :: instead of, or as well as, 0.0.0.0. Once the IPv6 issue is addressed in core, we'll test it in the Helm chart and make any changes that are needed, including regression tests.
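For anyone following along, here is a rough Python sketch of why the bind address matters. It is only an illustration of general socket behavior, not Redpanda or chart code, and the function names `reachable_from_ipv6_only` and `open_dual_stack_listener` are made up for this example. A listener bound to 0.0.0.0 is IPv4-only, while one bound to :: with IPV6_V6ONLY disabled can accept both families on platforms that support dual-stack sockets.

```python
import ipaddress
import socket


def reachable_from_ipv6_only(bind_address: str) -> bool:
    """Return True if a listener bound to bind_address is an IPv6 listener.

    Hypothetical helper: it only inspects the address family of the bind
    address and does not open any sockets.
    """
    return ipaddress.ip_address(bind_address).version == 6


def open_dual_stack_listener(port: int = 0) -> socket.socket:
    """Bind a TCP listener to :: that also accepts IPv4 where the OS allows it.

    Assumes the host has IPv6 enabled and supports dual-stack sockets;
    port 0 lets the kernel pick a free port.
    """
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    # IPV6_V6ONLY=0 asks the kernel to also accept IPv4-mapped connections.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", port))
    sock.listen()
    return sock
```

On an IPv6-only pod network, anything bound to 0.0.0.0 is unreachable from other pods, which is why the chart side of the fix comes down to the listen address.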
Thanks for reporting @BenB196
I'm running into the same issue on Fly.io, as their internal network is IPv6-only. Looking forward to powering the cluster at the edge when this issue is fixed @joejulian 😄
Closing due to age and inactivity. Feel free to re-open.
So. No love for IPv6 then? Would it really be so hard to just provide a listener option which can be set to :: at will?
This issue just hadn't seen activity in 2 years (and I had misread the date of #1062). Sounds like there are still people actively waiting for its resolution. Thanks for bringing it to my attention.
Is there any progress or an update on this? We also tried to deploy Redpanda to our EKS IPv6 clusters and ran into the same problems, with the readiness and liveness probes unable to connect.