helm-charts

Cluster fails to form on AWS EKS IPv6 deployment

Open BenB196 opened this issue 3 years ago • 5 comments

Hi,

I'm trying to deploy this helm chart to an AWS EKS IPv6 deployment, but am running into a weird issue.

I run the following command:

KUBECONFIG=/home/DOMAIN/user/.kube/kubeconfig helm upgrade --install redpanda redpanda \
  -n redpanda-ns --create-namespace \
  --set resources.cpu.cores=3 \
  --set resources.enable_memory_locking=true \
  --set resources.memory.container.max="28Gi" \
  --set storage.hostPath="/mnt/data" \
  --set statefulset.podAntiAffinity.type="hard" \
  --set 'statefulset.nodeSelector.company\.com/usage=redpanda' \
  --set storage.persistentVolume.enabled=false \
  --set logging.logLevel="trace"

All 3 Redpanda pods come up, but the cluster never forms. With trace logging enabled, I see the following error:

TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Poll sockets
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - ares_fds: 2
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - fd 1 r/w
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Send 1(2)
TRACE 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Send 1 unavailable.
TRACE 2022-09-27 14:54:49,542 [shard 0] exception - Throw exception at:
0x4cdca94 0x49c0c8d /opt/redpanda/lib/libc++abi.so.1+0x2cff7 0x4ac62ba 0x4ac63a4 0x4a8d96f 0x4a91647 0x4a8ea19 0x49aec11 0x49acd2f 0x1881bb4 0x4d8ea8c /opt/redpanda/lib/libc.so.6+0x27b74 0x187ea2d
--------
seastar::continuation<seastar::internal::promise_base_with_type<unsigned long>, seastar::pollable_fd_state::sendmsg(msghdr*)::$_39, seastar::future<unsigned long> seastar::future<void>::then_impl_nrvo<seastar::pollable_fd_state::sendmsg(msghdr*)::$_39, seastar::future<unsigned long> >(seastar::pollable_fd_state::sendmsg(msghdr*)::$_39&&)::'lambda'(seastar::internal::promise_base_with_type<unsigned long>&&, seastar::pollable_fd_state::sendmsg(msghdr*)::$_39&, seastar::future_state<seastar::internal::monostate>&&), void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4, seastar::future<void> seastar::future<unsigned long>::then_impl_nrvo<seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4, seastar::future<void> >(seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::net::posix_udp_channel::send(seastar::socket_address const&, seastar::net::packet)::$_4&, seastar::future_state<unsigned long>&&), unsigned long>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>), seastar::futurize<void>::type seastar::future<void>::then_wrapped_nrvo<void, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)>(seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&, seastar::future_state<seastar::internal::monostate>&&), void>
TRACE 2022-09-27 14:54:49,542 [shard 0] exception - Throw exception at:
0x4cdca94 0x49c0c8d /opt/redpanda/lib/libc++abi.so.1+0x2d392 /opt/redpanda/lib/libc++.so.1+0x504e8 0x49d25a1 0x4bee757 0x4bee608 0x4a8d96f 0x4a91647 0x4a8ea19 0x49aec11 0x49acd2f 0x1881bb4 0x4d8ea8c /opt/redpanda/lib/libc.so.6+0x27b74 0x187ea2d
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>), seastar::futurize<void>::type seastar::future<void>::then_wrapped_nrvo<void, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)>(seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::net::dns_resolver::impl::do_sendv(int, iovec const*, int)::'lambda'(seastar::future<void>)&, seastar::future_state<seastar::internal::monostate>&&), void>
DEBUG 2022-09-27 14:54:49,542 [shard 0] dns_resolver - Send 1 failed: std::__1::system_error (error system:22, sendmsg: Invalid argument)
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - Poll sockets
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - ares_fds: 2
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - fd 1 r/w
TRACE 2022-09-27 14:54:49,543 [shard 0] dns_resolver - Release socket 1 -> 0

Setup:

  • AWS EKS 1.23
  • EKS Nodes:
    • i3en.xlarge with local NVMe provisioned as XFS
  • Chart version: pulled from main branch

Any ideas on what could be causing this issue?

JIRA Link: K8S-14

BenB196, Sep 27 '22 15:09

This appears to be a couple of problems. One is in core and will be fixed with https://github.com/redpanda-data/redpanda/issues/5842; the other is a chart change to listen on :: instead of, or as well as, 0.0.0.0. Once the IPv6 issue is addressed in core, we'll test it in the helm chart and make any changes that are needed, including regression tests.
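To illustrate the listener half of this: a socket bound to 0.0.0.0 only accepts IPv4 connections, so on an IPv6-only pod network nothing can reach it via the pod's address. The following is a minimal sketch, not chart code, of choosing the wildcard address from the pod IP's family. The function names wildcard_bind_address and is_wildcard are hypothetical and exist only for this example; the sample IPs are made up.

```python
import ipaddress


def wildcard_bind_address(pod_ip: str) -> str:
    """Return the "listen on every interface" address for the pod IP's family.

    Hypothetical helper for illustration; it is not part of the Redpanda
    helm chart or of Redpanda itself.
    """
    ip = ipaddress.ip_address(pod_ip)
    # "::" is the IPv6 unspecified address, "0.0.0.0" the IPv4 one.
    return "::" if ip.version == 6 else "0.0.0.0"


def is_wildcard(address: str) -> bool:
    """True if the address is the unspecified (wildcard) address of its family."""
    return ipaddress.ip_address(address).is_unspecified


if __name__ == "__main__":
    # Example pod IPs (assumed values, not taken from the report above).
    for pod_ip in ("10.0.12.7", "2600:1f14:abcd::15"):
        print(pod_ip, "->", wildcard_bind_address(pod_ip))
```

Whether a single :: listener also serves IPv4 clients depends on the socket's IPV6_V6ONLY setting, which is why "as well as" may mean configuring both addresses explicitly.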

joejulian, Sep 30 '22 17:09

Thanks for reporting @BenB196

I'm running into the same issue on Fly.io, as their internal network is IPv6-only. Looking forward to powering the cluster at the edge when this issue is fixed @joejulian 😄

rupurt, Oct 15 '22 18:10

Closing due to age and inactivity. Feel free to re-open.

chrisseto, Aug 16 '24 15:08

So. No love for IPv6 then? Would it really be so hard to just provide a listener option which can be set to :: at will?

YannikSc, Aug 21 '24 08:08

This issue just hadn't seen activity in 2 years (and I had misread the date of #1062). Sounds like there are still people actively waiting for its resolution. Thanks for bringing it to my attention.

chrisseto, Aug 23 '24 18:08

Is there any progress or update on this? We also tried to deploy Redpanda to our EKS IPv6 clusters and ran into the same problems, with the readiness and liveness probes unable to connect.

alexdaima, Nov 17 '25 10:11