
Bug: Connection to cluster fails after upgrading from v0.10.0 to v0.11.0

Open chabane-doit opened this issue 5 months ago • 10 comments

Expected Behaviour

Hello,

eksup analyze should successfully connect to the EKS cluster and perform analysis, similar to how it worked in v0.10.0.

Current Behaviour

After upgrading from v0.10.0 to v0.11.0, eksup fails to connect to the cluster despite having a valid and updated kubeconfig file.

Code snippet

aws eks update-kubeconfig --name <my_cluster> --region <my_region>
eksup analyze --cluster <my_cluster> --region <my_region>

Possible Solution

This appears to be a regression introduced in v0.11.0. Possible causes could be:

  • Changes in kubeconfig file detection/parsing logic
  • Modified authentication mechanism
  • Different cluster connection parameters or validation
  • Breaking changes in dependencies handling Kubernetes authentication

Steps to Reproduce

  1. Have a working eksup v0.10.0 installation
  2. Upgrade to eksup v0.11.0
  3. Ensure kubeconfig is properly configured with aws eks update-kubeconfig --name <cluster_name> --region
  4. Run eksup analyze --cluster <cluster_name> --region
  5. Observe the connection error

eksup version

latest

Operating system

macOS arm64

Error output

Error: Unable to connect to cluster. Ensure kubeconfig file is present and updated to connect to the cluster.
      Try: aws eks update-kubeconfig --name

chabane-doit avatar Jul 22 '25 10:07 chabane-doit

We are not going to tolerate AI-generated bug reports. You can see the diff for yourself; we have not changed any of the "suggested" AI-generated "possible solutions": https://github.com/clowdhaus/eksup/compare/v0.10.0...v0.11.0

bryantbiggs avatar Jul 22 '25 12:07 bryantbiggs

Hello Bryant, Thank you for your response.

The main change I see is to the kube dependency:

  • v0.10.0: kube = { version = "0.98", default-features = false, features = [ "client", "derive", "rustls-tls" ] }
  • v0.11.0: kube = { version = "1.1", default-features = false, features = [ "client", "derive", "aws-lc-rs" ] }

I'm not proficient in Rust, but I will try to rebuild locally with the old version and let you know.

Client Version: v1.33.3
Server Version: v1.32.5-eks

chabane-doit avatar Jul 22 '25 13:07 chabane-doit

I wanted to debug the Rust code; here is the relevant snippet:

let k8s_client = match kube::Client::try_default().await {
  Ok(client) => client,
  Err(e) => {
    bail!(
      "Unable to connect to cluster. Ensure kubeconfig file is present and updated to connect to the cluster.
      Try: aws eks update-kubeconfig --name {cluster_name}

      Error: {e}"
    );
  }
};

The error returned is:

TLS required but no TLS stack selected

chabane-doit avatar Jul 22 '25 14:07 chabane-doit

Thanks for looking into this. It looks like we can't just switch to aws-lc-rs alone; we still need to provide rustls-tls.
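For context, in kube 1.x the aws-lc-rs feature only selects the rustls crypto provider; the rustls-tls feature is what enables the TLS stack itself, which is why dropping it produces "TLS required but no TLS stack selected". A plausible dependency line keeping both features (a sketch, not necessarily the exact change shipped in v0.11.1):

```toml
# Sketch: enable the rustls TLS stack and select aws-lc-rs as its crypto provider
kube = { version = "1.1", default-features = false, features = [ "client", "derive", "rustls-tls", "aws-lc-rs" ] }
```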

bryantbiggs avatar Jul 22 '25 20:07 bryantbiggs

ok this should be resolved in https://github.com/clowdhaus/eksup/releases/tag/v0.11.1

bryantbiggs avatar Jul 23 '25 00:07 bryantbiggs

Thank you @bryantbiggs

I've encountered a new error, but it doesn't seem to be related to the previous one:

RUST_BACKTRACE=full eksup analyze --cluster <my_cluster> --region <my_region> --format json

thread 'main' panicked at /Users/runner/work/eksup/eksup/eksup/src/eks/resources.rs:136:41:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0:        0x100435688 - __mh_execute_header
   1:        0x10029f9cc - __mh_execute_header
   2:        0x10043506c - __mh_execute_header
   3:        0x100435548 - __mh_execute_header
   4:        0x100434cd0 - __mh_execute_header
   5:        0x10045d624 - __mh_execute_header
   6:        0x10045d5bc - __mh_execute_header
   7:        0x10045fcd0 - __mh_execute_header
   8:        0x1005d48d8 - __mh_execute_header
   9:        0x1005d49c4 - __mh_execute_header
  10:        0x1005d4b84 - __mh_execute_header
  11:        0x1000e2920 - __mh_execute_header
  12:        0x1000d6564 - __mh_execute_header
  13:        0x1000f0160 - __mh_execute_header
  14:        0x10011151c - __mh_execute_header
  15:        0x100110258 - __mh_execute_header
  16:        0x1000c3c60 - __mh_execute_header
  17:        0x10011599c - __mh_execute_header

chabane-doit avatar Jul 23 '25 10:07 chabane-doit

alright, let me spin up a few test clusters

bryantbiggs avatar Jul 23 '25 12:07 bryantbiggs

I'm not able to reproduce on my end

[screenshot attached]

bryantbiggs avatar Jul 24 '25 17:07 bryantbiggs

Since your error points to here: https://github.com/clowdhaus/eksup/blob/596978d88369d18c02dabaec42ee5b277f48b640/eksup/src/eks/resources.rs#L136

do you not have any EKS addons (i.e., none show up in the AWS console for the cluster in question)?
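For readers unfamiliar with the panic message: `called Option::unwrap() on a None value` means a lookup returned `None` where the code assumed a value was present. A minimal, hypothetical sketch (not eksup's actual code at resources.rs:136; the function and field names are made up) of the defensive alternative to a bare `.unwrap()`:

```rust
// Hypothetical sketch: `.unwrap()` on a `None` panics with exactly the
// reported message, while `.ok_or_else` turns the missing value into a
// recoverable error the caller can handle or display.
fn addon_version(version: Option<&str>) -> Result<String, String> {
    let v = version.ok_or_else(|| "addon returned no version".to_string())?;
    Ok(v.to_string())
}

fn main() {
    // A missing value becomes an error instead of a process-aborting panic
    assert!(addon_version(None).is_err());
    assert_eq!(
        addon_version(Some("v1.19.0-eksbuild.1")).unwrap(),
        "v1.19.0-eksbuild.1"
    );
}
```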

bryantbiggs avatar Jul 24 '25 17:07 bryantbiggs

Actually, I have the following addons:

{
    "addons": [
        "adot",
        "aws-ebs-csi-driver",
        "aws-efs-csi-driver",
        "aws-mountpoint-s3-csi-driver",
        "cert-manager",
        "coredns",
        "eks-node-monitoring-agent",
        "eks-pod-identity-agent",
        "external-dns",
        "kube-proxy",
        "kube-state-metrics",
        "metrics-server",
        "prometheus-node-exporter",
        "vpc-cni"
    ]
}

I have a Docker image that installs eksup:

RUN apt-get update && apt-get install -y cargo

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs/ | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN rustup update
RUN rustup install stable

RUN git clone https://github.com/clowdhaus/eksup && \
    cd eksup && \
    git checkout v0.11.1 && \
    cargo build --release && \
    mv target/release/eksup /usr/local/bin/ && \
    cd .. && \
    rm -rf eksup

It doesn't work locally either (installed with brew on macOS).

chabane-doit avatar Jul 24 '25 17:07 chabane-doit