Validating a Kubernetes cluster with kops on AWS
Hi, I am facing the following error while validating the cluster:
Error: validation failed: unexpected error during validation: error listing nodes: Get "https://api-test-k8s-local-7afed8-88de7399e241b2e1.elb.us-east-2.amazonaws.com/api/v1/nodes": dial tcp 18.217.140.139:443: i/o timeout
Please try https://kops.sigs.k8s.io/operations/troubleshoot/.
@bilalmushtaq514 it seems like you're failing to reach the Kubernetes API after the cluster was set up. The connection timeout makes me think that either a security group rule is missing or there is a config issue.
One thing that comes to mind is that you might have set up your cluster with an internal load balancer, or that you've locked down API access to certain IPs.
However, it's hard to tell without the full context of your cluster.
Would you mind sharing your cluster spec here? You can get it by running:
kops get cluster --state <S3_BUCKET_NAME> --name <CLUSTER_NAME> -o yaml
This could shed a bit more light on the root cause.
Thanks!
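A few read-only checks can help narrow down whether this is a DNS, routing, or security-group problem before digging into the spec. The sketch below is illustrative rather than an official kops procedure: it assumes the AWS CLI, dig, and curl are available and configured for the account and region hosting the cluster, and it reuses the API hostname from the error above as a placeholder.

```sh
# Illustrative troubleshooting sketch (read-only). Replace the hostname and
# the <...> placeholders with your own values.
API_HOST="api-test-k8s-local-7afed8-88de7399e241b2e1.elb.us-east-2.amazonaws.com"

# 1. Does the API hostname resolve, and to which addresses?
dig +short "$API_HOST"

# 2. Can we open a TCP connection to port 443? Any HTTP response (even a
#    401/403) means the network path works; a timeout points at security
#    groups, NACLs, or an internal-only load balancer.
curl -kv --connect-timeout 10 "https://$API_HOST/healthz" || true

# 3. Ask kops itself to validate the cluster, waiting for it to settle.
kops validate cluster --state s3://<S3_BUCKET_NAME> --name <CLUSTER_NAME> --wait 10m

# 4. Assuming the AWS CLI is configured for the cluster's account and region:
#    is the API load balancer internet-facing or internal?
aws elbv2 describe-load-balancers --region us-east-2 \
  --query 'LoadBalancers[].{Name:LoadBalancerName,Scheme:Scheme,DNS:DNSName}'
```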
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2023-08-27T15:43:34Z"
spec:
  api:
    dns: {}
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://kops-state-21/kubevpro.grooply.online
  dnsZone: kubevpro.grooply.online
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: control-plane-us-east-1a
      name: a
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: control-plane-us-east-1a
      name: a
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeProxy:
    enabled: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.27.5
  masterPublicName: api.kubevpro.grooply.online
  networkCIDR: 172.20.0.0/16
  networking:
    cilium:
      enableNodePort: true
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - cidr: 172.20.32.0/19
    name: us-east-1a
    type: Public
    zone: us-east-1a
  - cidr: 172.20.64.0/19
    name: us-east-1b
    type: Public
    zone: us-east-1b
  topology:
    dns:
      type: Public
    masters: public
    nodes: public
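As an aside on the two hypotheses raised earlier (internal load balancer vs. restricted API access): in the kops v1alpha2 spec those would show up under spec.api and spec.kubernetesApiAccess. The snippet below is purely illustrative and is not taken from this cluster; the spec above uses api: dns: {} with kubernetesApiAccess open to 0.0.0.0/0, i.e. DNS-based access that is not IP-restricted.

```yaml
# Illustrative only; not this cluster's configuration.
spec:
  api:
    loadBalancer:
      class: Network
      type: Internal        # an internal load balancer is unreachable from outside the VPC
  kubernetesApiAccess:
  - 203.0.113.0/24          # example of locking the API down to a specific CIDR
```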
It seems like the YAML structure got a bit mangled by the markdown formatting, so it's harder to tell what goes where.
Would you mind wrapping it in a fenced code block with yaml syntax highlighting?
Thanks!
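For reference, a fenced code block just means surrounding the spec with triple backticks and adding a yaml hint after the opening fence, for example (placeholder content):

````markdown
```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
spec:
  ...
```
````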
/kind support
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.