No instances created when the instance group manager is set to Karpenter by default
I'm creating a cluster in AWS with the following config, but the cluster never gets any running nodes.
W0720 15:10:00.715510 31441 validate_cluster.go:232] (will retry): cluster not yet healthy
INSTANCE GROUPS
NAME                            ROLE            MACHINETYPE     MIN     MAX     SUBNETS
control-plane-ap-northeast-1a   ControlPlane    t3.medium       1       1       ap-northeast-1a
nodes-ap-northeast-1a           Node            t3.medium       2       2       ap-northeast-1a
nodes-ap-northeast-1d           Node            t3.medium       1       1       ap-northeast-1d

NODE STATUS
NAME    ROLE    READY

VALIDATION ERRORS
KIND    NAME            MESSAGE
dns     apiserver       Validation Failed
The dns-controller Kubernetes deployment has not updated the Kubernetes cluster's API DNS entry to the correct IP address. The API DNS IP address is the placeholder address that kops creates: 203.0.113.123. Please wait about 5-10 minutes for a control plane node to start, dns-controller to launch, and DNS to propagate. The protokube container and dns-controller deployment logs may contain more diagnostic information. Etcd and the API DNS entries must be updated for a kops Kubernetes cluster to start.
Validation Failed
W0720 15:10:10.719972 31441 validate_cluster.go:232] (will retry): cluster not yet healthy
Error: validation failed: wait time exceeded during validation
Is there a way, from the config alone, to have the first nodes created with the default manager and only then let Karpenter take over (a sketch of that layout follows the manifests below)? Or is the only option to edit the instance groups after the cluster has been created?
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: null
  name: k8s.kops.uat.aws.xxx.cloud
spec:
  api:
    dns: {}
  authorization:
    rbac: {}
  awsLoadBalancerController:
    enabled: true
  channel: stable
  certManager:
    enabled: true
  cloudProvider: aws
  configBase: s3://sf-kops-state-store/k8s.kops.uat.aws.xxx.cloud
  dnsZone: kops.uat.aws.xxx.cloud
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: control-plane-ap-northeast-1a
      name: a
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: control-plane-ap-northeast-1a
      name: a
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  externalDns:
    watchIngress: true
  iam:
    useServiceAccountExternalPermissions: true
    allowContainerRegistry: true
    legacy: false
  karpenter:
    enabled: true
  kubeProxy:
    enabled: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.27.3
  masterPublicName: api.k8s.kops.uat.aws.xxx.cloud
  metricsServer:
    enabled: true
  nodeProblemDetector:
    enabled: true
    memoryRequest: 32Mi
    cpuRequest: 10m
  networkCIDR: 172.20.0.0/16
  networking:
    amazonvpc: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  podIdentityWebhook:
    enabled: true
  snapshotController:
    enabled: true
  cloudConfig:
    awsEBSCSIDriver:
      enabled: true
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - cidr: 172.20.32.0/19
    name: ap-northeast-1a
    type: Public
    zone: ap-northeast-1a
  - cidr: 172.20.64.0/19
    name: ap-northeast-1d
    type: Public
    zone: ap-northeast-1d
  topology:
    dns:
      type: Public
    masters: public
    nodes: public
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: control-plane-ap-northeast-1a
spec:
  manager: Karpenter
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - ap-northeast-1a
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: nodes-ap-northeast-1a
spec:
  manager: Karpenter
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 2
  minSize: 2
  role: Node
  subnets:
  - ap-northeast-1a
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: nodes-ap-northeast-1d
spec:
  manager: Karpenter
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - ap-northeast-1d
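
On the first question, one way to start with conventionally managed instances and let Karpenter handle only worker capacity is to leave the control-plane group (and one small node group) without a manager field, so kops launches those instances itself, and set manager: Karpenter only on the remaining node group; Karpenter runs as a deployment inside the cluster, so something has to bring the first instances up. The following is only a sketch of that idea, reusing the names, AMI and sizes from the manifests above, not a verified configuration:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: control-plane-ap-northeast-1a
spec:
  # no manager field: kops manages this group as a normal ASG
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - ap-northeast-1a
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: nodes-ap-northeast-1d
spec:
  # small conventionally managed group so at least one worker exists at bootstrap
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - ap-northeast-1d
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: nodes-ap-northeast-1a
spec:
  # only this worker group is handed to Karpenter
  manager: Karpenter
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 2
  minSize: 2
  role: Node
  subnets:
  - ap-northeast-1a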
When I create the cluster with the CLI script below, only the control-plane instance group comes up; no instances are created for the node instance groups, and cluster validation fails.
#!/bin/bash
# .env provides ZONES, AMI, NODE_SIZE, CONTROL_PLANE_SIZE and DNS
# (and presumably KOPS_CLUSTER_NAME / KOPS_STATE_STORE for kops itself).
source .env

kops create cluster \
  --zones ${ZONES} \
  --master-count=1 \
  --node-count=3 \
  --control-plane-image ${AMI} \
  --node-image ${AMI} \
  --node-size ${NODE_SIZE} \
  --master-size ${CONTROL_PLANE_SIZE} \
  --instance-manager=karpenter \
  --discovery-store=s3://sf-k8s-oidc-store \
  --networking=amazonvpc \
  --dns-zone ${DNS} \
  --yes
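
The second option, editing instance groups after creation, would mean creating the cluster without --instance-manager=karpenter, enabling karpenter in the cluster spec, and then switching a node group over once the cluster validates (for example via kops edit cluster and kops edit ig, followed by kops update cluster --yes). A hypothetical sketch of the edited node group manifest; the group name, AMI and sizes here are just the ones used earlier in this issue and will differ from whatever kops create cluster generated:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.kops.uat.aws.xxx.cloud
  name: nodes-ap-northeast-1a
spec:
  manager: Karpenter   # the only line added by the edit
  image: ami-05ffd9ad4ddd0d6e2
  machineType: t3.medium
  maxSize: 2
  minSize: 2
  role: Node
  subnets:
  - ap-northeast-1a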
I'm seeing something similar with the attached config.
My cluster only comes up if I drop:
spec:
  externalDns:
    watchIngress: true
Cluster and IG config YAML:
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: prod.cluster-foo.com
spec:
  api:
    loadBalancer:
      class: Network
      type: Public
  authorization:
    rbac: {}
  awsLoadBalancerController:
    enabled: true
  certManager:
    defaultIssuer: lets-encrypt
    enabled: true
  channel: stable
  cloudProvider: aws
  configBase: s3://prod-cluster-foo-com-state-store/prod.cluster-foo.com
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: control-plane-us-east-2a
      name: a
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: control-plane-us-east-2a
      name: a
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  externalDns:
    watchIngress: true
  iam:
    allowContainerRegistry: true
    legacy: false
    useServiceAccountExternalPermissions: true
  karpenter:
    enabled: true
  kubeProxy:
    enabled: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.27.3
  masterPublicName: api.prod.cluster-foo.com
  networkCIDR: 172.20.0.0/16
  networking:
    cilium:
      enableNodePort: true
  nonMasqueradeCIDR: 100.64.0.0/10
  serviceAccountIssuerDiscovery:
    discoveryStore: s3://prod-cluster-foo-com-oidc-store/prod.cluster-foo.com/discovery/prod.cluster-foo.com
    enableAWSOIDCProvider: true
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - cidr: 172.20.32.0/19
    name: us-east-2a
    type: Public
    zone: us-east-2a
  topology:
    dns:
      type: Public
    masters: public
    nodes: public
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: prod.cluster-foo.com
  name: control-plane-us-east-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230608
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - us-east-2a
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: prod.cluster-foo.com
  name: nodes
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230608
  manager: Karpenter
  role: Node
  subnets:
  - us-east-2a
It turns out that if the externalDns key is present, then the provider must be specified as well!
spec:
  externalDns:
    provider: dns-controller
    watchIngress: true
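
Worth noting: the first config in this issue has the same bare externalDns block with only watchIngress set, so the same fix presumably applies there as well. A minimal sketch of the corrected block (dns-controller is kops' built-in controller; newer kops releases also accept external-dns here, but check that against the release in use):

spec:
  externalDns:
    provider: dns-controller   # kops' built-in controller
    # provider: external-dns   # assumed alternative on newer kops releases
    watchIngress: true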
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.