kops 1.30.4 still has the issue that https://github.com/kubernetes/kops/pull/17161 was meant to fix.
/kind bug
**1. What kops version are you running?** 1.30.4
**2. What version of Kubernetes are you running?** 1.30.8
**3. What cloud provider are you using?** AWS
**4. What commands did you run? What is the simplest way to reproduce this issue?**
kops update cluster --yes --lifecycle-overrides IAMRole=ExistsAndWarnIfChanges,IAMRolePolicy=ExistsAndWarnIfChanges,IAMInstanceProfileRole=ExistsAndWarnIfChanges
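The report doesn't show exactly how the credentials were obtained, but the error below indicates kops was already running with credentials assumed from the kops-admin role. A minimal sketch of such a setup (profile name, account ID, state-store bucket, and cluster name are placeholders) might look like:

```sh
# Sketch of a credential setup that leads to the error below; the profile
# name, account ID, state-store bucket, and cluster name are placeholders.
#
# ~/.aws/config
#   [profile kops-admin]
#   role_arn       = arn:aws:iam::111111111111:role/kops-admin.test.io
#   source_profile = default

export AWS_PROFILE=kops-admin
export KOPS_STATE_STORE=s3://example-kops-state-store

kops update cluster --name my.example.com --yes \
  --lifecycle-overrides IAMRole=ExistsAndWarnIfChanges,IAMRolePolicy=ExistsAndWarnIfChanges,IAMInstanceProfileRole=ExistsAndWarnIfChanges
```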
**5. What happened after the commands executed?**
SDK 2025/01/31 08:40:37 DEBUG request failed with unretryable error https response error StatusCode: 403, RequestID: 88d417e4-f17d-4401-ad21-86965447eb75, api error AccessDenied: User: arn:aws:sts:::assumed-role/kops-admin.test.io/aws-go-sdk-1738312837464117679 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam:::role/kops-admin.test.io
Error: error determining default DNS zone: error querying zones: error listing hosted zones: operation error Route 53: ListHostedZones, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 88d417e4-f17d-4401-ad21-86965447eb75, api error AccessDenied: User: arn:aws:sts:::assumed-role/kops-admin.test.io/aws-go-sdk-1738312837464117679 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam:::role/kops-admin.test.io
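The denied call is effectively a self-assume: the caller is already kops-admin.test.io and kops asks STS for that same role again. A couple of AWS CLI checks can confirm this (role name taken from the error above; the Arn shown in the comment is illustrative):

```sh
# The credentials in use already belong to the role kops then tries to assume.
aws sts get-caller-identity
# "Arn": "arn:aws:sts::111111111111:assumed-role/kops-admin.test.io/aws-go-sdk-..."

# Unless the trust policy of kops-admin.test.io allows the role to assume
# itself, the second sts:AssumeRole call is rejected with AccessDenied.
aws iam get-role \
  --role-name kops-admin.test.io \
  --query 'Role.AssumeRolePolicyDocument'
```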
**6. What did you expect to happen?** The above issue should be resolved with kops 1.30.4.
**7. Please provide your cluster manifest. Execute `kops get --name my.example.com -o yaml` to display your cluster manifest. You may want to remove your cluster name and other sensitive information.**
kind: Cluster
metadata:
  creationTimestamp: null
spec:
  additionalSans:
  api:
    loadBalancer:
      additionalSecurityGroups:
      class: Network
      type: Internal
  assets:
  authentication:
    aws:
      backendMode: CRD
  authorization:
    rbac: {}
  certManager:
    enabled: true
  channel: stable
  cloudConfig:
    awsEBSCSIDriver:
      enabled: true
  cloudControllerManager:
    cloudProvider: aws
    image: repo_name/provider-aws/cloud-controller-manager:v1.29.6
  cloudLabels:
    Application: kubernetes
    Product: kubernetes
    Terraform: "true"
    department: ips
    environment: dev
    stage: feature
  cloudProvider: aws
  containerd:
    logLevel: warn
    nvidiaGPU:
      enabled: true
    runc:
      version: 1.1.12
    version: 1.7.16
  etcdClusters:
  - etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-eu-central-1a
      name: a
    - encryptedVolume: true
      instanceGroup: master-eu-central-1b
      name: b
    - encryptedVolume: true
      instanceGroup: master-eu-central-1c
      name: c
    manager:
      env:
      - name: ETCD_LISTEN_METRICS_URLS
        value: http://0.0.0.0:8081
      - name: ETCD_METRICS
        value: extensive
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 7d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 14d
      logLevel: 1
    name: main
  - etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-eu-central-1a
      name: a
    - encryptedVolume: true
      instanceGroup: master-eu-central-1b
      name: b
    - encryptedVolume: true
      instanceGroup: master-eu-central-1c
      name: c
    manager:
      logLevel: 1
    name: events
  fileAssets:
  - content: |
      apiVersion: audit.k8s.io/v1 # This is required.
      kind: Policy
      # Don't generate audit events for all requests in RequestReceived stage.
      omitStages:
        - "RequestReceived"
      rules:
        # Log pod changes at RequestResponse level
        - level: RequestResponse
          verbs: ["create", "patch", "update", "delete", "deletecollection"]
          resources:
          - group: ""
            # Resource "pods" doesn't match requests to any subresource of pods,
            # which is consistent with the RBAC policy.
            resources: ["pods"]

        # Don't log requests to a configmap called "controller-leader"
        - level: None
          resources:
          - group: ""
            resources: ["configmaps"]
            resourceNames: ["controller-leader"]

        # Log configmap and secret changes at the Metadata level.
        - level: Metadata
          resources:
          - group: "" # core API group
            resources: ["secrets", "configmaps"]

        # Do not log from kube-system and nodes accounts
        - level: None
          userGroups:
          - system:serviceaccounts:kube-system
          - system:nodes

        # Do not log from some system users
        - level: None
          users:
          - system:apiserver
          - system:kube-proxy
          - system:kube-scheduler
          - system:kube-controller-manager
          - system:node
          - system:serviceaccount:core-stack:cluster-autoscaler-chart-aws-cluster-autoscaler
          - system:serviceaccount:core-stack:external-dns
          - system:serviceaccount:istio-system:istiod-service-account
          - system:volume-scheduler

        # Don't log these read-only URLs.
        - level: None
          nonResourceURLs:
          - "/healthz*"
          - "/version"
          - "/swagger*"
          - "/logs"
          - "/metrics"

        # Don't log authenticated requests to certain non-resource URL paths.
        - level: None
          userGroups: ["system:authenticated"]
          nonResourceURLs:
          - "/api*" # Wildcard matching.
          - "/version"

        # Log on Metadata from gatekeeper accounts
        - level: Metadata
          userGroups:
          - system:serviceaccounts:gatekeeper-system

        # Log All changes at RequestResponse level
        - level: RequestResponse
          verbs: ["create", "patch", "update", "delete", "deletecollection"]

        # Log all other resources in core and extensions at the Request level.
        - level: Request
          resources:
          - group: "" # core API group
          - group: "extensions" # Version of group should NOT be included.

        # A catch-all rule to log all other requests at the Metadata level.
        - level: Metadata
          # Long-running requests like watches that fall under this rule will not
          # generate an audit event in RequestReceived.
          omitStages:
            - "RequestReceived"
    name: audit-policy-config
    path: /srv/kubernetes/kube-apiserver/audit-policy-config.yaml
    roles:
    - ControlPlane
  iam:
    allowContainerRegistry: true
    legacy: false
    useServiceAccountExternalPermissions: false
  kubeAPIServer:
    auditLogMaxAge: 10
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /srv/kubernetes/kube-apiserver/audit-policy-config.yaml
    cloudProvider: external
    oidcClientID: kubernetes
    oidcGroupsClaim: groups
  kubeDNS:
    nodeLocalDNS:
      enabled: true
      forwardToKubeDNS: true
    provider: CoreDNS
    tolerations:
    - effect: NoSchedule
      key: component
      operator: Equal
      value: core
    - key: CriticalAddonsOnly
      operator: Exists
  kubeProxy:
    metricsBindAddress: 0.0.0.0
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
  kubernetesVersion: 1.30.8
  networkCIDR: 10.151.24.0/21
  networkID: vpc-099f946314a45d38c
  networking:
    cni: {}
  nodeTerminationHandler:
    cpuRequest: 10m
    enableRebalanceDraining: false
    enableRebalanceMonitoring: false
    enableSQSTerminationDraining: false
    enabled: true
    prometheusEnable: true
  nonMasqueradeCIDR: 100.64.0.0/10
  serviceAccountIssuerDiscovery:
    discoveryStore: s3://oidc-f0122fa2d-987360431102
    enableAWSOIDCProvider: true
  topology:
    dns:
      type: Private
**8. Please run the commands with most verbose logging by adding the `-v 10` flag. Paste the logs into this report, or in a gist and provide the gist link here.**
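A re-run that captures the full `-v 10` output for a gist could look like this (the cluster name is a placeholder):

```sh
# Re-run the failing command with maximum verbosity and keep the output for a gist.
kops update cluster --name my.example.com --yes -v 10 \
  --lifecycle-overrides IAMRole=ExistsAndWarnIfChanges,IAMRolePolicy=ExistsAndWarnIfChanges,IAMInstanceProfileRole=ExistsAndWarnIfChanges \
  2>&1 | tee kops-update-v10.log
```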
**9. Anything else do we need to know?**
Same
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with Issue Triage
> Please send feedback to sig-contributor-experience at kubernetes/community.
> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
FWIW, when using IRSA and kops 1.31.1, I get an error where kops tries to assume the role configured in the AWS environment variables a second time. This was supported by default until 2023 and can still be made to work by adjusting the role's trust policy, but it would be nice not to have to.
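The trust-policy adjustment mentioned here boils down to letting the role explicitly trust itself, so the extra self-AssumeRole call succeeds. A hedged sketch, with a placeholder account ID and the role name taken from the error earlier in this issue:

```sh
# trust-policy.json: keep the existing principals and additionally let the
# role assume itself; 111111111111 is a placeholder account ID.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:role/kops-admin.test.io" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam update-assume-role-policy \
  --role-name kops-admin.test.io \
  --policy-document file://trust-policy.json
```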