cluster-api-provider-aws
Tasks for adopting CAPI's Server Side Apply
This issue tracks the tasks needed to make CAPI's SSA (Server-Side Apply) work with CAPA.
Why do we need this?
When using ClusterClass, CAPA's spec.network.subnets is co-authored by the CAPI and CAPA controllers. To manage such co-authored slices properly and prevent both controllers from continuously patching them, CAPI now uses Server-Side Apply.
- [x] Issue: https://github.com/kubernetes-sigs/cluster-api/issues/6320
- [x] Solution: https://github.com/kubernetes-sigs/cluster-api/pull/6495
Changes Required in CAPA
- [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3531
The following issues require the v1beta2 API version bump as a prerequisite.
- [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3528
- [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3536
Other Issues to Follow
- [x] https://github.com/kubernetes-sigs/controller-tools/pull/692
- This is needed to properly generate CRD manifests with the list markers. Currently, we are using a hack to overcome this issue.
- [x] https://github.com/kubernetes-sigs/cluster-api/issues/6650
- There is an issue with controller metadata in logging. The log prints out a wrong controller type and kind.
CAPA issues that will be resolved
- [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3399
- [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3397
PoC
While waiting for the controller-tools and listMapKey issues to be resolved, I did an initial PoC with test-purpose CRDs. This required some hacks, so the result needs to be confirmed once all the tasks listed in the Changes Required in CAPA section are completed.
Hacks
- Used []SubnetSpec, a slice, as the type of Subnets for CRD manifest generation.
// +optional
// +listType=map
// +listMapKey=id
Subnets []SubnetSpec `json:"subnets,omitempty"`
- Made subnet.id a required field in the CRD so that it can be used as the listMapKey.
Scenario: BYO Infra Case
AWSClusterTemplate in ClusterClass
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSClusterTemplate
metadata:
  name: um-ec2-clusterclass-v1
spec:
  template:
    spec:
      network:
        vpc:
          id: vpc-0e38e0a4712b9b316
        subnets:
        - id: subnet-0588d98dd78abf69b
          availabilityZone: us-west-1c
          isPublic: true
        - id: subnet-0454fcf4f534539df
          availabilityZone: us-west-1c
      region: REPLACEME
      sshKeyName: REPLACEME
Findings
- Observed that the AWSCluster .spec.network.subnets value doesn't oscillate. Before SSA, both the CAPA and CAPI controllers constantly patched the field, so its value kept changing, as observed here.
- Managed fields show that both the CAPI and CAPA controllers own parts of .spec.network.subnets:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  ...
  managedFields:
  - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:cluster.x-k8s.io/cloned-from-groupkind: {}
          f:cluster.x-k8s.io/cloned-from-name: {}
        f:labels:
          f:cluster.x-k8s.io/cluster-name: {}
          f:topology.cluster.x-k8s.io/owned: {}
      f:spec:
        f:bastion:
          f:allowedCIDRBlocks: {}
          f:enabled: {}
        f:controlPlaneLoadBalancer:
          f:crossZoneLoadBalancing: {}
          f:scheme: {}
        f:identityRef:
          f:kind: {}
          f:name: {}
        f:network:
          f:cni:
            f:cniIngressRules: {}
          f:subnets: ⬅️
            k:{"id":"subnet-0454fcf4f534539df"}:
              .: {}
              f:availabilityZone: {}
              f:id: {}
              f:isPublic: {}
            k:{"id":"subnet-0588d98dd78abf69b"}:
              .: {}
              f:availabilityZone: {}
              f:id: {}
              f:isPublic: {}
          f:vpc:
            f:availabilityZoneSelection: {}
            f:availabilityZoneUsageLimit: {}
            f:id: {}
        f:region: {}
        f:sshKeyName: {}
    manager: capi-topology ⬅️
    operation: Apply
    time: "2022-06-15T12:54:06Z"
  - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"awscluster.infrastructure.cluster.x-k8s.io": {}
      f:spec:
        f:controlPlaneEndpoint:
          f:host: {}
          f:port: {}
        f:network:
          f:subnets: ⬅️
            k:{"id":"subnet-0454fcf4f534539df"}:
              f:cidrBlock: {}
              f:routeTableId: {}
              f:tags:
                .: {}
                f:Name: {}
                f:kubernetes.io/cluster/um-ec2-cc-cluster: {}
                f:kubernetes.io/cluster/um-ec2-cluster: {}
                f:kubernetes.io/role/internal-elb: {}
            k:{"id":"subnet-0588d98dd78abf69b"}:
              f:cidrBlock: {}
              f:natGatewayId: {}
              f:routeTableId: {}
              f:tags:
                .: {}
                f:Name: {}
                f:kubernetes.io/cluster/um-ec2-cc-cluster: {}
                f:kubernetes.io/cluster/um-ec2-cluster: {}
                f:kubernetes.io/role/elb: {}
          f:vpc:
            f:cidrBlock: {}
            f:tags:
              .: {}
              f:Name: {}
    manager: cluster-api-provider-aws-controller ⬅️
    operation: Update
    time: "2022-06-15T12:55:43Z"
...
/triage accepted
/priority important-soon
OCI provider fix for the problem: https://github.com/oracle/cluster-api-provider-oci/pull/116