cluster-api-provider-aws

Tasks for adopting CAPI's Server Side Apply

Open pydctw opened this issue 3 years ago • 3 comments

This issue tracks the tasks needed to make CAPI's Server-Side Apply (SSA) work with CAPA.

Why do we need this?

CAPA's spec.network.subnets is co-authored by the CAPI and CAPA controllers when using ClusterClass. To manage these co-authored slices properly and stop both controllers from patching them continuously, CAPI now uses Server-Side Apply.

  • [x] Issue: https://github.com/kubernetes-sigs/cluster-api/issues/6320
  • [x] Solution: https://github.com/kubernetes-sigs/cluster-api/pull/6495

Changes Required in CAPA

  • [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3531

The following issues require the v1beta2 API version bump as a prerequisite.

  • [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3528
  • [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3536

Other Issues to Follow

  • [x] https://github.com/kubernetes-sigs/controller-tools/pull/692
    • This is needed to properly generate CRD manifests with the list markers. Currently, we are using a hack to overcome this issue.
  • [x] https://github.com/kubernetes-sigs/cluster-api/issues/6650
    • There is an issue with controller metadata in logging: the log prints the wrong controller type and kind.

CAPA issues that will be resolved

  • [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3399
  • [ ] https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3397

pydctw avatar Jun 15 '22 11:06 pydctw

PoC

While waiting for the controller-tools and listMapKey issues to be worked on, I did an initial PoC with test-purpose CRDs. This required some hacks, so the result needs to be confirmed once all the tasks listed in the Changes Required in CAPA section are completed.

Hacks

  • Used []SubnetSpec, a slice, as the type of Subnets so that the list markers are picked up during CRD manifest generation.
// +optional
// +listType=map
// +listMapKey=id
Subnets []SubnetSpec `json:"subnets,omitempty"`
  • Made subnet.id a required field in the CRD so that it can be used as the listMapKey.
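
The two hacks above can be sketched together as Go types. This is only an illustration under assumptions: SubnetSpec and NetworkSpec here are trimmed-down, hypothetical stand-ins for the PoC's test CRD types (only fields that show up in this issue are kept), and the encode helper is made up to show the resulting JSON shape. It is not the actual CAPA API definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SubnetSpec is a hypothetical, trimmed-down subnet type for illustration.
type SubnetSpec struct {
	// ID has no omitempty and is marked required, because a field used as
	// a list map key has to be present on every list item.
	// +kubebuilder:validation:Required
	ID string `json:"id"`

	// +optional
	AvailabilityZone string `json:"availabilityZone,omitempty"`

	// +optional
	IsPublic bool `json:"isPublic,omitempty"`
}

// NetworkSpec declares Subnets as a plain slice carrying the list markers,
// so the generated CRD schema can describe it as a map-type list keyed by id.
type NetworkSpec struct {
	// +optional
	// +listType=map
	// +listMapKey=id
	Subnets []SubnetSpec `json:"subnets,omitempty"`
}

// encode is a made-up helper that serializes the spec to JSON.
func encode(n NetworkSpec) string {
	b, err := json.Marshal(n)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	spec := NetworkSpec{Subnets: []SubnetSpec{
		{ID: "subnet-0588d98dd78abf69b", AvailabilityZone: "us-west-1c", IsPublic: true},
		{ID: "subnet-0454fcf4f534539df", AvailabilityZone: "us-west-1c"},
	}}
	fmt.Println(encode(spec))
}
```

The markers are ordinary comments as far as the Go compiler is concerned; they only matter to the CRD generator, which is why the controller-tools fix listed above is a dependency.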

Scenario: BYO Infra Case

AWSClusterTemplate in ClusterClass

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSClusterTemplate
metadata:
  name: um-ec2-clusterclass-v1
spec:
  template:
    spec:
      network:
        vpc:
          id: vpc-0e38e0a4712b9b316
        subnets:
          - id: subnet-0588d98dd78abf69b
            availabilityZone: us-west-1c
            isPublic: true
          - id: subnet-0454fcf4f534539df
            availabilityZone: us-west-1c
      region: REPLACEME
      sshKeyName: REPLACEME

Findings

  • Observed that the AWSCluster .spec.network.subnets value no longer oscillates. Before SSA, both the CAPA and CAPI controllers patched the field constantly, so its value kept changing, as observed here.
  • The managed fields show that the CAPI and CAPA controllers each own parts of .spec.network.subnets:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  ...
  managedFields:
  - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:cluster.x-k8s.io/cloned-from-groupkind: {}
          f:cluster.x-k8s.io/cloned-from-name: {}
        f:labels:
          f:cluster.x-k8s.io/cluster-name: {}
          f:topology.cluster.x-k8s.io/owned: {}
      f:spec:
        f:bastion:
          f:allowedCIDRBlocks: {}
          f:enabled: {}
        f:controlPlaneLoadBalancer:
          f:crossZoneLoadBalancing: {}
          f:scheme: {}
        f:identityRef:
          f:kind: {}
          f:name: {}
        f:network:
          f:cni:
            f:cniIngressRules: {}
          f:subnets: ⬅️
            k:{"id":"subnet-0454fcf4f534539df"}:
              .: {}
              f:availabilityZone: {}
              f:id: {}
              f:isPublic: {}
            k:{"id":"subnet-0588d98dd78abf69b"}:
              .: {}
              f:availabilityZone: {}
              f:id: {}
              f:isPublic: {}
          f:vpc:
            f:availabilityZoneSelection: {}
            f:availabilityZoneUsageLimit: {}
            f:id: {}
        f:region: {}
        f:sshKeyName: {}
    manager: capi-topology ⬅️
    operation: Apply
    time: "2022-06-15T12:54:06Z"
  - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"awscluster.infrastructure.cluster.x-k8s.io": {}
      f:spec:
        f:controlPlaneEndpoint:
          f:host: {}
          f:port: {}
        f:network:
          f:subnets: ⬅️
            k:{"id":"subnet-0454fcf4f534539df"}:
              f:cidrBlock: {}
              f:routeTableId: {}
              f:tags:
                .: {}
                f:Name: {}
                f:kubernetes.io/cluster/um-ec2-cc-cluster: {}
                f:kubernetes.io/cluster/um-ec2-cluster: {}
                f:kubernetes.io/role/internal-elb: {}
            k:{"id":"subnet-0588d98dd78abf69b"}:
              f:cidrBlock: {}
              f:natGatewayId: {}
              f:routeTableId: {}
              f:tags:
                .: {}
                f:Name: {}
                f:kubernetes.io/cluster/um-ec2-cc-cluster: {}
                f:kubernetes.io/cluster/um-ec2-cluster: {}
                f:kubernetes.io/role/elb: {}
          f:vpc:
            f:cidrBlock: {}
            f:tags:
              .: {}
              f:Name: {}
    manager: cluster-api-provider-aws-controller ⬅️
    operation: Update
    time: "2022-06-15T12:55:43Z"
    ...
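
The ownership split in the managed fields above works because list items are matched by their id key instead of the whole list being replaced. A minimal sketch of that key-based merge follows, assuming a much simplified model: entry and mergeByKey are hypothetical names, the cidrBlock and routeTableId values are made up, and real Server-Side Apply additionally tracks per-manager ownership, detects conflicts, and removes fields a manager stops applying.

```go
package main

import "fmt"

// entry is one list item as a field-to-value map; a stand-in for a subnet.
type entry map[string]string

// mergeByKey overlays the items in updates onto base, matching items by the
// value of key. Fields present in an update are written; other fields on the
// matched item are left untouched. Items with no match are appended.
func mergeByKey(base, updates []entry, key string) []entry {
	out := make([]entry, 0, len(base)+len(updates))
	index := make(map[string]int, len(base))

	// Copy base items so the caller's data is not mutated.
	for _, item := range base {
		copied := entry{}
		for field, value := range item {
			copied[field] = value
		}
		index[item[key]] = len(out)
		out = append(out, copied)
	}

	for _, item := range updates {
		pos, found := index[item[key]]
		if !found {
			pos = len(out)
			index[item[key]] = pos
			out = append(out, entry{})
		}
		for field, value := range item {
			out[pos][field] = value
		}
	}
	return out
}

func main() {
	// Fields the topology controller sets (from the ClusterClass template).
	fromTopology := []entry{
		{"id": "subnet-0588d98dd78abf69b", "availabilityZone": "us-west-1c"},
		{"id": "subnet-0454fcf4f534539df", "availabilityZone": "us-west-1c"},
	}
	// Fields the provider controller fills in; values here are invented.
	fromProvider := []entry{
		{"id": "subnet-0454fcf4f534539df", "cidrBlock": "10.0.1.0/24"},
		{"id": "subnet-0588d98dd78abf69b", "routeTableId": "rtb-example"},
	}
	for _, subnet := range mergeByKey(fromTopology, fromProvider, "id") {
		fmt.Println(subnet)
	}
}
```

Without a key, the list is treated as one atomic value, so each controller's write replaces the other's fields, which matches the oscillation described before SSA.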

pydctw avatar Jun 15 '22 16:06 pydctw

/triage accepted
/priority important-soon

sedefsavas avatar Jun 15 '22 19:06 sedefsavas

OCI provider fix for the problem: https://github.com/oracle/cluster-api-provider-oci/pull/116

sedefsavas avatar Jul 28 '22 19:07 sedefsavas