k8s-cluster-api-provider icon indicating copy to clipboard operation
k8s-cluster-api-provider copied to clipboard

Prefix all resources with PREFIX and CLUSTER_NAME

Open garloff opened this issue 1 year ago • 8 comments

As a cluster administrator, I want to be able to identify all IaaS resources that belong to a cluster, so I know what to save and what I can possibly safely delete when cleaning up.

Looking at the various resources at the OpenStack level, we do not consistently prefix everything with (per mgmthost) prefix and clustername. Here's what we find:

MgmtServer:

  • VM: $PREFIX-mgmtcluster (OK)
  • Root Volume: Unnamed (Not Good)
  • AppCred: $PREFIX-appcred (OK)
  • Port: $PREFIX-port (OK)
  • Network: $PREFIX-net (OK)
  • Subnet: $PREFIX-subnet (OK)
  • FloatingIP: Unnamed (but that can't be fixed -- we could assign a tag ... or at least pu a description)
  • SGs: $PREFIX-allow-icmp/ssh, $PREFIX-mgmt (OK)
  • Router: $PREFIX-rtr (OK)
  • Images: ubuntu-capi-image-v1.xx.x (We want this, as these should be shared for performance and cost)

Cluster

  • VMs: $PREFIX-$CLUSTER_NAME-md/control-plane (OK)
  • Root Volumes: $PREFIX-$CLUSTER_NAME-*-root (OK)
  • AppCred: $PREFIX-$CLUSTER_NAME-appcred (OK)
  • Ports: $PREFIX-$CLUSTER_NAME-md/control-plane-* (OK)
  • Networks: k8s-clusterapi-cluster-$CLUSTER_NAME-$CLUSTER_NAME (Not good: Lacks $PREFIX)
  • Subnets: dito (Not good)
  • SGs: $PREFIX-$CLUSTER_NAME-cilium (OK), k8s-cluster-$CLUSTER_NAME-$CLUSTER_NAME-secgroup-worker/controlplane (Not good: Lacks $PREFIX)
  • Router: k8s-clusterapi-cluster-$CLUSTER_NAME-$CLUSTER_NAME (Not good: Lacks $PREFIX)
  • Floating IP (kubeapi): Unnamed (we can't fix that. The description has the $CLUSTER_NAME, but not the $PREFIX, cloud also use a tag)
  • Loadbalancers (kubeapi): k8s-clusterapi-cluster-$CLUSTER_NAME-$CLUSTER_NAME-kubeapi (not good: Lacks $PREFIX)

Ingress

  • Floating IP (ingress): Unnamed (can't be fixed. The description has the $CLUSTER_NAME inside, but not the $PREFIX. Could also use a tag)
  • Loadbalancer (ingress): kube_service_$CLUSTER_NAME_ingress* (Not good: Lacks $PREFIX)
  • SGs (ingress): lb-sg-UUID-ingress* (Not good: Lacks $PREFIX and $CLUSTER_NAME, though the latter finds itself in the description)

Persistent Volumes

  • Volumes (PVs from CSI): pvc-UUID (Not good: Lacks $PREFIX and $CLUSTER_NAME -- and $VOLUME_NAME)

I may have overlooked some resource.

So, we should look at

  • mgmtcluster root volume (terraform)
  • FIP description/tag to mgmtcluster
  • Networks and Subnets pre cluster (lack $PREFIX)
  • Worker/Controlplane SGs (lack $PREFIX)
  • Same for per-cluster router
  • FIPs for kubeapi and ingress: description/tag should container $PREFIX and $CLUSTER_NAME
  • LBs for kubeapi and ingress lack $PREFIX in name
  • SGs (ingress) lack $PREFIX and $CLUSTER_NAME
  • PVs lack everything ...

I am not sure that all of this can easily be addressed without hacking OCCM/CSI. But then we maybe should do that ...

garloff avatar Aug 14 '23 21:08 garloff

I should mention that providing consistency here would make the make fullclean option a lot better and safer. (See #484, #492).

garloff avatar Aug 14 '23 21:08 garloff

Networking

The CAPO has in its networking module following:

const (
	networkPrefix string = "k8s-clusterapi"
	trunkResource string = "trunks"
	portResource  string = "ports"
)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/cc304b8af1d5f637419273b0702a0311b3c17b01/pkg/cloud/services/networking/service.go#L32

Router

Then the router name is obtained by:

func getRouterName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s", networkPrefix, clusterName)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/router.go#L286C1-L288C2

Subnet

The network and subnet name is obtained by:

func getSubnetName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s", networkPrefix, clusterName)
}

func getNetworkName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s", networkPrefix, clusterName)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/network.go#L391C1-L397C2

Floating IP description

func GetDescription(clusterName string) string {
	return fmt.Sprintf("Created by cluster-api-provider-openstack cluster %s", clusterName)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/utils/names/names.go#L23-L25

However, this description get used in multiple resources:

  • floating IP
  • trunk
  • port
  • router
  • loadbalancer
  • network

Floating IP Tags

	fpCreateOpts.FloatingNetworkID = openStackCluster.Status.ExternalNetwork.ID
	fpCreateOpts.Description = names.GetDescription(clusterName)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/floatingip.go#L50-L51

Here it would be probably possible to add tags.

Loadbalancer

The CAPO has the following in the load balancer module:

const (
	networkPrefix   string = "k8s-clusterapi"
	kubeapiLBSuffix string = "kubeapi"
)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/cc304b8af1d5f637419273b0702a0311b3c17b01/pkg/cloud/services/loadbalancer/loadbalancer.go#L42C1-L45

And then the name is obtained by:

func getLoadBalancerName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s-%s", networkPrefix, clusterName, kubeapiLBSuffix)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/cc304b8af1d5f637419273b0702a0311b3c17b01/pkg/cloud/services/loadbalancer/loadbalancer.go#L569-L571

fdobrovolny avatar Aug 17 '23 11:08 fdobrovolny

Security Groups

const (
	secGroupPrefix     string = "k8s"
	controlPlaneSuffix string = "controlplane"
	workerSuffix       string = "worker"
	bastionSuffix      string = "bastion"
	remoteGroupIDSelf  string = "self"
)

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/securitygroups.go#L30-L36

func getSecControlPlaneGroupName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s-secgroup-%s", secGroupPrefix, clusterName, controlPlaneSuffix)
}

func getSecWorkerGroupName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s-secgroup-%s", secGroupPrefix, clusterName, workerSuffix)
}

func getSecBastionGroupName(clusterName string) string {
	return fmt.Sprintf("%s-cluster-%s-secgroup-%s", secGroupPrefix, clusterName, bastionSuffix)
}

https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/pkg/cloud/services/networking/network.go#L391C1-L397C2

fdobrovolny avatar Aug 17 '23 11:08 fdobrovolny

Ingress

In cloud provider for OpenStack in load balancer following setting is set:

Load balancer

const (
	servicePrefix                       = "kube_service_"
...

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L61

func (lbaas *LbaasV2) GetLoadBalancerName(_ context.Context, clusterName string, service *corev1.Service) string {
	name := fmt.Sprintf("%s%s_%s_%s", servicePrefix, clusterName, service.Namespace, service.Name)
	return cpoutil.CutString255(name)
}

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L586-L589

SecurityGroup

func getSecurityGroupName(service *corev1.Service) string {
	securityGroupName := fmt.Sprintf("lb-sg-%s-%s-%s", service.UID, service.Namespace, service.Name)
	//OpenStack requires that the name of a security group is shorter than 255 bytes.
	if len(securityGroupName) > 255 {
		securityGroupName = securityGroupName[:255]
	}

	return securityGroupName
}

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L415-L423

Floating IP

			floatIPOpts := floatingips.CreateOpts{
				FloatingNetworkID: svcConf.lbPublicNetworkID,
				PortID:            portID,
				Description:       fmt.Sprintf("Floating IP for Kubernetes external service %s from cluster %s", serviceName, clusterName),
			}

https://github.com/kubernetes/cloud-provider-openstack/blob/f6ca48ee768260444db6762447f1cd071f52b6fd/pkg/openstack/loadbalancer.go#L978-L982

fdobrovolny avatar Aug 17 '23 11:08 fdobrovolny

Summary

  • I have not been able to find where PVCs get their names.
  • I have created a PR for updating our terraform.
  • I have identified that the names cannot be updated without changes to the core software.
  • The name generations in all cases rely on constants, so it should not be hard to make those constants configurable and likely to make them into upstream change.
  • The floating IP description could be changed and also be made configurable.
  • Adding tags to floating IP seems realistic as well.

fdobrovolny avatar Aug 17 '23 11:08 fdobrovolny

  • I have not been able to find where PVCs get their names.

In the past, I created an issue yaook/k8s#465. It is not about PV names(the user specifies PVC names, and PV names look random(probably some request ID)) but their description and tags.

Each volume created by cinder CSI has description Created by OpenStack Cinder CSI driver. Also, the tag cinder.csi.openstack.org/cluster=<cluster_name> is added there. We already set this value here. So in the volume metadata we can see property like cinder.csi.openstack.org/cluster='testcluster'. The same happens with volume snapshots.

chess-knight avatar Aug 24 '23 07:08 chess-knight

Let's split this in multiple tasks

jschoone avatar Sep 07 '23 09:09 jschoone

After the ClusterClass feature #600, control-plane resources also lack $PREFIX. In this issue, it was marked as OK, but now it should be:

  • VMs: $CLUSTER_NAME-control-plane (Not good: Lacks $PREFIX)
  • Ports: $CLUSTER_NAME-control-plane-* (Not good: Lacks $PREFIX)
  • Root Volume: ...

Because of this change, we had to also modify our full cleanup script, from grep $PREFIX-$CLUSTER_NAME to grep "$PREFIX-$CLUSTER_NAME\|$CLUSTER_NAME-control-plane" which is IMO not good.

Also, in the PR https://github.com/SovereignCloudStack/k8s-cluster-api-provider/pull/661#issuecomment-1842931890, the naming of control-plane resources will be reduced even more.

The naming of ClusterClass resources can be seen in the CAPI issue https://github.com/kubernetes-sigs/cluster-api/issues/9135 - we can also see, that machine deployment resources have no problem so far because, in the naming scheme, there is also {topology-name} which is in our case is "${PREFIX}-${CLUSTER_NAME}-md-0-no1".

The only way as far as I can see for now is to add a prefix to control-plane resources by changing the cluster name itself by adding of prefix there.

chess-knight avatar Dec 18 '23 08:12 chess-knight

I'll close this because it's related to KaaS v1 which will be deprecated after R7. Feel free to reopen, if I'm wrong

jschoone avatar Jun 18 '24 12:06 jschoone

I'll close this because it's related to KaaS v1 which will be deprecated after R7. Feel free to reopen, if I'm wrong

Just realized, I was wrong. We have the problem, that the name for the ingress-nginx loadbalancer is always the same and then it errors inside of openstack when there are more then one, related to this comment. @michal-gubricky please find out how to configure clusterName using Cluster API to avoid changing code upstream.

jschoone avatar Jul 03 '24 09:07 jschoone

Seems like this value will be enough to set.

jschoone avatar Jul 04 '24 09:07 jschoone

Seems like this value will be enough to set.

I will take a look at it and test it.

michal-gubricky avatar Jul 04 '24 12:07 michal-gubricky

Seems like this value will be enough to set.

I will take a look at it and test it.

After investigation, setting this variable should do the job. The cluster name is set to kubernetes by default via an environment variable. This can be seen, for example, by describing a pod of OCCM.

    Environment:
      CLOUD_CONFIG:  /etc/config/cloud.conf
      CLUSTER_NAME:  kubernetes

and can also be found in values.yaml

michal-gubricky avatar Jul 04 '24 14:07 michal-gubricky