
error when syncing StorageClass (vcluster->host) after vcluster restart

joaocc opened this issue 1 year ago · 4 comments

What happened?

Configured vcluster to sync storage classes. Added a couple of storage classes (efs.csi driver with parameters) and they replicate without any issues. After restarting the vcluster pods, I get `error when patching "manifests": StorageClass.storage.k8s.io "efs--something" is invalid: parameters: Forbidden: updates to parameters are forbidden`.
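For context, a minimal sketch of how this sync is typically enabled in the vcluster values, assuming the 0.16.x values layout (the option name comes from the `sync.storageclasses` parameter mentioned later in this thread):

```yaml
# Sketch of a vcluster 0.16.x-style values.yaml enabling StorageClass sync
# from the virtual cluster to the host (assumed layout; adjust to your chart version)
sync:
  storageclasses:
    enabled: true
```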

What did you expect to happen?

That storage classes synchronize correctly.

How can we reproduce it (as minimally and precisely as possible)?

Storage class inside vcluster:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs--0adm-root
  annotations:
    something: here
provisioner: efs.csi.aws.com
parameters:
  basePath: /
  directoryPerms: '700'
  ensureUniqueDirectory: ''
  fileSystemId: fs-00000000001111118
  fsType: efs
  gid: ''
  gidRangeEnd: ''
  gidRangeStart: ''
  provisioningMode: efs-ap
  subPathPattern: /
  uid: ''
reclaimPolicy: Retain
mountOptions:
  - noatime
allowVolumeExpansion: true
volumeBindingMode: Immediate
```
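A rough sketch of the reproduction steps, assuming a k3s-based vcluster named `my-vcluster` in namespace `vcluster` (both names and the `app=vcluster` label are placeholders taken from the default chart, not from this issue):

```console
# Apply the StorageClass inside the virtual cluster
$ vcluster connect my-vcluster -n vcluster -- kubectl apply -f storageclass.yaml

# Confirm it was synced to the host cluster
$ kubectl get storageclass

# Restart the vcluster pods and watch the syncer logs for the patch error
$ kubectl -n vcluster rollout restart statefulset my-vcluster
$ kubectl -n vcluster logs -l app=vcluster -c syncer -f
```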

Anything else we need to know?

Deleting the storage classes on the host makes the sync work again (they are recreated), but restarting vcluster causes the same issue.
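A minimal sketch of that workaround (the host-side object name may be translated by the syncer, so list the storage classes first; `<host-side-name>` is a placeholder):

```console
# List storage classes on the host; the synced copy's name may be translated
$ kubectl get storageclass

# Delete the host-side copy so the syncer can recreate it on the next sync
$ kubectl delete storageclass <host-side-name>
```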

### Host cluster Kubernetes version

<details>

```console
$ kubectl version
Client Version: v1.28.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.7-eks-4f4795d
```

</details>


### Host cluster Kubernetes distribution

<details>

EKS 1.27


</details>


### vcluster version

<details>

```console
$ vcluster --version
vcluster version 0.16.4
```

</details>


### Vcluster Kubernetes distribution (k3s (default), k8s, k0s)

<details>

eks

</details>


### OS and Arch

<details>

N/A

</details>

joaocc commented on Nov 21 '23 14:11

@joaocc thanks for creating this issue! I don't think we can do much there; you will need to remove it from the manifests or delete it within vcluster and then add it again with the new parameters.

FabianKramm commented on Nov 27 '23 12:11

Not sure I understand the answer. Shouldn't the storage class sync work without causing these issues, which make the system unreliable (any restart of the controller pods causes the sync to fail)? Please note that there were no changes to the storage classes, only a restart (or failover) of the vcluster pods. I haven't seen anything in the documentation stating that only specific types of storage classes (or similar) would work with the sync mechanism, or that specific fields are not handled correctly. I'm struggling to understand whether you are acknowledging this as a bug that will not be fixed, or whether there is a workaround I am not seeing. Thanks

joaocc commented on Nov 27 '23 13:11

@joaocc the problem here is that vcluster cannot apply the storage class changes from your manifests configuration because the storage class is read-only. It's like trying to do a `kubectl apply -f storageclass.yaml` on an existing storage class, which won't work because it already exists and cannot be changed. So you will need to manually remove it before vcluster is able to apply it again, unfortunately. We don't know which objects have this limitation, so this is something that's pretty difficult to do on our side; we also don't want to delete and recreate, as this could cause other issues.

FabianKramm commented on Nov 30 '23 08:11

Hi. Thanks for the update. Let me try to explain why I was thinking this should not be necessary.

  • Storage classes are indeed write-once.
  • Re-running `kubectl apply -f storageclass.yaml` (or the equivalent Terraform `kubectl_manifest` or `kubernetes_manifest`) should be OK as long as there are no changes to the storage class, as `kubectl apply` is idempotent (a minimal check of this is sketched after this list).
  • Problems usually occur if:
    1. kubectl create is called instead of kubectl apply (or the equivalent at the Kubernetes API level);
    2. there are changes to the content of the storage class (which can happen if the sync replicates a set of parameters that includes dynamically generated ones, such as a UID).
  • I didn't have time to look at vcluster code for sync of storage classes, but maybe one or both of the points are happening; since it is controlled by a separate sync parameter (sync.storageclasses), I was assuming it would be possible (conceptually) to have a specific sync behaviour for storage classes (where the API indicates the fields that can be safely re-applied).
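To illustrate the apply behaviour described above, a minimal sketch using the StorageClass from this issue (outputs are approximate, and `storageclass-modified.yaml` is a hypothetical copy with one parameter changed):

```console
# Re-applying an unchanged StorageClass is fine: apply is idempotent
$ kubectl apply -f storageclass.yaml
storageclass.storage.k8s.io/efs--0adm-root unchanged

# Changing any entry under .parameters and re-applying fails, because
# StorageClass parameters are immutable after creation
$ kubectl apply -f storageclass-modified.yaml
The StorageClass "efs--0adm-root" is invalid: parameters: Forbidden: updates to parameters are forbidden
```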

Hope this makes sense, and please excuse me if I am oversimplifying how the syncer works :) Thanks

joaocc commented on Nov 30 '23 16:11

Hey! I was not able to reproduce your issue; the vcluster pods restarted without any problem on my end. I will close this issue. Please reopen if you are able to reproduce it with the latest vcluster version, and please send us both the synced and host storage classes, plus any relevant logs.
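For reference, one way to gather that information; the release and namespace names are placeholders, and the `syncer` container name assumes the default k3s-based chart:

```console
# StorageClass as seen inside the virtual cluster (placeholder names)
$ vcluster connect my-vcluster -n vcluster -- kubectl get storageclass efs--0adm-root -o yaml

# Corresponding object on the host cluster (its name may be translated by the syncer)
$ kubectl get storageclass -o yaml

# Syncer logs from the vcluster pod
$ kubectl -n vcluster logs statefulset/my-vcluster -c syncer
```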

facchettos commented on Feb 19 '24 15:02