object count resource quotas not working and breaking other Tenant functionality
Bug description
The Tenant CR lets you define Kubernetes resource quotas. One such quota is an object count; for example, the number of jobs is limited by count/jobs.batch. With such a configuration, the Tenant object is created, but the quota silently fails to apply. Moreover, some of the other functionality does not work as expected. This was initially fixed in https://github.com/projectcapsule/capsule/issues/507 for the v1beta1 API, but it seems the fix was not carried over to v1beta2.
How to reproduce
- Create a Tenant resource:
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: dummytenant
spec:
  owners:
  - clusterRoles:
    - admin
    - capsule-namespace-deleter
    kind: ServiceAccount
    name: system:serviceaccount:ns:dummy
  resourceQuotas:
    items:
    - hard:
        count/jobs.batch: "2"
        limits.cpu: 300m
        limits.ephemeral-storage: 1Gi
        limits.memory: 1200Mi
    scope: Tenant
- Check the underlying ResourceQuota that is created: it will not contain any jobs quota.
- Moreover, try to create a Namespace as the owner:
kubectl --as=system:serviceaccount:ns:dummy create ns dummytenant-ns1
- This will create the namespace successfully, but when you try to access it with the same owner ServiceAccount, you will not be allowed:
$ kubectl --as=system:serviceaccount:ns:dummy get ns dummytenant-ns1
Error from server (Forbidden): namespaces "dummytenant-ns1" is forbidden: User "system:serviceaccount:ns:dummy " cannot get resource "namespaces" in API group "" in the namespace "dummytenant-ns1"
This happens because Capsule does not create the RoleBindings associated with the new namespace. If you list the RoleBindings, you will not see any created for that Namespace:
$ kubectl get rolebindings -A | grep dummytenant
- If you repeat all steps above without the count/jobs.batch configuration, the RoleBindings will be created as expected.
Expected behavior
- Tenant should be able to define count/jobs.batch ResourceQuotas.
- Errors related to one piece of configuration should not propagate to other functionality (see the Namespace example above).
Logs
Error logs
ResourceQuota \"capsule-dummytenant-0\" is invalid: [metadata.annotations: Invalid value: \"quota.capsule.clastix.io/hard-count/jobs.batch\": a qualified name must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName', or 'my.name', or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]') with an optional DNS subdomain prefix and '/' (e.g. 'example.com/MyName'), metadata.annotations: Invalid value: \"quota.capsule.clastix.io/used-count/jobs.batch\": a qualified name must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName', or 'my.name', or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]') with an optional DNS subdomain prefix and '/' (e.g. 'example.com/MyName')]
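For context on the error above, here is a minimal Python sketch of the qualified-name rule the API server applies to annotation keys. It is a simplified re-implementation for illustration, not the Kubernetes validation code itself: the name pattern is the regex quoted in the error message, while the prefix pattern is an approximation of a DNS subdomain. The `annotation_key` helper and its `sanitize` flag are hypothetical; replacing `/` with `_` is only one possible way to build a valid key and is not confirmed to be what Capsule does.

```python
import re

# Name-part pattern, copied from the regex quoted in the API server error message.
NAME_RE = re.compile(r"([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]")
# Assumption: simplified DNS-subdomain check for the optional prefix.
PREFIX_RE = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_qualified_name(key: str) -> bool:
    """Approximate the qualified-name rule: an optional 'prefix/' plus a name."""
    parts = key.split("/")
    if len(parts) == 1:
        name = parts[0]
    elif len(parts) == 2:
        prefix, name = parts
        if not prefix or len(prefix) > 253 or not PREFIX_RE.fullmatch(prefix):
            return False
    else:
        # More than one '/' can never be parsed as "prefix/name".
        return False
    return 0 < len(name) <= 63 and NAME_RE.fullmatch(name) is not None


def annotation_key(kind: str, resource: str, sanitize: bool = False) -> str:
    """Hypothetical helper: build a quota annotation key for a resource name."""
    if sanitize:
        # Assumption: swap '/' for '_' so the resource fits in the name part.
        resource = resource.replace("/", "_")
    return f"quota.capsule.clastix.io/{kind}-{resource}"


broken = annotation_key("hard", "count/jobs.batch")
fixed = annotation_key("hard", "count/jobs.batch", sanitize=True)
print(broken, is_qualified_name(broken))
print(fixed, is_qualified_name(fixed))
```

The first key contains two `/` characters, so it is rejected regardless of the characters used; the sanitized key keeps a single `/` and passes. Keys for resources without a slash, such as limits.cpu, are unaffected, which matches the observation that the Tenant works once count/jobs.batch is removed.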
Additional context
- Capsule version: v0.3.3
- Helm Chart version: 0.4.3
- Kubernetes version: v1.27.7
Thanks for opening this, @adabuleanu.
I'll try to raise a PR to solve this; it would be great if you could give it a try!
@adabuleanu I'm testing this by running Capsule v0.5.0 (ghcr.io/projectcapsule/capsule:v0.5.0) but I'm not able to replicate the issue.
$: kubectl get tnt dummytenant -o yaml
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"capsule.clastix.io/v1beta2","kind":"Tenant","metadata":{"annotations":{},"name":"dummytenant"},"spec":{"owners":[{"clusterRoles":["admin","capsule-namespace-deleter"],"kind":"ServiceAccount","name":"system:serviceaccount:ns:dummy"}],"resourceQuotas":{"items":[{"hard":{"count/jobs.batch":"2","limits.cpu":"300m","limits.ephemeral-storage":"1Gi","limits.memory":"1200Mi"}}],"scope":"Tenant"}}}
  creationTimestamp: "2024-01-23T19:43:42Z"
  generation: 2
  labels:
    kubernetes.io/metadata.name: dummytenant
  name: dummytenant
  resourceVersion: "3568938"
  uid: 5aa9eb2a-c78b-4805-9c41-1c7af6865afb
spec:
  ingressOptions:
    hostnameCollisionScope: Disabled
  limitRanges: {}
  networkPolicies: {}
  owners:
  - clusterRoles:
    - admin
    - capsule-namespace-deleter
    kind: ServiceAccount
    name: system:serviceaccount:ns:dummy
  resourceQuotas:
    items:
    - hard:
        count/jobs.batch: "2"
        limits.cpu: 300m
        limits.ephemeral-storage: 1Gi
        limits.memory: 1200Mi
    scope: Tenant
status:
  namespaces:
  - dummytenant-test
  size: 1
  state: Active
$: kubectl -n dummytenant-test get resourcequota
NAME                    AGE    REQUEST                 LIMIT
capsule-dummytenant-0   101s   count/jobs.batch: 0/2   limits.cpu: 0/300m, limits.ephemeral-storage: 0/1Gi, limits.memory: 0/1200Mi
I don't think this is an issue specific to one of the Tenant API versions: we have a conversion webhook, and the annotations involved are the same across versions.
I suspect you're using an old version of Capsule; you can find more details in the first log lines printed when the Capsule pod starts.
{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Capsule Version v0.5.0 74d3ac5dirty"}
{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Build from: https://github.com/projectcapsule/capsule"}
{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Build date: "}
{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Go Version: go1.20.11"}
{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Go OS/Arch: linux/amd64"}
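If it helps when checking which version is actually running, here is a small Python sketch that pulls the version out of startup log lines shaped like the ones above. The function name `capsule_version` is made up for this example, and it assumes the "Capsule Version <version> <commit>" message format shown in these logs, which may differ in other releases.

```python
import json
from typing import Iterable, Optional


def capsule_version(log_lines: Iterable[str]) -> Optional[str]:
    """Return the version token from the 'Capsule Version ...' setup line, if any."""
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        msg = str(entry.get("msg", ""))
        parts = msg.split()
        # Assumed message format: "Capsule Version <version> <commit>"
        if entry.get("logger") == "setup" and parts[:2] == ["Capsule", "Version"]:
            if len(parts) >= 3:
                return parts[2]
    return None


sample = [
    '{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Capsule Version v0.5.0 74d3ac5dirty"}',
    '{"level":"info","ts":"2024-01-23T19:48:12.764Z","logger":"setup","msg":"Go Version: go1.20.11"}',
]
print(capsule_version(sample))
```

Feed it the lines of the Capsule pod's log output; it returns None when no matching setup line is found.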
I was able to reproduce this issue with 0.3.3 in a customer environment. @adabuleanu were you able to upgrade to a newer release?