
OCI Experimental Storage: job service account missing RBAC?

Open puffitos opened this issue 5 months ago • 0 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

We’ve been trying to use the experimental OCI storage feature to allow bigger bundles containing multiple versions of the same Helm chart (to enable canary deployments by modifying the fleet.yaml). This seems to address https://github.com/rancher/fleet/issues/2442, which is absolutely great.

After working through some authentication problems with our Quay registry, we’ve reached a point where this seems to be a Fleet problem.

As the logs below show, the job created by the GitRepo with the experimental OCI feature first needs to create a secret using the pod’s service account. Unfortunately, the default Role that Fleet creates for those service accounts doesn’t grant that permission (see the Logs section for the Role bound to the service account in question).

```
time="2024-09-25T08:41:28Z" level=fatal msg="secrets is forbidden: User \"system:serviceaccount:fleet-default:git-caas-cluster-monitoring\" cannot create resource \"secrets\" in API group \"\" in the namespace \"fleet-default\""
Stream closed EOF for fleet-default/caas-cluster-monitoring-46bd3-ftfx7 (fleet)
```

After granting the service account permission to create secrets, I was briefly disappointed to see the old error message again:

```
_=/usr/bin/env
time="2024-09-25T08:54:45Z" level=fatal msg="rpc error: code = ResourceExhausted desc = trying to send message larger than max (2489846 vs. 2097152)"
Stream closed EOF for fleet-default/caas-cluster-monitoring-46bd3-6tppb (fleet)
```

After that, I deleted the created Bundle and tried again, and it seems to have worked. I’m baffled as to why the error appeared the first time but not the second. That may be a separate issue: perhaps Fleet tried to update the existing Bundle, which already contained too much data?
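
For reference, the manual RBAC change described above could look roughly like the following. This is a minimal sketch, not the exact change that was applied: it adds a separate Role and RoleBinding instead of editing the Role that Fleet generates, the name `gitjob-oci-secrets` is hypothetical, and `create` is the only verb the error message confirms as missing. The service account name and namespace are taken from the log line above.

```yaml
# Sketch only: "gitjob-oci-secrets" is a hypothetical name.
# Grants the GitRepo job's service account permission to create Secrets.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gitjob-oci-secrets
  namespace: fleet-default
rules:
- apiGroups:
  - ""            # core API group, where Secrets live
  resources:
  - secrets
  verbs:
  - create        # the only verb the error message shows as missing
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gitjob-oci-secrets
  namespace: fleet-default
subjects:
- kind: ServiceAccount
  name: git-caas-cluster-monitoring   # from the "forbidden" log line
  namespace: fleet-default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gitjob-oci-secrets
```

Because RBAC rules are additive, a separate Role like this should not conflict with the generated one. Whether Fleet needs further verbs on secrets (for example `get` or `update`) is not known from the logs here.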

Expected Behavior

The feature works out of the box without having to manually change the Role.

Steps To Reproduce

  1. (optional) Deploy a GitRepo that deploys a big chart to multiple clusters (like kube-prometheus-stack)
  2. Enable the experimental OCI feature for this GitRepo and set up a repository and credentials for the OCI backend
  3. When the pod runs to create the bundle, the error message should appear, because the service account has no RBAC permission to create secrets

Environment

- Architecture: arm64
- Fleet Version: 0.10.1
- Cluster:
  - Provider: rke
  - Options: - 
  - Kubernetes Version: 1.29

Logs

<details>
<summary>Role</summary>



```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/5SSTW/UMBCG/wqas7M022TzIXFD6oEDUoW4oBzG9jjrrmNH9mQ5rPLfkVOgCLRdestk3vl45vUFJmLUyAj9BdD7wMg2+JTDIJ9IcSLeRRt2Cpkd7Wx4bzX0MFp+CjKBuCoL3z3FYjyfoAfjiPiP3LlENx+xFO8+Wa8/PFh+pDncbOZxIuhBIaZCuSUxxWIK3nKI1o//VZ5mVPRro0KTwcUxrAJUpA39i50oMU4z9H5xToBDSe7VgxwxHaGH+1JSS1iZtmmwLev7tq3bThrTdFjhQR/INF0lD3naT5LRcnGd5vq+AjaiRzIUyStK0H+7AM72K8Vkg3/l5CBAuqBOn3ODj+SIN71Bl0iACp5jcI7i718n67PhLx7ddGHZXoiSNTZ7aouuK++KqjV1Iat6XyDu5Z02jdpLA+uwCoiLeyF4iGGZc/Q3AQwCIqWwxGdekIvXuVCAnXCkpNCnLDpTlJtgpHypzVjKa836+cPZlBM6wxMMq3jj4NFypDn8O2xYh/VHAAAA//9NpKSyUwMAAA
    objectset.rio.cattle.io/id: gitjobs
    objectset.rio.cattle.io/owner-gvk: fleet.cattle.io/v1alpha1, Kind=GitRepo
    objectset.rio.cattle.io/owner-name: caas-cluster-monitoring
    objectset.rio.cattle.io/owner-namespace: fleet-default
  creationTimestamp: "2023-11-30T22:14:04Z"
  finalizers:
  - wrangler.cattle.io/auth-prov-v2-role
  labels:
    objectset.rio.cattle.io/hash: 31be8ea4f877a815388589bff79a4a6d6ef794b6
  name: git-caas-cluster-monitoring
  namespace: fleet-default
  ownerReferences:
  - apiVersion: fleet.cattle.io/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: GitRepo
    name: caas-cluster-monitoring
    uid: cb5a72e8-9910-48f5-b452-aa2b0df7c2bf
  resourceVersion: "677449923"
  uid: 5a45a1c1-6a06-40c5-893e-81f5bfbca9a0
rules:
- apiGroups:
  - fleet.cattle.io
  resources:
  - bundles
  - imagescans
  verbs:
  - get
  - create
  - update
  - list
  - delete
- apiGroups:
  - fleet.cattle.io
  resources:
  - gitrepos
  verbs:
  - get
```

</details>

<details>

<summary>GitRepo</summary>
```yaml
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"fleet.cattle.io/v1alpha1","kind":"GitRepo","metadata":{"annotations":{},"name":"caas-cluster-monitoring","namespace":"fleet-default"},"spec":{"branch":"feat/TestNewChartVersion","clientSecretName":"fleet","insecureSkipTLSVerify":false,"paths":["./"],"repo":"https://gitlab.devops.telekom.de/caas/cluster-maintenance/apps-manifests/caas-cluster-monitoring.git","targets":[{"clusterSelector":{"matchExpressions":[{"key":"management.cattle.io/cluster-display-name","operator":"In","values":["s03"]}]}},{"clusterSelector":{"matchExpressions":[{"key":"environment","operator":"In","values":["staging"]}]}}]}}
  creationTimestamp: "2023-11-30T22:14:04Z"
  finalizers:
  - fleet.cattle.io/gitrepo-finalizer
  generation: 90
  name: caas-cluster-monitoring
  namespace: fleet-default
  resourceVersion: "1035164670"
  uid: cb5a72e8-9910-48f5-b452-aa2b0df7c2bf
spec:
  branch: feat/TestNewChartVersion
  clientSecretName: fleet
  correctDrift: {}
  forceSyncGeneration: 79
  imageScanCommit:
    authorEmail: ""
    authorName: ""
  keepResources: true
  ociRegistry:
    authSecretName: caas-oci-storage-auth-secret
    reference: mtr.devops.telekom.de/caas/fleet-oci-storage
  paths:
  - ./
  repo: https://gitlab.devops.telekom.de/caas/cluster-maintenance/apps-manifests/caas-cluster-monitoring.git
  targets:
  - clusterSelector:
      matchExpressions:
      - key: management.cattle.io/cluster-display-name
        operator: In
        values:
        - s03
  - clusterSelector:
      matchExpressions:
      - key: environment
        operator: In
        values:
        - staging
status:
  commit: 4000c70f210c176dd569423f2e25623e4497811d
  conditions:
  - lastUpdateTime: "2024-09-25T08:02:46Z"
    status: "True"
    type: Ready
  - lastUpdateTime: "2024-09-11T08:35:20Z"
    status: "True"
    type: Accepted
  - lastUpdateTime: "2023-11-30T22:14:04Z"
    status: "True"
    type: ImageSynced
  - lastUpdateTime: "2024-09-25T08:15:05Z"
    status: "False"
    type: Reconciling
  - lastUpdateTime: "2024-09-25T08:41:31Z"
    message: |
      Job Failed. failed: 3/1time="2024-09-25T08:41:28Z" level=fatal msg="secrets is forbidden: User \"system:serviceaccount:fleet-default:git-caas-cluster-monitoring\" cannot create resource \"secrets\" in API group \"\" in the namespace \"fleet-default\""
    reason: Stalled
    status: "True"
    type: Stalled
  - lastUpdateTime: "2024-09-11T08:34:28Z"
    status: "True"
    type: Synced
  - lastUpdateTime: "2024-09-21T01:34:06Z"
    status: "True"
    type: GitPolling
  desiredReadyClusters: 0
  display:
    readyBundleDeployments: 0/0
    state: GitUpdating
  gitJobStatus: Failed
  lastPollingTriggered: "2024-09-25T08:51:25Z"
  observedGeneration: 90
  readyClusters: 0
  resourceCounts:
    desiredReady: 0
    missing: 0
    modified: 0
    notReady: 0
    orphaned: 0
    ready: 0
    unknown: 0
    waitApplied: 0
  summary:
    desiredReady: 0
    ready: 0
  updateGeneration: 79
```

</details>

Anything else?

No response

puffitos, Sep 25 '24 11:09