RPC error: ResourceExhausted: trying to send message larger than max 2097152

Open moriahpopo opened this issue 1 year ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

I'm trying to deploy my Helm chart using a fleet.yaml file. The gitJob pod fails with this error: level=fatal msg="rpc error: code = ResourceExhausted desc = trying to send message larger than max (2487345 vs. 2097152)"
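To make the numbers in that error concrete: 2097152 bytes is exactly 2 MiB, and the rejected message is 390193 bytes over it. The following is a minimal sketch that extracts both figures from such a log line. The regular expression and the function name `parse_overshoot` are illustrative assumptions, not part of Fleet or gRPC.

```python
import re

# Assumption: the log line contains the "(actual vs. limit)" pair in this form,
# as it does in the error quoted above.
GRPC_LIMIT_PATTERN = re.compile(r"larger than max \((\d+) vs\. (\d+)\)")


def parse_overshoot(message: str):
    """Return (actual, limit, excess) in bytes, or None if the message does not match."""
    match = GRPC_LIMIT_PATTERN.search(message)
    if match is None:
        return None
    actual, limit = int(match.group(1)), int(match.group(2))
    return actual, limit, actual - limit


msg = (
    "rpc error: code = ResourceExhausted desc = "
    "trying to send message larger than max (2487345 vs. 2097152)"
)
actual, limit, excess = parse_overshoot(msg)
print(f"limit  = {limit} bytes ({limit / 2**20:.2f} MiB)")
print(f"actual = {actual} bytes ({actual / 2**20:.2f} MiB)")
print(f"excess = {excess} bytes ({excess / limit:.1%} over the limit)")
```

The size of the overshoot gives a rough idea of how much content would have to be removed or excluded before the message fits.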

Expected Behavior

Fleet pulls the Helm chart and deploys it to the target clusters.

Steps To Reproduce

  1. Install Rancher and Fleet.
  2. Add a Git repository.
  3. Add a fleet.yaml file to the repository that points to a Helm chart through an "oci" URL.

Environment

- Architecture: x86_64 GNU/Linux
- Fleet Version: v0.9.4
- Cluster: (The issue occurs before the binding to the cluster, so I believe these details are less important)
  - Provider: EKS
  - Options: 14 nodes, storageclasses and loadbalancer are defined
  - Kubernetes Version: 1.29

Logs

fleet.yaml

namespace: default
helm:
  chart: "oci://registry.XXX.com/charts/my-chart"
  version: "7.2.152-09-05-2024-21719"
  force: false
  timeoutSeconds: 60
  valuesFiles:
    - custom_values.yaml
  maxUnavailable: 15%
  maxUnavailablePartitions: 20%
  autoPartitionSize: 10%
  partitions:
    - name: canary
      maxUnavailable: 10%
      clusterSelector:
        matchLabels:
          env: dev
      clusterGroup: dev-group
      clusterGroupSelector:
        clusterSelector:
          matchLabels:
            env: dev
<details>
<summary>Click to expand</summary>
$ kubectl describe gitrepo repo-rancher -n fleet-default
Name:         repo-rancher
Namespace:    fleet-default
Labels:       <none>
Annotations:  <none>
API Version:  fleet.cattle.io/v1alpha1
Kind:         GitRepo
Metadata:
  Creation Timestamp:  2024-05-16T07:52:13Z
  Generation:          1
  Managed Fields:
    API Version:  fleet.cattle.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:branch:
        f:clientSecretName:
        f:repo:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2024-05-16T07:52:13Z
    API Version:  fleet.cattle.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:commit:
        f:conditions:
        f:desiredReadyClusters:
        f:display:
          .:
          f:readyBundleDeployments:
        f:gitJobStatus:
        f:lastSyncedImageScanTime:
        f:observedGeneration:
        f:readyClusters:
        f:resourceCounts:
          .:
          f:desiredReady:
          f:missing:
          f:modified:
          f:notReady:
          f:orphaned:
          f:ready:
          f:unknown:
          f:waitApplied:
        f:summary:
          .:
          f:desiredReady:
          f:ready:
    Manager:         fleetcontroller
    Operation:       Update
    Time:            2024-05-20T07:51:23Z
  Resource Version:  11172606
  UID:               e37f7dbe-5087-487b-a508-bd4b4f5e1995
Spec:
  Branch:              my-branch
  Client Secret Name:  ssh-key-for-rancher
  Repo:                [email protected]:XXX/rancher.git
Status:
  Commit:  f50ecc2f785a67549eb03c52a15d1f787008e5cf
  Conditions:
    Last Update Time:      2024-05-16T07:52:13Z
    Status:                True
    Type:                  Ready
    Last Update Time:      2024-05-20T07:52:03Z
    Status:                True
    Type:                  Accepted
    Last Update Time:      2024-05-16T07:52:13Z
    Status:                True
    Type:                  ImageSynced
    Last Update Time:      2024-05-20T07:51:23Z
    Status:                False
    Type:                  Reconciling
    Last Update Time:      2024-05-20T07:51:23Z
    Status:                False
    Type:                  Stalled
    Last Update Time:      2024-05-20T07:52:03Z
    Status:                True
    Type:                  Synced
  Desired Ready Clusters:  0
  Display:
    Ready Bundle Deployments:   0/0
  Git Job Status:               Current
  Last Synced Image Scan Time:  <nil>
  Observed Generation:          1
  Ready Clusters:               0
  Resource Counts:
    Desired Ready:  0
    Missing:        0
    Modified:       0
    Not Ready:      0
    Orphaned:       0
    Ready:          0
    Unknown:        0
    Wait Applied:   0
  Summary:
    Desired Ready:  0
    Ready:          0
Events:             <none>
</details>

<details>
<summary>Click to expand</summary>
kubectl get gitrepo -A -o jsonpath='{.items[*].status}'
{"commit":"f50ecc2f785a67549eb03c52a15d1f787008e5cf","conditions":[{"lastUpdateTime":"2024-05-16T07:52:13Z","status":"True","type":"Ready"},{"lastUpdateTime":"2024-05-20T07:54:28Z","status":"True","type":"Accepted"},{"lastUpdateTime":"2024-05-16T07:52:13Z","status":"True","type":"ImageSynced"},{"lastUpdateTime":"2024-05-20T07:54:26Z","status":"False","type":"Reconciling"},{"lastUpdateTime":"2024-05-20T07:54:26Z","status":"False","type":"Stalled"},{"lastUpdateTime":"2024-05-20T07:54:28Z","status":"True","type":"Synced"}],"desiredReadyClusters":0,"display":{"readyBundleDeployments":"0/0"},"gitJobStatus":"Current","lastSyncedImageScanTime":null,"observedGeneration":1,"readyClusters":0,"resourceCounts":{"desiredReady":0,"missing":0,"modified":0,"notReady":0,"orphaned":0,"ready":0,"unknown":0,"waitApplied":0},"summary":{"desiredReady":0,"ready":0}} {"commit":"f50ecc2f785a67549eb03c52a15d1f787008e5cf","conditions":[{"lastUpdateTime":"2024-05-16T15:41:19Z","status":"True","type":"Ready"},{"lastUpdateTime":"2024-05-20T07:54:14Z","status":"True","type":"Accepted"},{"lastUpdateTime":"2024-05-16T15:41:19Z","status":"True","type":"ImageSynced"},{"lastUpdateTime":"2024-05-20T07:54:09Z","status":"False","type":"Reconciling"},{"lastUpdateTime":"2024-05-20T07:54:09Z","status":"False","type":"Stalled"},{"lastUpdateTime":"2024-05-20T07:54:14Z","status":"True","type":"Synced"}],"desiredReadyClusters":0,"display":{"readyBundleDeployments":"0/0"},"gitJobStatus":"Current","lastSyncedImageScanTime":null,"observedGeneration":5,"readyClusters":0,"resourceCounts":{"desiredReady":0,"missing":0,"modified":0,"notReady":0,"orphaned":0,"ready":0,"unknown":0,"waitApplied":0},"summary":{"desiredReady":0,"ready":0}}
</details>

<details>
<summary>Click to expand</summary>
kubectl logs repo-rancher-42419-dxrr8 -n fleet-default
HOSTNAME=repo-rancher-42419-dxrr8
KUBERNETES_PORT_443_TCP_PROTO=tcp
COMMIT=f50ecc2f785a67549eb03c52a15d1f787008e5cf
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_PORT=tcp://10.96.0.1:443
PWD=/workspace/source
HOME=/fleet-home
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
SHLVL=1
KUBERNETES_SERVICE_PORT=443
EVENT_TYPE=
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_SERVICE_HOST=10.96.0.1
_=/usr/bin/env
time="2024-05-20T07:58:14Z" level=fatal msg="rpc error: code = ResourceExhausted desc = trying to send message larger than max (2487259 vs. 2097152)"
</details>

Anything else?

Deploying this chart directly with Helm (v3.9.0) works well, and gRPC is not an issue there.

Let me know if more logs can help.

Thank you

moriahpopo, May 20 '24 08:05

@moriahpopo maybe this will help you, in case you're deploying multiple versions of your chart to multiple clusters. Fleet seems to bundle all files needed from the chart in the Bundle object, which can make it way too big for the k8s API to handle. Please also check that you don't have any binaries or sizeable files in your repository, as they are unfortunately also added to the bundle.

If you do, you could try excluding them using a .fleetignore file. Maybe that will do the trick ;)
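To find candidates for exclusion, a minimal sketch like the one below lists the largest files in a checkout of the repository. The function name `largest_files`, the default of ten results, and skipping the `.git` directory are assumptions made for illustration. The on-disk sizes are only a rough indicator, since the bundle Fleet builds may encode or compress content differently.

```python
import os


def largest_files(root: str, top_n: int = 10, skip_dirs=(".git",)):
    """Return up to top_n (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # ignore symlinks; only count regular files
            sizes.append((os.path.getsize(path), path))
    sizes.sort(reverse=True)
    return sizes[:top_n]


if __name__ == "__main__":
    # Run from the root of the Git repository that Fleet watches.
    for size, path in largest_files("."):
        print(f"{size:>12}  {path}")
```

Anything large that the deployment does not need is a candidate for the .fleetignore file.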

puffitos, May 23 '24 14:05

@puffitos Thanks for your comment. This is not the issue here. I don't have metadata or other files that I don't need as part of the deployment. In addition, I get this error when deploying a single version to a single cluster.

moriahpopo, May 26 '24 12:05

The chart is probably too big. You could check by creating the bundle manually with fleet apply and then applying it: https://fleet.rancher.io/bundle-add#convert-a-helm-chart-into-a-bundle

In the future, we want to support OCI as a storage backend, so charts could be bigger.

However, we should improve the error message in that case.

manno, May 29 '24 13:05

Thanks @manno, I'll try.

moriahpopo, Jun 02 '24 08:06

No further comments.

Please check with the latest version and reopen if the problem still happens.

kkaempf, Feb 12 '25 14:02