Shoot resource status.lastOperation.description can get too large (easily reach resource size limit)

Open vlerenc opened this issue 2 years ago • 4 comments

What happened: We have seen, e.g. when people fail to update their credentials, that the same error message is repeated in status.lastOperation.description, e.g. once for each node (observed by @mliepold and @kon-angelo). This drives the size of the shoot resource beyond the permitted limit and breaks further automated updates that do not strip the status first.

What you expected to happen: It seems reasonable to prevent status updates from blowing up the resource in this way, and either not repeat the same message (which may or may not be easy to achieve) or at least truncate the field if it gets "too" large. @kon-angelo suggested:

It should be handled for all extensions, and most of their code resides in the form of libraries in gardener/gardener.

How to reproduce it (as minimally and precisely as possible): Please contact the above-mentioned colleagues.

vlerenc avatar Jan 13 '24 15:01 vlerenc

I added the label enhancement rather than bug, but this behaviour may feel more like a bug to the affected people, because they a.) rarely know what the error message means and b.) don't know what to do about it. This is what we have observed. They then start complaining and asking for help.

vlerenc avatar Jan 13 '24 15:01 vlerenc

@timuthy Shouldn't we rather ignore the .status of a Shoot when users change a resource? They cannot modify the .status section anyway, so we shouldn't bother them when they try to change something else just because a controller "polluted" the .status part.

rfranzke avatar Jan 19 '24 12:01 rfranzke

Yep, makes sense 👍

timuthy avatar Jan 19 '24 12:01 timuthy

We could combine this enhancement with a relaxed validation for the last-applied-configuration and managed-fields fields, as discussed a while ago.

timuthy avatar Jan 19 '24 12:01 timuthy

The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Mark this issue as rotten with /lifecycle rotten
  • Close this issue with /close

/lifecycle stale

gardener-ci-robot avatar Apr 18 '24 13:04 gardener-ci-robot

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close

/lifecycle rotten

gardener-ci-robot avatar May 18 '24 13:05 gardener-ci-robot

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten

/close

gardener-ci-robot avatar Jun 17 '24 14:06 gardener-ci-robot

@gardener-ci-robot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

gardener-prow[bot] avatar Jun 17 '24 14:06 gardener-prow[bot]

@LucaBernstein would you like to take this?

timuthy avatar Jun 17 '24 19:06 timuthy

Sure @timuthy, I can have a look. Thanks!

/assign

LucaBernstein avatar Jun 18 '24 08:06 LucaBernstein

/remove-lifecycle rotten

oliver-goetz avatar Jun 18 '24 15:06 oliver-goetz