Shoot resource status.lastOperation.description can get too large (easily reach resource size limit)
What happened:
We have seen (e.g. when people fail to update their credentials) that the same error message is repeated in status.lastOperation.description, e.g. once per node (observed by @mliepold and @kon-angelo). This drives the size of the Shoot resource beyond the permitted limit and breaks further automated updates that do not strip the status first.
What you expected to happen: It seems reasonable to prevent status updates from blowing up the resource in this way: either do not repeat the same message (which may or may not be easy to achieve), or at least truncate the field if it gets "too" large. @kon-angelo suggested:
It should be handled for all extensions; most of their code resides in the form of libraries in gardener/gardener.
How to reproduce it (as minimally and precisely as possible): Please contact the above-mentioned colleagues.
I added the label enhancement rather than bug, but to the affected people this behaviour may feel more like a bug, because they (a) rarely know what the error message means and (b) don't know what to do about it. This is what we have observed: they then start complaining and asking for help.
@timuthy Shouldn't we rather not consider the .status of a Shoot when users change a resource? They cannot modify the .status section anyway, so we shouldn't bother them when they try to change something else just because a controller "polluted" the .status part?
Yep, makes sense 👍
We could combine this enhancement with a relaxed validation for last-applied-configuration and managed-fields fields, as discussed a while ago.
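The idea from the exchange above, excluding the controller-owned .status section when checking the size of a user's update, could be sketched like this. The function name and field handling are illustrative only; they are not Gardener's actual validation code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// userVisibleSize returns the serialized size of an object with its
// status section removed, so that controller-written status does not
// count against a size check applied to user updates. The name and the
// map-based representation are assumptions for this sketch.
func userVisibleSize(obj map[string]interface{}) (int, error) {
	stripped := make(map[string]interface{}, len(obj))
	for k, v := range obj {
		if k == "status" {
			continue // ignore the controller-owned section
		}
		stripped[k] = v
	}
	b, err := json.Marshal(stripped)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

func main() {
	shoot := map[string]interface{}{
		"spec":   map[string]interface{}{"region": "eu-west-1"},
		"status": map[string]interface{}{"lastOperation": map[string]interface{}{"description": "very long repeated errors"}},
	}
	size, _ := userVisibleSize(shoot)
	fmt.Println("size without status:", size)
}
```

With a check like this, a bloated status would no longer block users from editing the spec, which addresses the "polluted .status" concern raised above.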
The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Mark this issue as rotten with `/lifecycle rotten`
- Close this issue with `/close`
/lifecycle stale
The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
/lifecycle rotten
The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
/close
@gardener-ci-robot: Closing this issue.
In response to this:
> The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
>
> `/close`
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@LucaBernstein would you like to take this?
Sure @timuthy, I can have a look. Thanks!
/assign
/remove-lifecycle rotten