Shoot resource status.lastOperation.description can get too large (easily reach resource size limit)
What happened:
We have seen (e.g. when people fail to update their credentials) that the same error message is repeated in status.lastOperation.description, e.g. once per node (observed by @mliepold and @kon-angelo). This drives the size of the Shoot resource beyond the permitted limit and breaks further automated updates that do not strip the status first.
What you expected to happen: It seems reasonable to prevent status updates from blowing up the resource in this way: either do not repeat the same message (which may or may not be easy to achieve), or at least truncate the field if it gets "too" large. @kon-angelo suggested:
It should be handled for all extensions; most of their code resides in the form of libraries in gardener/gardener.
How to reproduce it (as minimally and precisely as possible): Please contact the above-mentioned colleagues.
I added the label enhancement rather than bug, but to the affected people this behaviour may feel more like a bug, because they (a) rarely know what the error message means and (b) don't know what to do about it. This is what we have observed: they then start complaining and asking for help.
@timuthy Shouldn't we rather not consider the .status of a Shoot when users change a resource? They cannot modify the .status section anyway, so we shouldn't bother them when they try to change something else just because a controller "polluted" the .status part?
Yep, makes sense 👍
We could combine this enhancement with a relaxed validation for last-applied-configuration and managed-fields fields, as discussed a while ago.
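The idea from the exchange above, excluding the controller-owned .status section when checking the size of a user's update, could be sketched like this. The function name and field handling are illustrative only; they are not Gardener's actual validation code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// userVisibleSize returns the serialized size of an object with its
// status section removed, so that controller-written status does not
// count against a size check applied to user updates. The name and the
// map-based representation are assumptions for this sketch.
func userVisibleSize(obj map[string]interface{}) (int, error) {
	stripped := make(map[string]interface{}, len(obj))
	for k, v := range obj {
		if k == "status" {
			continue // ignore the controller-owned section
		}
		stripped[k] = v
	}
	b, err := json.Marshal(stripped)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

func main() {
	shoot := map[string]interface{}{
		"spec":   map[string]interface{}{"region": "eu-west-1"},
		"status": map[string]interface{}{"lastOperation": map[string]interface{}{"description": "very long repeated errors"}},
	}
	size, _ := userVisibleSize(shoot)
	fmt.Println("size without status:", size)
}
```

With a check like this, a bloated status would no longer block users from editing the spec, which addresses the "polluted .status" concern raised above.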
The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Mark this issue as rotten with `/lifecycle rotten`
- Close this issue with `/close`
/lifecycle stale
The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
/lifecycle rotten
The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
/close
@gardener-ci-robot: Closing this issue.
In response to this:
> The Gardener project currently lacks enough active contributors to adequately respond to all issues. This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
>
> `/close`
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@LucaBernstein would you like to take this?
Sure @timuthy, I can have a look. Thanks!
/assign
/remove-lifecycle rotten