add warehouse status field showing how discovery was initiated
Can make troubleshooting easier otherwise you can conflate polling refreshes from webhook initiated ones
Will require some backend and frontend work
UI aside, can you be more specific about what you foresee attaching this metadata to?
One idea could be LastRefreshTriggeredBy where possible enum values could be webhook and polling probably makes more sense to have this as a status field.
I know what you're trying to say, but want to be more precise about terminology so we don't create confusion. A Warehouse's only behavior is artifact discovery. It happens periodically. "Refresh" is only a signal to discover immediately. The signal takes the form of an annotation. (This is a common pattern and not unique to Kargo.)
I think what you mean by "polling" is "regularly scheduled discovery." But to say this triggered a refresh is inaccurate. Refresh is a signal and no such signal was involved. So the whole issue probably needs to be reframed as "surface metadata ~~to UI~~ showing how discovery was initiated."
Broadly speaking, there are two ways discovery can be triggered:
- Regularly scheduled discovery
- A refresh (signal)
The amount of work required to record which of these two precipitated the most recent discovery is very small. However, because you mentioned "webhook" as a possible value, I assume you'd actually like to differentiate between three possibilities:
- Regularly scheduled discovery
- Discovery requested by a user
- Discovery triggered by a webhook
Distinguishing between 2 and 3 is is a little bit more work and would ultimately be unreliable. Whatever scheme we might settle on for distinguishing between the two would be too easily defeated. Anyone with adequate permissions to set the refresh annotation can just as easily set other fields, annotations, etc. to whatever values are required to spoof the source of the refresh signal.
So, I guess what it all comes down to here is exactly how fine-grained you want to be here and whether the effort required is something we can justify for whatever reasons, with that justification also outweighing the unreliability.
I'm not squashing this. I'm just enumerating considerations that need to be weighed before we decide anything.
I’m happy to work on this! Could you please assign the issue to me?
@aryasoni98 you're volunteering for an awful lot of issues lately. We appreciate your enthusiasm, but I need to respectfully request that you "pump the brakes." No one should be working on any issues until maintainers have weighed in on the validity of the issue and approved an approach.
If an issue is labeled with kind/proposal the issue is, unsurprisingly, still a proposal. i.e. There's no commitment to it. Not having a priority or milestone assigned is another sign that we haven't come to conclusions about the issue yet. If an issue is labeled with needs discussion, which is quite deliberately red to prompt caution, it's a clear signal that at least one maintainer is pumping the brakes and really wants more discussion to occur before anyone acts on the issue.
As the community becomes more and more active, both in terms of what they are proposing and in terms of volunteering to help, we are trying very hard to avoid effort wasted on things we may be unwilling to merge and we're especially trying to avoid short-sighted implementations that may force breaking changes on us when we attempt to improve/correct them at a later date. We also have a strong preference that issues we know to be very complex be handled only by maintainers.
Again, we really appreciate the enthusiasm. If you wish to contribute, I strongly recommend focusing on issues that are labeled as help wanted, good first issue, size/small, or size/tiny.
As a take-away for myself, I will add descriptions to all the labels so that the conventions I have described are more clear.
Thanks!
Discussed in today's weekly issue triage meeting. @fuskovic indicated that differentiating between last discovery having been triggered by a refresh (for any reason) vs having occurred at its usual would be sufficient.
@fuskovic we probably still need to discuss implementation details. If I recall current implementation details correctly, it's possible for the Warehouse reconciler to respond to a refresh annotation, clear it, fail at discovery, and requeue the Warehouse for another reconciliation attempt with the refresh annotation now gone.
If I'm right (might not be), we will need to account for that somehow, or else the new information we're surfacing may frequently be wrong. This might force us to clear the refresh annotation at a different point in the process than we do currently.
If I recall current implementation details correctly, it's possible for the Warehouse reconciler to respond to a refresh annotation, clear it, fail at discovery, and requeue the Warehouse for another reconciliation attempt with the refresh annotation now gone.
Flow appears to be:
-
The refresh predicate evaluates reconciliation is needed by comparing object version values : https://github.com/akuity/kargo/blob/main/pkg/controller/warehouses/warehouses.go#L208
-
Comparing the annotation value with the status; mismatches meaning discovery should occur: https://github.com/akuity/kargo/blob/main/pkg/kargo/kargo.go#L73 + https://github.com/akuity/kargo/blob/main/pkg/controller/warehouses/warehouses.go#L819
Not seeing anywhere that the annotation value ever gets cleared. In the event of a failed discovery( or any other error), we still end up setting and patching status.LastHandledRefresh with the new annotation value.
If I'm right (might not be), we will need to account for that somehow, or else the new information we're surfacing may frequently be wrong. This might force us to clear the refresh annotation at a different point in the process than we do currently.
Unless we are clearing somewhere I'm not aware of, the pattern of patching the status with the new value from the annotation ( regardless of any errors that arise during reconciliation ) still seems valid to me. If the annotation value is empty we default to regular service interval.
Updates to api.RefreshWarehouse would include accepting the enum value for the trigger as a param and calling patchAnnotations instead of patchAnnotation. I'm open to naming suggestions for things like AnnotationKeyDiscoveryRequestedBy and LastDiscoveryRequestedBy.
The amount of work required to record which of these two precipitated the most recent discovery is very small. However, because you mentioned "webhook" as a possible value, I assume you'd actually like to differentiate between three possibilities:
- Regularly scheduled discovery
- Discovery requested by a user
- Discovery triggered by a webhook
Distinguishing between 2 and 3 is is a little bit more work and would ultimately be unreliable. Whatever scheme we might settle on for distinguishing between the two would be too easily defeated. Anyone with adequate permissions to set the refresh annotation can just as easily set other fields, annotations, etc. to whatever values are required to spoof the source of the refresh signal.
I'd like to discuss this offline
@fuskovic you're right. Now that I have it in front of me, the annotations aren't ever cleared.
I think @hiddeco mentioned, in passing, some concerns about feasibility of this feature. Let's hear him out before we get start implementing anything.