[GEP-713] How to reflect policy conflicts in Status Conditions?
I'm wondering how to best report policy conflicts using status conditions according to GEP-713.
Question 1: The Target object status section of GEP-713 mentions "status Enforced or PartiallyEnforced", but "PartiallyEnforced" does not appear elsewhere in the GEP, and "enforced" is a common adjective (typed without backticks and capitalization) everywhere else. Is the mention of Enforced and PartiallyEnforced an omission? Or should they be fully described as status conditions alongside Accepted and Programmed? In an old version @guicassolato's version of the GEP, both Enforced and PartiallyEnforced are valid status conditions alongside Accepted (and Overridden).
Question 2: Let's assume Enforced or PartiallyEnforced should not exist. I'd like to understand the intended usages for
Accepted: ConflictedversusProgrammed: PartiallyProgrammed.
2.1. I see that Conflicted indicates that the entire policy is rejected, while PartiallyProgrammed indicates that only parts of it don't take effect. But why do these indications of entirety and partialness appear at different levels (Accepted vs Programmed)? Shouldn't we also have Accepted: PartiallyConflicted or Programmed: FullySuperseded?
2.2 I suppose one intended use case for PartiallyProgrammed is the following: Gateway-level policy sets color: red, opacity: 50, while a Route-level policy sets color: blue (I assume a linear Gateway -> HTTPRoute -> Service topology). Then, the effective policy (assuming the Patch overrides merge strategy) is color: blue, opacity: 50 because color from the more specific Route-level policy takes precedence. In this case, the Gateway-level policy is PartiallyProgrammed because opacity takes effect but color does not.
2.3 Following up on the above example. What if there is another Backend-level policy that sets opacity: 25? I assume the effective policy should then be color: blue, opacity: 25, which means that the entire Gateway-level policy is superseded by other policies. Should it still be reported as PartiallyProgrammed (or rather FullySuperseded)? Personally, I don't see much value in controllers monitoring and reporting whether already the entire policy is superseded or there are still some parts that are effective (assuming that all overrides are mentioned in the Status Condition Message), but want to make sure.
2.4. What's the intended use case for Conflicted?
Hi @DamianSawicki. Thanks for reporting the issue!
Question 1
You are right. Enforced and PartiallyEnforced are leftovers from a previous (not merged) version of the GEP. Instead, wherever those appear, like this, in backticks, one should read Programmed and PartiallyProgrammed, respectively. I believe that section you pointed is the only case.
Question 2
2.1 – Acceptance and programming are different phases of reconciling a policy resource typically. Acceptance refers to validations performed by the controller, such as if the target exists and if the policy is semantically valid, or in this case if the policy resource should even exist depending on whether it conflicts to another given the implemented and supported merge strategies. Programming is about configuring the internal resources that are needed for the enforcement of the policy; i.e., the policy is valid and semantically correct, however it may or may not be fully programmed for enforcement depending on the state of other policies (and their merge strategies).
In 2.2, you exemplified the use case for PartiallyProgrammed very well. The policy is accepted, however it should not be expected to be fully programmed due to parts of it being overridden by other policies.
In 2.3, the Gateway-level policy should report Accepted: True, Programmed: False (Overridden).
The main value of PartiallyProgrammed between Programmed and Overridden IMO is to hint users on a status of a policy resource which typically requires more explanation. I.e. what parts of the policy have been overridden specifically. It's a condition that in a scenario of only atomic merges supported could otherwise be solved simply by pointing to another policy resource in the status message, but in this case the user will want to dig a little deeper.
2.4 – Conflicted is reserved for policy kinds that only implement the None merge strategy. It is a final state condition that belongs to the acceptance phase, and it's a way for a controller to signal that the (conflicting) policy resource will not be programmed nor partially programmed due to not even being accepted – not because it was overridden (which is a property of policies that can be merged to others), but because the policy cannot be merged. That's what I meant by "another policy that targets the same resource and a merge is not possible." I could have been more clear in the writing there.
Thanks for such a quick and thorough reply! Ok, so what I called FullySuperseded above actually exists and is called Overridden, my bad.
Let me summarize then:
- The
Nonestrategy is very special. Conflict resolution in this case is not called merging (despiteNonebeing a merge strategy; we may need to work on the wording here). The winning policy hasAccepted: Trueand all losing policies haveAccepted: False (Conflicted). - For all other merge strategies, a conflict does not suffice to turn
AcceptedtoFalse, and we should use theProgrammedcondition to inform about the conflict.- For atomic strategies,
PartiallyProgrammeddoes not apply, it's eitherProgrammedorOverridden. - For patch strategies,
Programmed: True (PartiallyProgrammed)should be used in case 2.2 from Issue description andProgrammed: False (Overridden)should be used in case 2.3.
- For atomic strategies,
Is this understanding correct?
Personally, for patch strategies I find the difference between 2.2 and 2.3 very small, and imho it should not justify the seemingly large difference between Programmed: True and Programmed: False. It also requires extra scrutiny from the controller to monitor whether a policy is still 99% superseded (Programmed: True (PartiallyProgrammed)) or already 100% superseded (Programmed: False (Overridden)). I'd prefer to use something like AtomicallyReplaced for atomic strategies and (Partially)Overridden for patch strategies (applicable to both 99% and 100% superseded cases).
I was chatting with @DamianSawicki separately from this and recommended that we should add two conditions:
Overridden: The policy has been overridden in some way (inclusive of partially and fully) - reasons would be more descriptive.Overriding: The policy has overridden other policie(s) in some way (inclusive of partially and fully) - reasons would be more descriptive.
IMO, this feels a bit simpler to follow, but open to feedback. I'll add this to the agenda for today's community meeting.
cc @youngnick @rikatz
Today's meeting is at a Europe unfriendly time, so thanks for offering to follow up, Rob!
To give some rationale: If you think of Accepted: False, Reason: Conflicted as a kind of an error message on the superseded policy, it would be good to also have a symmetric warning message on the superseding policy. Apart from use cases from GEP-713, a use case I have in mind is a user creating an instance of a new policy type conflicting with a long-existing-and-forgotten instance of another soft-deprecated policy type. Here's where Overriding would be very useful.
After discussing today in the meeting, and some thought, I agree with @robscott about Overridden and Overriding, with Reasons used as the way to indicate if that override is partial, complete, atomic, etc.
After discussing today in the meeting, and some thought, I agree with @robscott about Overridden and Overriding, with Reasons used as the way to indicate if that override is partial, complete, atomic, etc.
From this message and reactions on it, it seems we're reaching consensus here, which is awesome! I volunteer to draft a PR for these changes.