kapp Apply as many changes as possible with error summary at end

Apply as many changes as possible with error summary at end

Open praveenrewar opened this issue 2 years ago • 13 comments

What this PR does / why we need it:

Currently kapp errors out as soon as it detects an error while apply a resource or waiting for a resource, but we would want to apply as many changes as possible and return all the errors at end. If a change doesn't depend on a failed change then it should get applied.

Sample gist for testing: https://gist.github.com/praveenrewar/2746318c5a273d13ed26497f34755638

TODO:

[x] Add e2e test
[x] Do more manual testing

Which issue(s) this PR fixes:

Fixes #426

Does this PR introduce a user-facing change?

Additional Notes for your reviewer:

Review Checklist:

[x] Follows the developer guidelines
[x] Relevant tests are added or updated
[ ] Relevant docs in this repo added or updated
[ ] Relevant carvel.dev docs added or updated in a separate PR and there's a link to that PR
[ ] Code is at least as readable and maintainable as it was before this change

Additional documentation e.g., Proposal, usage docs, etc.:

Jun 18 '22 10:06 praveenrewar

Should there be a flag which enables disables this behaviour? What should be it's default value?

Hmm, maybe, but I am not sure why someone would want to turn it off.

How does this look with multiple failures?

It's just like you have mentioned 😁 please do take a look at the gist that I have shared, it contains an example manifest and the output that we would get while deploying it.

Jun 21 '22 20:06 praveenrewar

Hmm, maybe, but I am not sure why someone would want to turn it off.

One case I can think of is pipelines. They would probably want to error out the moment something goes south rather than wait for other resources to be created. Some packages might be interested in this I guess, but I do not see a super strong case for it. (Not sure if kapp-controller will choose to error out by default because the installation fails none the less)

It's just like you have mentioned 😁 please do take a look at the gist that I have shared, it contains an example manifest and the output that we would get while deploying it.

Oooh noice! Missed that. I noticed there was a ']' added to a test and misunderstood 🤔

Jun 22 '22 04:06 100mik

They would probably want to error out the moment something goes south rather than wait for other resources to be created.

I don't know how that would be useful, but maybe it would be useful in some scenarios that I am not able to think at the moment, so I am keeping this discussion open for now, we can discuss more later.

I noticed there was a ']' added to a test and misunderstood

Yeah, I am still trying to figure out how to best use NewSemiStructuredError. Right now it formats properly when there are more errors (>1), but if there is just one error then it keeps the brackets. I will look more into it.

Jun 22 '22 07:06 praveenrewar

we should probably have a flag that fails on first failure (current behaviour) so that folks can still "quickly" exit if that's what they want.
how does this feature play with timeout errors? should we be erroring on timeout errors immediately?

Jul 05 '22 18:07 cppforlife

we should probably have a flag that fails on first failure (current behaviour) so that folks can still "quickly" exit if that's what they want.

Yeah, that's something @100mik had suggested as well. I had kept this discussion open till now, I will add a flag to disable the behaviour.

how does this feature play with timeout errors?

Right now, we error out for --wait-timeout but not for --wait-resource-timeout.

should we be erroring on timeout errors immediately?

I was thinking that individual resource timeouts can be treated as resource errors in this case, as one resource getting timed out doesn’t mean that others (which don't depend on it) will also get timed out.

Jul 05 '22 18:07 praveenrewar

The changes themselves LGTM 🤔

Re: Error structuring I think we are:

Retaining how we used to show errors for one error by using the len(unsuccessfulChanges) == 1 check
However multiple errors won't have the formatting due to capitalisation?

Jul 25 '22 21:07 100mik

Suggestion: I think it would be good if you can add little more description like: how the new output will look like? You can add some example output.

Jul 26 '22 11:07 kumaritanushree

Retaining how we used to show errors for one error by using the len(unsuccessfulChanges) == 1 check

That's correct.

However multiple errors won't have the formatting due to capitalisation?

That doesn't happen in NewSemistructuredError.

Jul 26 '22 11:07 praveenrewar

Suggestion: I think it would be good if you can add little more description like: how the new output will look like? You can add some example output.

I think the test cases and the gist that I have provided should be a good indication of the output, but do let me know if I should add more examples.

Jul 26 '22 11:07 praveenrewar

So I am guessing multiple errors will be unformatted 🤔 Also, I m not entirely sure if the way we format a single message will help in case of multiple messages. But I think we should explore if we have any options better than having it in one line.

That said I do not think it is a major blocker but more of a nice to have thing imo 🚀

Jul 26 '22 11:07 100mik

wouldn't it be better to show 2 Resource Succeded before showing errors or final summary at the end on number of resources succeeded and failed.

Hmm, definitely a good thought. The reason I didn't include more details around successful changes at the end is because we already include enough information in the logs (ok: reconcile...). I also was trying to keep the ui changes to the minimum, but I am open to creating a separate PR if we want to enhance the experience a bit further later on.

Jul 27 '22 17:07 praveenrewar

I think I am satisfied with how the UI changes look as well, I cannot think of how we can make a list of errors look better right away. But these cosmetic improvements can always go as a separate PR if we have better ideas.

Aug 01 '22 08:08 100mik

Also, @praveenrewar looks like this branch needs a rebase. I will take one final look at this once it is done!

Aug 01 '22 08:08 100mik

kapp kapp copied to clipboard

Apply as many changes as possible with error summary at end

What this PR does / why we need it:

Which issue(s) this PR fixes:

Does this PR introduce a user-facing change?

Additional Notes for your reviewer:

Review Checklist:

Additional documentation e.g., Proposal, usage docs, etc.:

kapp
kapp copied to clipboard