Polly
Polly copied to clipboard
Dynamic circuit break period
Initial work to allow for dynamically calculated CircuitBreaker break periods.
The issue or feature being addressed
See Issue: https://github.com/App-vNext/Polly/issues/653 ( @reisenberger )
Details on the issue fix or feature implementation
See Issue: https://github.com/App-vNext/Polly/issues/653 Note that there seem to be 2 contentious design choices outstanding; mentioned on that Issue.
w.r.t. Non-Positive inputs, I've gone with leaving static-duration behaviour unchanged, and having no checks for dynamic durations.
w.r.t. Delay durations when the response is slow to come back, I've gone with re-using the same delay, until one of the responses actually completes.
Several things still need to be done, before this is fully suitable for deployment, but they're all fairly rote changes, so I want to get the meat of the change approved before doing them, since if there are substantial markups the rote changes might need tweaks, which would be ... tedious :)
- Trivially apply all the same changes to the Factory Methods for other syntaxes (CircuitBreakerAsync, CBTReslt, CBTResultAsync)
- Figure out what (if anything?) should be done for the Advanced version of the CB. Then apply all the same boring Factory Method changes to all 4 version of that.
- Update existing tests for all of the above.
- Create new tests for the specific functionality for the duration to change dynamically (as opposed to the existing tests which all use constant-return functions. This ensures that the existing functionality is unchanged, and that the most basic use-case of the functionality works, but is somewhat lacking in depth.)
- Add general documentation (the XMLDocs are updated, but I assume there's a high-level doc somewhere that I should update too.
Confirm the following
FIXED: ~Branch state on your repo currently looks weird~
^ See above, the existing tests have been updated, and crudely exercise the new functionality. Additional tests are required, and I'll create them in due course; wanted to get the meat of the change up for someone to look at while I'm doing that.
- [x] I started this PR by branching from the head of the latest dev vX.Y branch, or I have rebased on the latest dev vX.Y branch, or I have merged the latest changes from the dev vX.Y branch
- [x] I have targeted the PR to merge into the latest dev vX.Y branch as the base branch
- [Y^] I have included unit tests for the issue/feature
- [Y] I have successfully run a local build
Note that you may well find it easier to review this commit-by-commit. The first commit is entirely mechanical copy-pasting; the ones after that are more interesting.
Thanks @Brondahl for the contribution! I am now on family/vacation time; I might get to look at this in the next couple days, but it could also be late December or January. Thanks.
👍 Enjoy your break. Thanks for the expectation-setting :)
EDIT: New status update below, as of 20/1/2020
Cool, so status update: I've applied almost all the review markups, locally.
Tasks still outstanding:
- (@reisenberger) Sign-off on general review discussions.
- (@reisenberger) Confirm desired Exception type for `durationFactory() returns < 0" review markup.
- (@Brondahl) Implement ☝️ Exception throw.
- (@reisenberger + @Brondahl) Decide on what we're going to do to the Advanced Circuit Breaker.
- (@Brondahl) All the bits of tedious TODOs from the original PR comment.
(@reisenberger + @Brondahl) Decide on what we're going to do to the Advanced Circuit Breaker.
So I see ... 3? ... options.
- Do nothing.
- Assert that the Proportion-based approach of Adv.C.B. is incompatible with dynamic Half-Open durations, and leave it as it is.
- :(
- Port identical functionality over.
- Argue that the Advancedness of the Adv.C.B is solely about the details of when you intially call
Break_NeedsLock()
in the Controller, whilst in theClosed
state, and the behaviour of theHalfOpen
state is no different. - Simples!
- Argue that the Advancedness of the Adv.C.B is solely about the details of when you intially call
- Create Advanced dynamic Breaks!?
- Try to invent something that is analogous to the proportional healthchecks of the Closed state, and apply that to the Half-Open state.
- Uhh ... sounds a bit rough.
My vote is (probably unsurprisingly) option No.2. Opinions?
@Brondahl Over to you to push all the changes so far, I think I'm done on these:
(@reisenberger) DONE Sign-off on general review discussions. (@reisenberger) DONE Confirm desired Exception type for `durationFactory() returns < 0" review markup.
(Might have small comments on ExtensionUpToMaxValue()
once pushed.)
Re:
(@reisenberger + @Brondahl) Decide on what we're going to do to the Advanced Circuit Breaker.
Great overview of options ^^ 👍 . Reflecting on this.
(@reisenberger + @Brondahl) Decide on what we're going to do to the Advanced Circuit Breaker My vote is (probably unsurprisingly) option No.2.
Let us implementation option 2 at this time. Proportion-based health-checks for the half-open state (option 3) is a larger build, and requires wider consideration of what, conceptually, makes sense.
👍 Will update from the small comments above and push. Then start the rote changes. Will probably hold off writing completely new unit tests, and discuss them with you before I start to check that I'm approaching it correctly.
@reisenberger I've committed all the tweaks to the above, and the core change for the Adv. CB. Shall I raise that as a new bnanch and new PR based off of this original branch, or just push it as a change to this branch? Which way to you prefer to view incremental changes on a PR?
just push it as a change to this branch
... is fine. Thanks!
Heya, sorry for the delay. work got a little hectic as a project drew to a close. Back on this now.
Have just pushed the aforementioned review markups and Implementation for Adv. CB.
Am working on the rote changes and similar.
@reisenberger I've completed the rote changes (adding the new options to all the syntaxes, and extending all the tests to run with static duration and also "dynamic" duration (i.e. where the Func<int, TimeSpan>
syntax is used, but it always returns the same value.
Do you want to shout when you've reviewed the previous push ☝️ , and then I'll add these on top of it.
Alternatively you might just want to review my approach to the Test file updating. I used a couple of regexes to Find'N'Replace all the changes
Make all the tests Theories
, with true
and false
values:
[Fact]
=> [Theory, InlineData(true), InlineData(false)]
(not actually as a regex, just literals.)
Make all the tests parameterised to accept those values:
( *public (void|async Task) \w*)\(\) *
=>
$1(bool timespanIsDynamic)
(and then manually fix the Dispose()
method)
Call the extensions and pass the parameter:
\.(Advanced)?CircuitBreaker(Async)?\((.*)\);
=>
.$1CircuitBreaker$2_WithConditionalTimeSpan($3, timespanIsDynamic);
(plus some whitespace fixing for multi-line invocations in AdvancedCBSpecs.cs
)
Notably, it turns out you've neatly got exactly one CircuitBreaker call per-Test, and the Regexes found the same number of hits as the Fact
-> Theory
change, so I'm reasonably confident I caught everything.
EDIT: New status update below, as of 31/1/2020
Remaining actions:
- @reisenberger to re-review original markups (currently pushed)
- @reisenberger to shout if/when you want the rote changes pushed for review.
- @Brondahl to prepare wiki-doc updates.
- @Brondahl to figure out what specific testing is needed, and write up.
- @reisenberger to review test-plan.
- @Brondahl to implement testing, and progress from there.
@Brondahl Thanks for this! I will review and get back to you (likely twds the end of the week or weekend)
Heya,
Had a proper look at the tests, and I don't think the tests for dynamic duration are too complex. As a precursor, I'm going to factor out the most common, repetitive lines of the tests, after which the dynamic duration tests will look a bit like this:
[Fact]
public void Should_open_circuit_for_appropriate_dynamic_duration_when_call_raises_exception_repeatedly_in_sequence()
{
var startTime = 1.January(2000);
var currentTime = startTime;
SystemClock.UtcNow = () => currentTime;
_breaker = Policy
.Handle<DivideByZeroException>()
.CircuitBreaker(2, (numberOfErrors) => TimeSpan.FromMinutes(numberOfErrors + 1));
ExpectFailureCall_ToResultInPassThroughError____EndingUpInState(CircuitState.Closed);
ExpectFailureCall_ToResultInPassThroughError____EndingUpInState(CircuitState.Open);
ExpectFailureCall_ToResultInCircuitBreakerError_EndingUpInState(CircuitState.Open);
// when initial duration has not passed, circuit still open
currentTime = startTime.AddMinutes(1).AddMilliseconds(-1);
ExpectState(CircuitState.Open);
// when initial duration has passed, circuit now half-open
currentTime = startTime.AddMinutes(1);
ExpectState(CircuitState.HalfOpen);
// Further error opens it back up, starting a new duration
ExpectFailureCall_ToResultInPassThroughError____EndingUpInState(CircuitState.Open);
var reopenTime = currentTime;
// after initial duration has passed, circuit is *still* open, because the new duration is longer.
currentTime = reopenTime.AddMinutes(2).AddMilliseconds(-1);
ExpectState(CircuitState.Open);
// once newer duration has passed, circuit is now half-open again.
currentTime = reopenTime.AddMinutes(2);
ExpectState(CircuitState.HalfOpen);
}
Any concerns? Any particular aspects of the functionality that you want me to ensure are being tested, which you think I might forget?
Thanks @Brondahl ! I'll aim to join thinking on specific test plans for dynamic-circuit-break-period either Wed or Thurs.
Hi @Brondahl
Proving existing behaviour unaffected
It looks like most of the existing tests on circuit-break duration are in a section marked Circuit-breaker open->half-open->open/closed tests
. There is also a separate test in the callback-delegates section about the break-duration being passed to the callback delegate: Should_call_onbreak_with_the_correct_timespan
.
For these existing tests, we would want variants checking that existing circuit-break-duration behaviour is not affected. IE If the tests do not already cover it, we should extend existing tests/add new tests confirming that when using policy configuration overloads with constant durationOfBreak, the duration of break indeed remains constant when the circuit repeatedly transitions back to open from half-open.
New tests
For any of the existing tests ^ which focused on break-duration, I guess we would want counterparts for the dynamic case.
Tests on orthogonal matters
If I've understood the suggestion correctly:
Make all the tests Theories, with true and false values: [Fact] => [Theory, InlineData(true), InlineData(false)] (not actually as a regex, just literals.)
I don't think we need to do this. Where a circuit-breaker tests is testing something entirely orthogonal to the changes you have made (that it breaks on the right exceptions; whatever), there's no need to duplicate these for both static and dynamic break durations. If we'd taken that approach/took that approach going forward for each new feature added to circuit-breaker, we'd soon end up with a 2 ^ N explosion of tests where N is the number of new features.
I don't think we need to do this. Where a circuit-breaker tests is testing something entirely orthogonal to the changes you have made (that it breaks on the right exceptions; whatever), there's no need to duplicate these for both static and dynamic break durations.
I can see that argument.
I took my cue from the fact that each different variant on the *CircuitBreaker*Syntax, does have a largely duplicated collection of tests (i.e. CircuitBreakerSpecs
, CircuitBreakTResultSpecs
, CircuitBreakerAsyncSpecs
.)
Most of the tests in those files aren't testing things that are unique to the differences from the base implementation; they're just testing "this feature also works in the context of there being a return value in the policy", etc.
Similarly, all the tests are re-run for all the .NET versions.
(Plus, I assumed you were in favour since you didn't comment on it in the initial review ;) )
I agree that the majority of that duplication is highly unlikely to excercise any new edge-cases.
For the avoidance of doubt, given that that change has been coded up for all Test Files and all tests pass, do you want me to remove those changes? Happy to do so, if you'd still like me to, given the above :)
It looks like most of the existing tests on circuit-break duration are in a section marked Circuit-breaker open->half-open->open/closed tests. [snip] For these existing tests, we would want variants checking that existing circuit-break-duration behaviour is not affected. IE If the tests do not already cover it, we should extend existing tests/add new tests confirming that when using policy configuration overloads with constant durationOfBreak, the duration of break indeed remains constant when the circuit repeatedly transitions back to open from half-open.
For any of the existing tests ^ which focused on break-duration, I guess we would want counterparts for the dynamic case.
Yup, agree with all that.
There is also a separate test in the callback-delegates section about the break-duration being passed to the callback delegate: Should_call_onbreak_with_the_correct_timespan.
Ah hah! Yes, that's the sort of thing I think I would have missed :) Good catch; thanks!
LMK about removing the duplication of existing tests, and then I'll implement the extensions and new tests.
Thanks @Brondahl ! Yes, I think we should remove the dual-condition running of existing tests modelled with [Theory, InlineData(true), InlineData(false)]
. But thanks for suggesting it; I understand the motivation :+1:
Thanks @Brondahl for everything on this :+1: . Re:
@reisenberger to re-review original markups (currently pushed)
DONE - all good.
@reisenberger to shout if/when you want the rote changes pushed for review.
READY
@reisenberger to review test-plan.
DONE - as above
Rote changes to other syntaxes pushed.
As previously the first commit (1974636a
) is duplicating the existing methods verbatim, so that the second commit (fc80d1368
) has the interesting diffs. So you might want to exclude that commit from the diff you're looking at.
Updates to tests are not currently included since I need to revert the test duplication, as above. Once that's complete I'll push those changes, but since they're independant files, I don't think that should block review of the Syntax files.
Actions:
- @reisenberger to review rote syntax changes.
- @Brondahl to revert dynamic/static test duplication.
- @Brondahl to implement test refactoring across remaining test files.
- @Brondahl to implement new tests as per test plan above.
- @Brondahl to push all above test changes
- @reisenberger to review test changes.
- @Brondahl to prepare wiki-doc updates.
@reisenberger to review rote syntax changes.
DONE :+1: All good save minor parameter-naming things prev ^ mentioned. Thanks @Brondahl for all the great attention to detail on this PR!
Heya, Sorry about the radio silence; things got busy at work and in life in general :)
I've been gradually applying the general refactoring to the tests, to commonise the repetitive bits. That's pretty slow going though. :(
So I'm going to put that down, and just apply the testing for ☝️ without that commonisation. I'll raise a separate PR for the test commonisation (Like the param renaming), since I think that's worth cleaning up, still.
Hopefully I'll be able to get the Tests up by the end of the weekend.