[Feature request]: Resilience Event occured logs should be configurable as Warning unless it's the final attempt that failed
Is your feature request related to a specific problem? Or an existing feature?
Polly reports Resilience Event occurred .. logs as Error, even when it's the first attempt that failed and multiple attempts will still be made.
This makes the log show up as an error, possibly setting off alerts even though it isn't really an error until the retry mechanisms have truly failed.
Describe the solution you'd like
I would like to be able to configure failed retries to get logged as Warning. When the final attempt fails, that could then be logged as an error.
This means that I will only see true errors (that might need investigation) in my error screen.
Additional context
No response
Have you tried doing this with the changes from #2072 that are in Polly 8.4.0?
I seem to have overlooked that one. I just had a look, but it seems that you can only set the severity based on the (original) Severity and the EventName. I do not think that is enough to implement this feature request, as I would also need something like IsFinalAttempt or Attempt and MaxAttempts.
Also, I'm using Microsoft.Extensions.Http.Resilience instead of a direct dependency on Polly. I do not see a way to do access the TelemetryOptions in the same way when using that package. However, that's an issue related to that package and not Polly directly.
@martincostello
I am thinking that this might be even good default behavior for retry strategy? Individual attempts are warnings while the final one is error?
I guess that's a reasonable if we have access to the context to know if it's the last attempt.
This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made.
Any update on this? Would it be feasible and a useful addition?
/Cc @martincostello @martintmk
No update from us. Feel free to open a pull request to propose this change, then we can take a look.
I'll be off in the next two weeks, but I can start work on this request on the 16th of September week if that's not too late.
As I can see no one has jump on this request in my absence so, @martincostello please assign this ticket to me.
The current state
OnRetry
The OnRetry telemetry events are always reported as Warning:
https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Retry/RetryResilienceStrategy.cs#L81
- These are reported after a failed and handled attempt but before the next attempt.
- That means this event will not occur after the final attempt.
- Also please bear in mind these events start with the
Resilience Event occurred ...message
ExecutionAttempt
The ExecutionAttempt telemetry events are either reported as Warning or as Information whether the outcome is handled or not (respectively):
https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Retry/RetryResilienceStrategy.cs#L60
https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Telemetry/TelemetryUtil.cs#L27
- These are reported just before the next delay is calculated.
- More precisely before the strategy decides whether it should have a new attempt or not.
- So, this event does occur after the final attempt.
- These events start with the
Execution attempt. Source:...message
The desired state
So, what we could do is to change the ExecutionAttempt event's severity to Error if we run out of attempts. In other words, if the execution is cancelled or the final attempt's outcome is not handled then the severity will be either Warning or Information (respectively).
Basically changing this logic
https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Retry/RetryResilienceStrategy.cs#L60-L65
to something like this:
var isLastAttempt = IsLastAttempt(attempt, out bool incrementAttempts);
if (isLastAttempt && handle)
{
TelemetryUtil.ReportFinalExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
}
else
{
TelemetryUtil.ReportExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
}
if (context.CancellationToken.IsCancellationRequested || isLastAttempt || !handle)
{
return outcome;
}
@sander1095 Is this what you are looking for?
Sounds good, @peter-csala . I am pretty sure I did see resilience logged as error in non-error cases, but that might be related to Microsoft.Extensions.Http.Resilience and not Polly directly