Polly icon indicating copy to clipboard operation
Polly copied to clipboard

[Feature request]: Resilience Event occured logs should be configurable as Warning unless it's the final attempt that failed

Open sander1095 opened this issue 1 year ago • 5 comments

Is your feature request related to a specific problem? Or an existing feature?

Polly reports Resilience Event occurred .. logs as Error, even when it's the first attempt that failed and multiple attempts will still be made.

This makes the log show up as an error, possibly setting off alerts even though it isn't really an error until the retry mechanisms have truly failed.

Describe the solution you'd like

I would like to be able to configure failed retries to get logged as Warning. When the final attempt fails, that could then be logged as an error.

This means that I will only see true errors (that might need investigation) in my error screen.

Additional context

No response

sander1095 avatar Jun 24 '24 06:06 sander1095

Have you tried doing this with the changes from #2072 that are in Polly 8.4.0?

martincostello avatar Jun 24 '24 06:06 martincostello

I seem to have overlooked that one. I just had a look, but it seems that you can only set the severity based on the (original) Severity and the EventName. I do not think that is enough to implement this feature request, as I would also need something like IsFinalAttempt or Attempt and MaxAttempts.

Also, I'm using Microsoft.Extensions.Http.Resilience instead of a direct dependency on Polly. I do not see a way to do access the TelemetryOptions in the same way when using that package. However, that's an issue related to that package and not Polly directly.

sander1095 avatar Jun 24 '24 07:06 sander1095

@martincostello

I am thinking that this might be even good default behavior for retry strategy? Individual attempts are warnings while the final one is error?

martintmk avatar Jun 28 '24 15:06 martintmk

I guess that's a reasonable if we have access to the context to know if it's the last attempt.

martincostello avatar Jun 28 '24 15:06 martincostello

This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made.

github-actions[bot] avatar Aug 28 '24 01:08 github-actions[bot]

Any update on this? Would it be feasible and a useful addition?

/Cc @martincostello @martintmk

sander1095 avatar Aug 30 '24 10:08 sander1095

No update from us. Feel free to open a pull request to propose this change, then we can take a look.

martincostello avatar Aug 30 '24 10:08 martincostello

I'll be off in the next two weeks, but I can start work on this request on the 16th of September week if that's not too late.

peter-csala avatar Aug 30 '24 16:08 peter-csala

As I can see no one has jump on this request in my absence so, @martincostello please assign this ticket to me.

peter-csala avatar Sep 16 '24 13:09 peter-csala

The current state

OnRetry

The OnRetry telemetry events are always reported as Warning:

https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Retry/RetryResilienceStrategy.cs#L81

  • These are reported after a failed and handled attempt but before the next attempt.
    • That means this event will not occur after the final attempt.
  • Also please bear in mind these events start with the Resilience Event occurred ... message

ExecutionAttempt

The ExecutionAttempt telemetry events are either reported as Warning or as Information whether the outcome is handled or not (respectively):

https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Retry/RetryResilienceStrategy.cs#L60

https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Telemetry/TelemetryUtil.cs#L27

  • These are reported just before the next delay is calculated.
    • More precisely before the strategy decides whether it should have a new attempt or not.
    • So, this event does occur after the final attempt.
  • These events start with the Execution attempt. Source:... message

The desired state

So, what we could do is to change the ExecutionAttempt event's severity to Error if we run out of attempts. In other words, if the execution is cancelled or the final attempt's outcome is not handled then the severity will be either Warning or Information (respectively).

Basically changing this logic

https://github.com/App-vNext/Polly/blob/957444f8a3ce923b298e9dee1ec653447ec4091d/src/Polly.Core/Retry/RetryResilienceStrategy.cs#L60-L65

to something like this:

var isLastAttempt = IsLastAttempt(attempt, out bool incrementAttempts);
if (isLastAttempt && handle)
{
    TelemetryUtil.ReportFinalExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
}
else
{
    TelemetryUtil.ReportExecutionAttempt(_telemetry, context, outcome, attempt, executionTime, handle);
}

if (context.CancellationToken.IsCancellationRequested || isLastAttempt || !handle)
{
    return outcome;
}

@sander1095 Is this what you are looking for?

peter-csala avatar Sep 16 '24 15:09 peter-csala

Sounds good, @peter-csala . I am pretty sure I did see resilience logged as error in non-error cases, but that might be related to Microsoft.Extensions.Http.Resilience and not Polly directly

sander1095 avatar Sep 16 '24 15:09 sander1095