azure-webjobs-sdk

Confirm some test cases for retry policy

Open jeffhollan opened this issue 3 years ago • 4 comments

Per early feedback and questions around retry policy, want to test a few scenarios to confirm:

Scale controller behavior

  • If you have a queue-triggered function and retry it 100 times with a 5-minute delay between executions, will the execution logs or the locked queue message prevent it from scaling to 0?
  • If you have a queue-triggered function and retry it 100 times with a 30-minute delay between executions, will the execution logs or the locked queue message prevent it from scaling to 0?
  • Same thing with event hubs for both of the above

Queue retry behavior

  • If you have a 10-minute retry delay on a Service Bus message and a lock renewal period of 30 seconds (I believe the default), will we continue to renew the lock during the delay?

Function timeout

  • .NET functions running in the host process:

    • On function invocation timeout, the host instance is shut down. The retry count is not persisted, and the invocation starts on a new Functions host instance with attempt 0.
  • Non-.NET functions running in a language worker process:

    • On function invocation timeout, the host tries to restart the language worker process. The retry count is persisted because function invocations are handled by the same Functions host instance.

Drain Mode

If the Drain API is called while function executions are waiting on retries, the retries continue as long as the Functions host instance is not torn down. All trigger listeners are shut down gracefully, and no new invocations start on this Functions host instance.

jeffhollan avatar Nov 04 '20 17:11 jeffhollan

Tested and added notes for FunctionTimeout and Drain Mode section in the description.

pragnagopa avatar Nov 05 '20 00:11 pragnagopa

Test: Consumption plan + Azure Storage Queue Trigger + Infinite Retries

By default, the Azure Storage Queue Trigger attempts a message up to 5 times, i.e. while dequeueCount < 5. Once the message's dequeueCount reaches 5, the message is moved to the poison queue.

If host retry is configured, the message is retried on a single host instance without incrementing dequeueCount. If the host instance goes down, the message is picked up by a new host instance and dequeueCount is incremented.

If, over a period of time, the message is processed by 5 different host instances, it is moved to the poison queue because its dequeueCount is now 5.
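The interaction described above can be sketched as a small simulation. This is only a simplified model of the observed behavior, not code from the Functions host or the Storage SDK; `simulateMessage`, `MAX_DEQUEUE_COUNT`, and the input shape are hypothetical names chosen for illustration.

```javascript
// Simplified model (assumption): each host instance that picks up the message
// increments dequeueCount once; host-level retries on that instance do not.
const MAX_DEQUEUE_COUNT = 5; // default threshold described in this comment

// retriesPerInstance: for each host instance in turn, how many host-level
// retries ran before that instance went down.
function simulateMessage(retriesPerInstance) {
  let dequeueCount = 0;
  let executions = 0;
  for (const retries of retriesPerInstance) {
    dequeueCount += 1;          // a new host instance dequeues the message
    executions += 1 + retries;  // first attempt plus in-host retries
    if (dequeueCount >= MAX_DEQUEUE_COUNT) {
      // Fifth host instance has failed: message goes to the poison queue.
      return { dequeueCount, executions, poisoned: true };
    }
  }
  return { dequeueCount, executions, poisoned: false };
}

// One long-lived host instance: many executions, dequeueCount stays at 1.
console.log(simulateMessage([100]));
// Five short-lived host instances: the message ends up poisoned.
console.log(simulateMessage([10, 3, 0, 7, 2]));
```

In this model the number of executions is unbounded on a single instance, while the number of host instances that can see the message is capped by the dequeue threshold.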

Test files used:

host.json

{
  "version": "2.0",
  "retry": {
    "strategy": "fixedDelay",
    "maxRetryCount": -1,
    "delayInterval":"00:00:30"
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[1.*, 2.0.0)"
  }
}

index.js

module.exports = async function (context, myQueueItem) {
    context.log('JavaScript queue trigger function processed work item', myQueueItem);
    throw new Error('An error occurred');
};
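To see the two counters side by side while repeating this test, a variant of index.js could log both on every attempt. This is a sketch, not the file used in the test above. It assumes the Node.js worker exposes the host retry state as `context.executionContext.retryContext` and the queue metadata as `context.bindingData.dequeueCount`; `describeAttempt` is a hypothetical helper, and either value falls back to 'n/a' if the property is absent.

```javascript
// Hypothetical helper: formats the host retry count and the queue dequeueCount.
// Both property paths are assumptions about the Node.js worker's context object.
function describeAttempt(context) {
    const retry = context.executionContext && context.executionContext.retryContext;
    const retryCount = retry && retry.retryCount !== undefined ? retry.retryCount : 'n/a';
    const binding = context.bindingData;
    const dequeueCount = binding && binding.dequeueCount !== undefined ? binding.dequeueCount : 'n/a';
    return `retryCount=${retryCount} dequeueCount=${dequeueCount}`;
}

const handler = async function (context, myQueueItem) {
    context.log(describeAttempt(context), myQueueItem);
    // Always fail so that the host retry policy keeps re-invoking the function.
    throw new Error('An error occurred');
};

module.exports = handler;
```

If the behavior described above holds, retryCount should climb on a single host instance while dequeueCount stays fixed, and dequeueCount should only increase after a different host instance picks up the message.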

pragnagopa avatar Nov 05 '20 20:11 pragnagopa

@alrod @AnatoliB is this still relevant? Are you tracking this elsewhere?

fabiocav avatar Jun 01 '22 20:06 fabiocav

  • @pragnagopa

fabiocav avatar Jun 01 '22 20:06 fabiocav