nextflow icon indicating copy to clipboard operation
nextflow copied to clipboard

Let azure batch checks optionaly be ignored

Open fabien-nisol opened this issue 4 months ago • 4 comments

New feature

There is already a feature in the azure batch plugin that will allow node pool allocation failures to be retried.

This works if a failure, e.g. a quota being breached, is happening while a pipeline is running and it successfully validated during setup.

The problem is that if the node pool is already failed when the pipeline is starting, the "checkPool" method used during initialization of the batch service will make any starting pipeline fail right away, bypassing the retry policy

Usage scenario

In our case, we want the retryPolicy to prevail in such cases. Our retry policy is set up to allow this error to be retried, because in theory the situation should resolve by itself while the already running pipeline release pressure on the node pool

Suggest implementation

Remove the resize error check from the checkPool method, or make it de-activable by configuration. This should allow the pipeline to go start, at which point we'll catch the resize error and it will be retried.

fabien-nisol avatar Mar 04 '24 12:03 fabien-nisol