Azure Batch: Add disk size to slots calculation

Open adamrtalbot opened this issue 2 months ago • 7 comments

New feature

When using Azure Batch, Nextflow will reject a process if it has too many CPUs for the worker machine.

Caused by:
  Process requirement exceeds available CPUs -- req: 32; avail: 10

However, Azure Batch VMs come with a fixed-size disk, and Nextflow processes commonly run out of storage. There are many reports of this on the Nextflow Slack. The typical workaround is to increase the number of CPUs an individual process requires, but it would be better to support the disk directive so we can directly enforce that the VMs have a large enough disk.

Although we can't enforce it properly (i.e. make sure tasks are only assigned to a VM with enough space), preventing users from trying to run a task on a machine that is too small would catch some of these issues.

Usage scenario

When running on Azure Batch, raise an error if a task is assigned to a queue which does not contain sufficient storage.

Suggest implementation

process HELLO {
    disk 12.TB

    """
    echo Hello
    """
}

workflow {
    HELLO()
}

Caused by:
  Process requirement exceeds available storage -- req: 12TB; avail: 1TB
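
To make the proposed behaviour concrete, here is a minimal sketch of the comparison, written in Python only so it can be run standalone. The real check would presumably sit alongside the existing CPU check in Nextflow's Azure Batch code. Every name here (`check_disk`, `format_tb`, `TB`) is hypothetical, and treating a terabyte as 1024^4 bytes is an assumption.

```python
# Hypothetical sketch of the proposed check; none of these names are Nextflow APIs.
TB = 1024 ** 4  # bytes per terabyte (binary units, an assumption)


def format_tb(size_bytes: int) -> str:
    """Render a byte count as whole terabytes, e.g. 12TB."""
    return f"{size_bytes // TB}TB"


def check_disk(requested_bytes: int, available_bytes: int) -> None:
    """Raise if the task requests more disk than the VM provides."""
    if requested_bytes > available_bytes:
        raise ValueError(
            "Process requirement exceeds available storage -- "
            f"req: {format_tb(requested_bytes)}; avail: {format_tb(available_bytes)}"
        )


# A task requesting 12 TB on a VM with a 1 TB disk is rejected up front.
try:
    check_disk(12 * TB, 1 * TB)
except ValueError as err:
    print(err)
```

As with the CPU check, this only compares the request against the VM's total disk. It does not account for space already used by other tasks on the same node.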

adamrtalbot, Apr 16 '24 09:04