Batch Support for Ubuntu 20.04 LTS Ending

Open alfpark opened this issue 1 year ago • 5 comments

Summary

Azure Batch will retire support for Ubuntu 20.04 aligned operating systems, following the publisher's end of standard support on April 23, 2025. If your workload uses Batch pools based on this OS, either via Marketplace images (directly from Canonical, microsoft-azure-batch, or microsoft-dsvm, or derived from them by other publishers) or via your own custom images, including those in a compute image gallery, you will need to take action. Please migrate your workload to a publisher-supported version of Ubuntu or a derivative VM image.

After April 23, 2025, Azure Batch will remove support for the batch.node.ubuntu 20.04 compute node agent. Afterwards, create pool and resize up operations for associated pools may fail. Existing pools may be subject to forced scale in to zero nodes at any point after this date. Customers who continue to use these pools past the Batch End of Support date will be exposed to potential security risks.

Migration possibilities

Please note that any Batch pool specifying the batch.node.ubuntu 20.04 node agent is subject to end of life.

  • If you are utilizing Canonical produced images, Canonical has newer Ubuntu LTS versions available, such as 24.04 and 22.04.
  • If you are utilizing microsoft-azure-batch or microsoft-dsvm published images for either container or HPC functionality, you can migrate to microsoft-dsvm ubuntu-hpc 2204 as an alternative. The 2404 sku is not yet available for this offer.
    • If you elect to use NCv3 Family VM sizes, you will need to do one of the following to enable the GPU:
      • Add the NVIDIA GPU Driver Extension as part of the Batch Pool VM extension definition. If using this extension, due to VM extension timing issues, compute nodes will appear idle before the GPU driver has finished installing. You will need to create a start task that blocks on the successful return of nvidia-smi on the host with waitForSuccess=true. Otherwise, the Batch scheduler will begin scheduling tasks that may fail if the GPU is required.
      • Create a custom image that contains the NVIDIA Proprietary (Closed Source) driver.
    • Note that the NCv3 VM Family is being retired on September 30, 2025. We recommend that you migrate your workload to an amenable VM size that does not have an imminent end of support date.
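The VM extension plus blocking start task option above can be sketched as a pool definition fragment. This is a minimal sketch, not a definitive configuration: the helper names (`gpu_ready_start_task`, `nvidia_driver_extension`, `pool_fragment`), the retry counts, the extension `name`, and the `typeHandlerVersion` value are assumptions chosen for illustration, and the dictionary layout assumes the Batch service REST pool body shape (`startTask`, `virtualMachineConfiguration.extensions`). Verify the extension version against the current NVIDIA GPU Driver Extension documentation before use.

```python
import json


def gpu_ready_start_task(attempts=60, delay_seconds=10):
    """Build a start task that blocks until nvidia-smi succeeds on the host.

    Batch does not run command lines under a shell, so the retry loop is
    wrapped in an explicit /bin/bash -c invocation.
    """
    probe = (
        f"for i in $(seq 1 {attempts}); do "
        f"nvidia-smi && exit 0; sleep {delay_seconds}; "
        "done; exit 1"
    )
    return {
        "commandLine": f"/bin/bash -c '{probe}'",
        # The node is not considered ready for tasks until this task succeeds.
        "waitForSuccess": True,
        # Retrying happens inside the loop, not via Batch task retries.
        "maxTaskRetryCount": 0,
    }


def nvidia_driver_extension(type_handler_version="1.6"):
    """VM extension entry for the NVIDIA GPU Driver Extension (Linux).

    The default version here is an assumption; check the extension docs.
    """
    return {
        "name": "nvidia-gpu-driver",  # hypothetical extension instance name
        "publisher": "Microsoft.HpcCompute",
        "type": "NvidiaGpuDriverLinux",
        "typeHandlerVersion": type_handler_version,
        "autoUpgradeMinorVersion": True,
    }


def pool_fragment():
    """Combine the image, extension, and start task into one pool fragment."""
    return {
        "virtualMachineConfiguration": {
            "imageReference": {
                "publisher": "microsoft-dsvm",
                "offer": "ubuntu-hpc",
                "sku": "2204",
            },
            "nodeAgentSKUId": "batch.node.ubuntu 22.04",
            "extensions": [nvidia_driver_extension()],
        },
        "startTask": gpu_ready_start_task(),
    }


if __name__ == "__main__":
    print(json.dumps(pool_fragment(), indent=2))
```

The loop exits non-zero if the driver never becomes available, so with waitForSuccess=true the node is marked unusable instead of silently accepting GPU tasks that would fail.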

Please consult the List Supported Images API for the latest versions available.
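As a rough illustration of how the List Supported Images output could be screened for migration targets, the sketch below filters already-fetched entries. It makes no network calls; the function name `migration_candidates` and the sample entries are hypothetical, and the field names (`nodeAgentSKUId`, `imageReference`, `capabilities`, `batchSupportEndOfLife`) are assumed to match the REST response shape. Always check the live API response for the authoritative list.

```python
from datetime import datetime, timezone

RETIRED_NODE_AGENT = "batch.node.ubuntu 20.04"


def parse_utc(timestamp):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def migration_candidates(images, now, required_capability=None):
    """Return Ubuntu images that do not use the retired node agent.

    Entries whose batchSupportEndOfLife is at or before `now` are skipped.
    If required_capability is given, only images listing it are kept.
    """
    candidates = []
    for image in images:
        agent = image.get("nodeAgentSKUId", "")
        if not agent.startswith("batch.node.ubuntu") or agent == RETIRED_NODE_AGENT:
            continue
        end_of_life = image.get("batchSupportEndOfLife")
        if end_of_life is not None and parse_utc(end_of_life) <= now:
            continue
        if required_capability and required_capability not in image.get("capabilities", []):
            continue
        candidates.append(image)
    return candidates


# Illustrative sample data only; not a real API response.
SAMPLE_IMAGES = [
    {
        "nodeAgentSKUId": "batch.node.ubuntu 20.04",
        "imageReference": {"publisher": "microsoft-azure-batch",
                           "offer": "ubuntu-server-container", "sku": "20-04-lts"},
        "capabilities": ["DockerCompatible"],
        "batchSupportEndOfLife": "2025-04-23T00:00:00Z",
    },
    {
        "nodeAgentSKUId": "batch.node.ubuntu 22.04",
        "imageReference": {"publisher": "microsoft-dsvm",
                           "offer": "ubuntu-hpc", "sku": "2204"},
        "capabilities": ["DockerCompatible"],
    },
    {
        "nodeAgentSKUId": "batch.node.ubuntu 24.04",
        "imageReference": {"publisher": "canonical",
                           "offer": "ubuntu-24_04-lts", "sku": "server"},
        "capabilities": [],
    },
]

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for image in migration_candidates(SAMPLE_IMAGES, now):
        ref = image["imageReference"]
        print(image["nodeAgentSKUId"], ref["publisher"], ref["offer"], ref["sku"])
```

Passing `required_capability="DockerCompatible"` narrows the result to images that advertise container support, which is the property container workloads typically need.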

More information

  • Azure Batch VM Size and Image Guide
  • Azure Batch Best Practices

alfpark avatar Jun 26 '24 18:06 alfpark

So if we are using microsoft-azure-batch, is there still no option to upgrade to the latest image?

Publisher: microsoft-azure-batch
Offer: ubuntu-server-container

TharakaSl avatar Mar 06 '25 13:03 TharakaSl

"Migration Possibilities" outlines your choices. What is the particular problem you are facing?

alfpark avatar Mar 06 '25 19:03 alfpark

i am unable to get the canonical 24.04 image to load in a batch setting. i am using a configuration with auto-scale and a start task that were functional with the microsoft-azure-batch 20.04 image.

it appears microsoft has deprecated the microsoft-azure-batch 20.04 image with no viable replacement image available for azure data factory applications.

i think our move will be to migrate away from the azure data factory environments for our processes.

nrgpy avatar Apr 10 '25 18:04 nrgpy

What specifically is not working?

alfpark avatar Apr 17 '25 19:04 alfpark

Guidance has been updated for the special case of NCv3 VM family sizes.

alfpark avatar May 01 '25 17:05 alfpark

Hello,

We are facing an issue regarding updating our current Ubuntu 20.04 images. Here's a brief rundown from my side:

We run jobs in Docker containers on Ubuntu 20.04 VMs on Azure Batch. These jobs also fetch the data they depend on from Azure Cloud Storage and Azure Message Queues.

The VM image we need must have "DockerCompatible" among its capabilities; capabilities here refers to one of the properties of an image, as shown when I list the images available on my Batch account. The only images we are seeing that have this option and are not at the end of life stage are Linux based HPC images. A few weeks ago, we tried using the Ubuntu 22.04 HPC image published by microsoft-dsvm. All our jobs that were using that image crashed right after starting. To mitigate this, we rolled back to the Ubuntu 20.04 image.

We talked to a consultant, and this is what they said about why the HPC image didn't work out for our use case:

"UbuntuHPC images are tuned for MPI/HPC workloads (InfiniBand, heavy-compute), not for container-based jobs. If your tasks rely on Docker or Singularity, the HPC SKU may omit or misconfigure the default container runtime".

With that in mind, HPC images seem to be a no-go for us. I am also curious what the "RDMAOnly" value stands for in the HPC image's capabilities. Does it affect fetching resources from cloud storage over the network?

Looking to see what else we can do once the current Ubuntu 20.04 image is unusable.

if-constexpr avatar Jun 13 '25 17:06 if-constexpr

Another question I have is regarding the expiry of the Ubuntu 20.04 image. The image's end of life is listed as Apr 22, 2025, and yet we are still able to use it on our cloud compute. The only message I get from Azure Batch when viewing a pool is "this image is past its end of life on April 22, 2025. Please recreate this pool using a different image afterwards".

I am looking for a precise end of life/use date for this image, so that I have a clear-cut date after which we will absolutely not be able to use it for running any jobs on Azure Batch.

Details of the current image we are using:

publisher: microsoft-azure-batch
offer: ubuntu-server-container
sku: 20-04-lts

if-constexpr avatar Jun 13 '25 17:06 if-constexpr