Investigating Task Overload Issues in Azure Batch with Automatic Scaling (maxParallelTasks)
I have a strange issue with my Azure Batch setup. Here's the situation: I have a pipeline that creates about 30 jobs, and each job contains roughly 200 tasks. Each job is configured to run at most two tasks in parallel (maxParallelTasks=2 at the job level). However, I've noticed that sometimes one of the jobs runs 15-20 tasks simultaneously.
Most of the time my jobs work fine, with only two tasks running at the same time as expected. But quite often I observe many tasks running simultaneously in a single job, despite the limit being set to two.
When I increase the number of VMs in the pool (in my tests), the problem becomes significantly worse. Sometimes I observe 40, 50, or even 60 tasks running simultaneously. This creates major issues in production because it consumes a lot of resources, drastically slows everything down, and makes the system very inefficient.
Below is the automatic scaling formula I've written. Maybe the error lies there.
$varActiveTasks = max($ActiveTasks.GetSample(15));
$varRunningTasks = max($RunningTasks.GetSample(15));
$varPendingTasks = max($PendingTasks.GetSample(15));
$varTaskSlotsPerNode = max($TaskSlotsPerNode);
$varCurrentDedicatedNodes = $CurrentDedicatedNodes;
$varUsableNodeCount = max($UsableNodeCount.GetSample(15));
$varPreemptedNodeCount = max($PreemptedNodeCount.GetSample(15));
$varTargetDedicatedNodes = $TargetDedicatedNodes;
$varMaxTargetDedicatedNodes = 10;
$CurTime = time() + (2 * TimeInterval_Hour);
$CurHour = $CurTime.hour;
$CurMinute = $CurTime.minute;
$IsNightTime = ($CurHour == 21 && $CurMinute >= 50) || ($CurHour >= 22) || ($CurHour < 3);
$minCapacity = $IsNightTime ? 0 : 0;
$samples = $PendingTasks.GetSamplePercent(TimeInterval_Minute * 15);
$LastSampledActiveTasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max($PendingTasks.GetSample(1), avg($PendingTasks.GetSample(TimeInterval_Minute * 15)));
$smoothedPendingTasks = avg($PendingTasks.GetSample(TimeInterval_Minute * 15));
$requiredNodes = ceil($smoothedPendingTasks / $varTaskSlotsPerNode);
$adjustmentFactor = 0.5;
$recommendedNode = $TargetDedicatedNodes + ($requiredNodes - $TargetDedicatedNodes) * $adjustmentFactor;
$TargetDedicatedNodes = min(max($recommendedNode, $minCapacity), $varMaxTargetDedicatedNodes);
$NodeDeallocationOption = taskcompletion;
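To make the last few lines of the formula easier to reason about, here is a minimal Python sketch of the damped node-count step (the `$requiredNodes`, `$recommendedNode`, and final clamp lines). The function name `next_target_nodes` and its parameter names are hypothetical, chosen for illustration only; the defaults mirror the constants in the formula above. How Batch rounds a fractional target is not modeled here. Note that this step only decides how many nodes the pool should have; nothing in it refers to a per-job task limit.

```python
import math


def next_target_nodes(current_target, smoothed_pending, slots_per_node,
                      adjustment=0.5, min_capacity=0, max_nodes=10):
    """Mirror the damped step of the formula (hypothetical helper).

    required    ~ $requiredNodes   = ceil(pending / slots per node)
    recommended ~ $recommendedNode = target + (required - target) * factor
    result      ~ clamp to [min_capacity, max_nodes]
    """
    required = math.ceil(smoothed_pending / slots_per_node)
    recommended = current_target + (required - current_target) * adjustment
    return min(max(recommended, min_capacity), max_nodes)


if __name__ == "__main__":
    # Assumed example load: 60 pending tasks on average, 4 task slots per node.
    target = 0
    for step in range(5):
        target = next_target_nodes(target, smoothed_pending=60, slots_per_node=4)
        print(step, target)
```

With these assumed inputs the target moves halfway toward the required node count on each evaluation until it reaches the cap of 10.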
How can I limit the number of tasks running at the same time in all jobs? Thanks in advance for your help.
maxParallelTasks is a setting that is only observed if the Batch account has that feature enabled. Is the feature enabled on your account?
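For reference when checking the job configuration, here is a minimal sketch of where the limit sits in a job-creation request body. The helper `build_job_body` and the IDs `job-1` and `pool-1` are hypothetical; only the `maxParallelTasks` property name and its allowed values (-1 for no limit, otherwise greater than 0) are taken from the Batch REST API. It may be worth comparing this against what your pipeline actually sends for each of the 30 jobs.

```python
import json


def build_job_body(job_id, pool_id, max_parallel_tasks=2):
    """Build a job-creation request body with a per-job task limit.

    Hypothetical helper for illustration; it does not call the service.
    """
    # The service accepts -1 (no limit) or a value greater than 0.
    if max_parallel_tasks != -1 and max_parallel_tasks < 1:
        raise ValueError("maxParallelTasks must be -1 or greater than 0")
    return {
        "id": job_id,
        "poolInfo": {"poolId": pool_id},
        "maxParallelTasks": max_parallel_tasks,
    }


if __name__ == "__main__":
    # Placeholder IDs, not taken from the original post.
    print(json.dumps(build_job_body("job-1", "pool-1"), indent=2))
```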