Batch
Support multi-instance tasks on concurrency enabled pools
It's possible that I am not configuring my tasks correctly, because this seems like a feature that would already exist, but I am having trouble getting RequiredSlots and multi-instance tasks to work together.
When I create a batch job it contains thousands of tasks. I would say that 75% of the tasks created are small, and a single node could run 4 tasks concurrently. The remaining 25% are tasks that require many cores with MPI.
I had MPI working, but now need to implement the RequiredSlots feature in order to optimize our node usage.
I've refactored the small tasks to use RequiredSlots, and that seems to be working now. However, the MPI task that requires multiple nodes no longer executes.
The documentation states that the requiredSlots property must be set to 1 for a multi-instance task. I have tried setting it to 1, with the multi-instance settings requiring 2 nodes, but the task never executes. I've also tried setting requiredSlots to the pool's TaskSlotsPerNode value (just to check), but of course received a validation exception stating that the property value was invalid.
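For reference, here is a rough sketch of the two kinds of task bodies described above, written as plain Python dictionaries shaped like the REST API's task-add request. The property names (requiredSlots, multiInstanceSettings, numberOfInstances, coordinationCommandLine) come from the REST documentation; the task ids and command lines are placeholders, not the actual values from this job.

```python
import json

def small_task(task_id, command_line):
    """Body for a small task that occupies one slot on a node."""
    return {"id": task_id, "commandLine": command_line, "requiredSlots": 1}

def mpi_task(task_id, command_line, coordination_command_line, number_of_instances):
    """Body for a multi-instance (MPI) task; requiredSlots stays at 1 per the docs."""
    return {
        "id": task_id,
        "commandLine": command_line,
        "requiredSlots": 1,
        "multiInstanceSettings": {
            "numberOfInstances": number_of_instances,
            "coordinationCommandLine": coordination_command_line,
        },
    }

# Placeholder ids and command lines, for illustration only.
small = small_task("small-001", "/bin/sh -c 'echo small'")
mpi = mpi_task("mpi-001", "/bin/sh -c 'echo run-mpi'", "/bin/sh -c 'echo setup'", 2)

print(json.dumps(mpi, indent=2))
```

This is the combination that validates but never gets scheduled on a pool with 4 slots per node.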
Are there any known issues with running a workflow that consists of tasks that consume just 1 slot (with each node having 4 slots), and other tasks that require multiple nodes with MPI?
Currently, concurrency on the pool must be disabled in order to execute multi-instance tasks, i.e., taskSlotsPerNode must be set to 1. Please examine using multiple pools to accomplish your goal.
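A minimal sketch of what the multiple-pool approach could look like on the client side, assuming tasks are held as dictionaries shaped like REST task bodies. The pool ids and the helper names (supports_multi_instance, split_by_pool) are hypothetical; the check only encodes the rule stated above and is not a call into the Batch service.

```python
def supports_multi_instance(pool):
    """True if the pool has concurrency disabled (taskSlotsPerNode == 1).

    taskSlotsPerNode defaults to 1 when it is not specified.
    """
    return pool.get("taskSlotsPerNode", 1) == 1

def split_by_pool(tasks):
    """Separate tasks into (concurrent, multi_instance) lists."""
    concurrent, multi_instance = [], []
    for task in tasks:
        if "multiInstanceSettings" in task:
            multi_instance.append(task)
        else:
            concurrent.append(task)
    return concurrent, multi_instance

# Hypothetical pool definitions: one for small tasks, one for MPI tasks.
small_pool = {"id": "small-tasks-pool", "taskSlotsPerNode": 4}
mpi_pool = {
    "id": "mpi-pool",
    "taskSlotsPerNode": 1,
    "enableInterNodeCommunication": True,  # needed for multi-instance tasks
}

tasks = [
    {"id": "small-001", "commandLine": "echo small", "requiredSlots": 1},
    {
        "id": "mpi-001",
        "commandLine": "echo mpi",
        "requiredSlots": 1,
        "multiInstanceSettings": {"numberOfInstances": 2},
    },
]

concurrent, multi_instance = split_by_pool(tasks)
print([t["id"] for t in concurrent], "->", small_pool["id"])
print([t["id"] for t in multi_instance], "->", mpi_pool["id"])
```

Because a job is bound to a single pool, each list would have to be submitted to a separate job.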
Thank you for passing along that URL. I must have missed it because I was searching for "required slots" documentation since I already had MPI working.
I would suggest that this documentation be updated, as it implies you can make both work: https://docs.microsoft.com/en-us/rest/api/batchservice/task/add (near the requiredSlots property, "For multi-instance Tasks, this must be 1.")
Regarding the ability to run concurrent tasks and multi-instance tasks on the same pool, I hope that this is something the Azure Batch team considers. Using multiple pools is not a viable solution, since in most cases there is a workflow built into the job itself with a very specific task dependency hierarchy. By adding another pool, we'd lose the ability to have task dependencies between the two sets of tasks. I know that task scheduling would be a little more difficult to implement, since an MPI task would likely require the entire node, but I don't see why this is an issue as long as the end user sets the task scheduling policy to ComputeNodeFillType.Pack. This would help keep whole nodes open for multi-instance tasks.
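To illustrate the reasoning about Pack, here is a toy simulation. It is not how the Batch scheduler works internally; it only assumes that "pack" fills the fullest node that still has a free slot and "spread" fills the emptiest one. The function names and the 4-node, 6-task numbers are made up for the example.

```python
def fill_nodes(num_nodes, slots_per_node, num_tasks, policy):
    """Assign single-slot tasks to nodes and return the used-slot count per node."""
    used = [0] * num_nodes
    for _ in range(num_tasks):
        candidates = [i for i, u in enumerate(used) if u < slots_per_node]
        if not candidates:
            break  # no free slots left; remaining tasks would wait
        if policy == "pack":
            # Fullest node that still has a free slot (lowest index on ties).
            target = max(candidates, key=lambda i: used[i])
        else:
            # "spread": emptiest node (lowest index on ties).
            target = min(candidates, key=lambda i: used[i])
        used[target] += 1
    return used

def idle_nodes(used):
    """Number of nodes with no running tasks, i.e. free for a whole-node MPI task."""
    return sum(1 for u in used if u == 0)

packed = fill_nodes(num_nodes=4, slots_per_node=4, num_tasks=6, policy="pack")
spread = fill_nodes(num_nodes=4, slots_per_node=4, num_tasks=6, policy="spread")

print("pack:  ", packed, "idle nodes:", idle_nodes(packed))
print("spread:", spread, "idle nodes:", idle_nodes(spread))
```

With these numbers, packing leaves two nodes completely empty, which is enough for a 2-instance MPI task, while spreading leaves none.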
Hello! Just curious if there are any updates on this feature request? This feature would definitely cut costs by allowing us to be more efficient with VM resources.
I am just curious, has there been any progress toward resolving this issue? Any updates would be appreciated! Thanks!
Hello, I am sorry to bother you, but is this something that ever made it to the Batch backlog?
The above item is still in Backlog. Do you have a work item created yet?
Thank you for checking. I do not have a work item. What do I need to do to create one?
Hello @prkannap, just curious, has there been any movement on this item?