Resource Scheduler API

Open aihsani opened this issue 4 years ago • 0 comments

Is your feature request related to a problem? Please describe. Currently, MONAI Label Server assumes it has virtually unlimited resources when running training, batch inference, and on-demand inference tasks (the last triggered by user clicks). When some or all of these tasks do not fit in the system together, the most recent task fails with an out-of-memory error and does not run, leaving the user to guess whether the system will be able to run the task later, once resources are freed up.

Describe the solution you'd like It is desirable to have an API that allows both MONAI Label Server developers and MONAI Label App developers to prioritize and schedule a task when one needs to run. Because training, batch inference, and on-demand inference tasks differ in urgency, preemption must be implemented: for instance, an on-demand inference always takes precedence over any other currently running task. Saving process state when preempting is out of the scope of this feature request.
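To make the request concrete, here is a minimal sketch of what priority scheduling with preemption could look like. Everything in it is hypothetical: `Priority`, `Task`, `Scheduler`, and the per-task `memory_mb` estimate are illustrative names, not part of MONAI Label, and the priority order below on-demand inference is an assumption. In line with the scope above, a preempted task is simply re-queued and no process state is saved.

```python
import heapq
import itertools
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Lower value = more urgent. Only "on-demand inference first" comes from
    # the request; the order of the other two is an assumption.
    ON_DEMAND_INFERENCE = 0
    BATCH_INFERENCE = 1
    TRAINING = 2


@dataclass(eq=False)  # compare tasks by identity, not by field values
class Task:
    name: str
    priority: Priority
    memory_mb: int  # caller-supplied estimate of GPU memory needed


class Scheduler:
    """Hypothetical scheduler for a single memory pool of capacity_mb."""

    def __init__(self, capacity_mb: int):
        self.capacity_mb = capacity_mb
        self.running = []
        self._queue = []  # heap of (priority, submission order, task)
        self._order = itertools.count()

    def free_mb(self) -> int:
        return self.capacity_mb - sum(t.memory_mb for t in self.running)

    def _enqueue(self, task: Task) -> None:
        heapq.heappush(self._queue, (int(task.priority), next(self._order), task))

    def submit(self, task: Task) -> list:
        """Run or queue the task; return the tasks preempted to make room."""
        preempted = []
        if task.memory_mb > self.free_mb():
            # Only strictly less urgent running tasks may be preempted,
            # least urgent first.
            victims = sorted(
                (t for t in self.running if t.priority > task.priority),
                key=lambda t: t.priority,
                reverse=True,
            )
            reclaimable = self.free_mb() + sum(v.memory_mb for v in victims)
            if task.memory_mb <= reclaimable:
                for victim in victims:
                    if task.memory_mb <= self.free_mb():
                        break
                    self.running.remove(victim)
                    self._enqueue(victim)  # restarted later; no state is saved
                    preempted.append(victim)
        if task.memory_mb <= self.free_mb():
            self.running.append(task)
        else:
            self._enqueue(task)
        return preempted

    def finish(self, task: Task) -> list:
        """Mark a task done; start queued tasks, most urgent first, while they fit."""
        self.running.remove(task)
        started = []
        while self._queue and self._queue[0][2].memory_mb <= self.free_mb():
            _, _, nxt = heapq.heappop(self._queue)
            self.running.append(nxt)
            started.append(nxt)
        return started
```

For example, with a 16000 MB pool, submitting a 12000 MB training task and then an 8000 MB on-demand inference preempts the training task, which is started again once the inference finishes. Instead of failing with an out-of-memory error, a task that does not fit waits in the queue.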

Describe alternatives you've considered We are considering PyDCGM as the library that will enable the resource scheduler to keep track of the running tasks, their memory utilization in the GPU, GPU attachment, etc.
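The bookkeeping described here (running tasks, their GPU memory use, GPU attachment) could be sketched as below. This is an assumption-laden illustration: `ResourceTracker` and `TaskRecord` are hypothetical names, and the sketch deliberately does not call PyDCGM. The memory query is passed in as a plain callable (`query_used_mb`), so a PyDCGM-backed function, or any other source of GPU telemetry, could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class TaskRecord:
    task_id: str
    gpu_index: int   # GPU the task is attached to
    memory_mb: int   # memory the task is expected to use


class ResourceTracker:
    """Hypothetical registry of running tasks and per-GPU memory."""

    def __init__(
        self,
        total_mb_per_gpu: Dict[int, int],
        query_used_mb: Callable[[int], int],  # assumed telemetry hook
    ):
        self.total_mb_per_gpu = total_mb_per_gpu
        self.query_used_mb = query_used_mb
        self.records: Dict[str, TaskRecord] = {}

    def register(self, record: TaskRecord) -> None:
        self.records[record.task_id] = record

    def release(self, task_id: str) -> None:
        self.records.pop(task_id, None)

    def tasks_on(self, gpu_index: int) -> List[str]:
        return [r.task_id for r in self.records.values() if r.gpu_index == gpu_index]

    def free_mb(self, gpu_index: int) -> int:
        return self.total_mb_per_gpu[gpu_index] - self.query_used_mb(gpu_index)

    def pick_gpu(self, memory_mb: int) -> Optional[int]:
        """Return the GPU with the most free memory that fits the task, or None."""
        fitting = [g for g in self.total_mb_per_gpu if self.free_mb(g) >= memory_mb]
        return max(fitting, key=self.free_mb) if fitting else None
```

Keeping the telemetry source behind a callable means the tracker can be exercised with a stub in tests and on machines without a GPU.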

aihsani, Oct 07 '21 13:10