signac-flow

`auto` partition.

Open joaander opened this issue 2 years ago • 5 comments

Feature description

I would find it more convenient to use flow if the partition were automatically selected based on the job resource request. Many clusters have separate CPU and GPU partitions, or separate shared and whole node partitions. In a workflow with mixed CPU/GPU jobs (and/or jobs of different sizes), the user must manually run (e.g.):

project.py submit -o '.*gpu' --partition=gpu
project.py submit -o '.*small' --partition=shared
project.py submit -o '.*large' --partition=wholenode

Some operations may auto-scale depending on the number of jobs left to execute. Until the user runs the submission command, they don't know whether shared or wholenode is the appropriate partition.

Proposed solution

The user should be able to make one submission:

project.py submit --partition=auto

Additional context

auto would select one of the "standard" partitions (e.g. not the debug or high-memory partitions) based on the job request:

  • If GPUs are requested, choose the gpu partition.
  • If more than one node is requested, choose the wholenode partition.
  • If less than one node is requested, choose the shared partition.

Any partition will remain settable explicitly on request.
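The rules above could be sketched roughly as follows. This is only an illustration, not flow's implementation: the function name `select_partition` and its arguments are hypothetical, and the partition names are the example ones from this issue. The list does not say what happens for exactly one node, so the sketch assumes that case goes to wholenode.

```python
from typing import Optional


def select_partition(gpus: int, nodes: float, requested: Optional[str] = "auto") -> str:
    """Pick a partition name from a job's resource request (hypothetical helper)."""
    # An explicitly requested partition always wins over automatic selection.
    if requested not in (None, "auto"):
        return requested
    # Any GPU request goes to the GPU partition.
    if gpus > 0:
        return "gpu"
    # Assumption: exactly one node is treated like a multi-node request.
    if nodes >= 1:
        return "wholenode"
    # Less than one node fits on a shared node.
    return "shared"
```

For example, `select_partition(gpus=0, nodes=0.25)` falls through to the shared partition, while passing `requested="debug"` bypasses the selection entirely.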

joaander • Oct 30 '23 15:10

This should be an easy feature, and I would support its addition. We would need an appropriate underscore-prefixed attribute in the environment classes, set to None by default. Perhaps something like:

_default_partitions = {"gpu-shared": "gpu",
                       "cpu-shared": "shared",
                       "cpu": "standard",
                       "gpu": "gpu"}

b-butler • Oct 30 '23 17:10

Yes, with that it may be possible to implement the auto selection in the base class.
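A minimal sketch of what that could look like, assuming a `_default_partitions` mapping with the four keys suggested above. The class names `SketchEnvironment` and `ExampleCluster` and the method `auto_partition` are hypothetical and are not part of flow's API.

```python
class SketchEnvironment:
    """Hypothetical base class; environments without a mapping cannot use auto."""

    # None by default, so only environments that opt in support auto selection.
    _default_partitions = None

    @classmethod
    def auto_partition(cls, gpus: int, whole_node: bool) -> str:
        if cls._default_partitions is None:
            raise ValueError(f"{cls.__name__} does not support --partition=auto")
        device = "gpu" if gpus > 0 else "cpu"
        # Whole-node requests use the bare key; partial-node requests use "-shared".
        key = device if whole_node else f"{device}-shared"
        return cls._default_partitions[key]


class ExampleCluster(SketchEnvironment):
    """Hypothetical cluster using the mapping proposed in this thread."""

    _default_partitions = {
        "gpu-shared": "gpu",
        "cpu-shared": "shared",
        "cpu": "standard",
        "gpu": "gpu",
    }
```

With this layout, the selection logic lives once in the base class and each environment only supplies data.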

joaander • Oct 30 '23 22:10

Some systems use separate accounts for CPU and GPU: #703. These would not be able to use the auto partition.

joaander • Oct 31 '23 12:10

> Some systems use separate accounts for CPU and GPU: #703. These would not be able to use the auto partition.

Could we make that a config option, where users can set a default account and a GPU account?
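To illustrate the idea, here is a rough sketch assuming the config is available as a plain dictionary. The keys `account` and `gpu_account` and the helper `select_account` are hypothetical names, not existing flow configuration options.

```python
from typing import Mapping, Optional


def select_account(config: Mapping[str, str], gpus: int) -> Optional[str]:
    """Return the account to charge for a submission (hypothetical helper)."""
    # Use the GPU account for GPU jobs when one is configured.
    if gpus > 0 and "gpu_account" in config:
        return config["gpu_account"]
    # Otherwise fall back to the default account, or None if unset.
    return config.get("account")
```

A system with a single account would set only `account`, and GPU jobs would fall back to it.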

tcmoore3 • Oct 31 '23 12:10

@tcmoore3 Theoretically yes, but I wonder if we would be getting too niche with that. I would rather have something more future-proof, or with less logic on our side, such as an account argument to an operation decorator, or perhaps a separate decorator (I like the second option less, as it is not really a resource). We could likewise allow specifying a partition, which would make two more keyword arguments.
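As a rough sketch of the keyword-argument idea, assuming a standalone decorator that only records the two values: `submission_options`, `resolve_submission`, and the attribute `_submission_options` are hypothetical names, not flow's decorators, and the account and partition strings are made up.

```python
from typing import Callable, Optional, Tuple


def submission_options(
    account: Optional[str] = None, partition: Optional[str] = None
) -> Callable:
    """Attach per-operation account and partition settings (hypothetical)."""

    def decorator(func: Callable) -> Callable:
        func._submission_options = {"account": account, "partition": partition}
        return func

    return decorator


def resolve_submission(
    func: Callable,
    default_account: Optional[str] = None,
    default_partition: str = "auto",
) -> Tuple[Optional[str], str]:
    """Per-operation settings win; otherwise fall back to the defaults."""
    opts = getattr(func, "_submission_options", {})
    return (
        opts.get("account") or default_account,
        opts.get("partition") or default_partition,
    )


@submission_options(account="gpu-allocation", partition="gpu")
def simulate(job):
    pass


def analyze(job):
    pass
```

Here `simulate` carries its own account and partition, while the undecorated `analyze` would use whatever defaults are passed at submission time.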

b-butler • Oct 31 '23 14:10