
Add ability to set maximum jobs per queue

Open MikeDacre opened this issue 7 years ago • 2 comments

Currently, max jobs are defined for the whole user. Implement a queue-specific maximum as well as the user-specific maximum. This change will slightly break the current API.

Should implement this as a keyword argument that can be set in the profile: queue_max

Note: it may be a good idea to allow a JSON dictionary that sets queue attributes, something like:

{
  "default": {
    "max_jobs": 2000,
    "max_cores": 12,
    "max_walltime": "48:00:00",
    "max_mem": "32GB"
  },
  "long": {
    "max_jobs": 500,
    "max_cores": 24,
    "max_walltime": "96:00:00",
    "max_mem": "64GB"
  }
}

The format would be {partition/queue: { max_<keyword>: value}} and could be used to set maximums for any single partition.

Every part of this file would be optional. It makes sense for it to be JSON rather than the regular config format.
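To make the proposal concrete, here is a minimal sketch of how such a file could be parsed, with missing keys falling back to the `default` entry. The function name `load_queue_limits` and the fallback values are hypothetical; fyrd does not currently provide this.

```python
import json

# Hypothetical built-in fallbacks used when neither the queue entry nor the
# "default" entry supplies a value (values are illustrative only).
FALLBACKS = {
    "max_jobs": 1000,
    "max_cores": 16,
    "max_walltime": "24:00:00",
    "max_mem": "16GB",
}


def load_queue_limits(path):
    """Parse the proposed JSON limits file.

    Every key is optional: per-queue entries override the "default" entry,
    which in turn overrides the built-in fallbacks.
    """
    with open(path) as fh:
        raw = json.load(fh)
    base = {**FALLBACKS, **raw.get("default", {})}
    return {queue: {**base, **overrides} for queue, overrides in raw.items()}
```

With the example file above, `load_queue_limits(path)["long"]["max_jobs"]` would return `500`, while any key omitted from `"long"` would inherit from `"default"`.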

MikeDacre avatar Aug 16 '17 22:08 MikeDacre

It seems like the info messages are incorrect when the max_jobs option is used. I have the following script (test.py)

#!/usr/bin/env python
import fyrd

cmd = 'sleep 180'
job1 = fyrd.Job(cmd, partition='owners', outpath='out', scriptpath='sub')
job2 = fyrd.Job(cmd, partition='owners', outpath='out', scriptpath='sub')
job3 = fyrd.Job(cmd, partition='owners', outpath='out', scriptpath='sub')
job4 = fyrd.Job(cmd, partition='owners', outpath='out', scriptpath='sub')
job5 = fyrd.Job(cmd, partition='owners', outpath='out', scriptpath='sub')
job6 = fyrd.Job(cmd, partition='owners', outpath='out', scriptpath='sub')

job1.submit(max_jobs=2)
job2.submit(max_jobs=2)
job3.submit(max_jobs=2)
job4.submit(max_jobs=2)
job5.submit(max_jobs=2)
job6.submit(max_jobs=2)

And it produces the following output:

$ ./test.py 
20170816 15:40:29.530 | INFO --> The queue is full, there are 1 jobs running and 0 jobs queued. Will wait to submit, retrying every 1 seconds.
20170816 15:43:23.612 | INFO --> The queue is full, there are 1 jobs running and 1 jobs queued. Will wait to submit, retrying every 1 seconds.
20170816 15:43:42.145 | INFO --> The queue is full, there are 1 jobs running and 1 jobs queued. Will wait to submit, retrying every 1 seconds.
20170816 15:46:35.936 | INFO --> The queue is full, there are 1 jobs running and 1 jobs queued. Will wait to submit, retrying every 1 seconds.

However, manual inspection of the queue shows that there are two jobs running at any given time.

I think it is a good idea to set this at the queue level and not the job level; otherwise I can effectively ignore the max_jobs argument with:

job1.submit(max_jobs=1)
job2.submit(max_jobs=2)
job3.submit(max_jobs=3)
job4.submit(max_jobs=4)
job5.submit(max_jobs=5)
job6.submit(max_jobs=6)
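The point above can be sketched as a queue-level throttle: the cap lives with the queue object rather than with each `submit()` call, so no individual submission can raise it. The `QueueThrottle` class and its method names are hypothetical, not part of fyrd's API.

```python
from collections import defaultdict


class QueueThrottle:
    """Hypothetical per-queue job cap, enforced centrally.

    Because the limit is stored per queue, passing a larger number to an
    individual submit call cannot bypass it.
    """

    def __init__(self, limits):
        self._limits = dict(limits)          # e.g. {"owners": 2}
        self._active = defaultdict(int)      # jobs currently counted per queue

    def try_submit(self, queue):
        """Return True and claim a slot if `queue` is below its cap."""
        if self._active[queue] >= self._limits.get(queue, float("inf")):
            return False
        self._active[queue] += 1
        return True

    def job_done(self, queue):
        """Release a slot when a job finishes."""
        self._active[queue] = max(0, self._active[queue] - 1)
```

With `QueueThrottle({"owners": 2})`, the third `try_submit("owners")` returns `False` no matter what per-job arguments the caller used, which is the behavior the per-job `max_jobs` keyword cannot guarantee.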

surh avatar Aug 16 '17 22:08 surh

That issue is because the job object caches job information on a different schedule than the queue does. It is a little silly; I might try to fix it later. I am working on the change to set the queue-level maximum, but it will probably go into the 0.6.2 branch only, to be released with the 0.6.2.a1 release next week.
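The stale-count symptom described above is the classic result of two components refreshing state on different clocks. One common fix is a single shared, time-stamped cache that both the job objects and the queue consult; this is a minimal sketch under that assumption (the class and names are hypothetical, not fyrd internals).

```python
import time


class QueueStateCache:
    """Shared TTL cache for queue counts.

    All consumers read through the same cache, so the "running/queued"
    numbers they report can never disagree with each other; they are at
    most `ttl` seconds out of date with the real scheduler.
    """

    def __init__(self, fetch, ttl=2.0):
        self._fetch = fetch      # callable returning (running, queued)
        self._ttl = ttl
        self._stamp = 0.0
        self._state = None

    def counts(self):
        now = time.monotonic()
        if self._state is None or now - self._stamp > self._ttl:
            self._state = self._fetch()
            self._stamp = now
        return self._state
```

Within the TTL window every caller sees the same `(running, queued)` pair, which would make the INFO messages consistent with what manual inspection of the queue shows.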

MikeDacre avatar Aug 18 '17 00:08 MikeDacre