Why is the walltime set by default?
I was wondering why the default walltime for workers is set to 30 minutes instead of None?
For example, see https://github.com/dask/dask-jobqueue/blob/1f9ae1ecc79a5b76930e56c8801f3b8ace659877/dask_jobqueue/jobqueue.yaml#L71
I kept having failed jobs for two days until I realized that it was actually this library setting a hard time limit on my jobs. Why? This was very hard to debug among all the other errors I was trying to fix.
If you want your users to be aware of walltime, isn't there a better solution than setting it to a random number?
Can't you make it a required variable?
Can't the default value be something like "not-set-by-user" and complain if it's not set?
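(For reference, a minimal sketch of how the default can currently be overridden, assuming an SGE setup; the one-hour walltime and the resource numbers are only illustrative.)

```python
import dask
from dask_jobqueue import SGECluster  # same idea for PBSCluster, SLURMCluster, ...

# Option 1: override the default per cluster
cluster = SGECluster(cores=4, memory="8 GB", walltime="01:00:00")

# Option 2: override it through Dask's configuration system
# (equivalent to editing the walltime entry in ~/.config/dask/jobqueue.yaml)
dask.config.set({"jobqueue.sge.walltime": "01:00:00"})
```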
Most Dask cluster managers aim to get folks up and running as quickly as possible without insisting that the user configure lots of things first. Most (if not all) HPC systems require walltime to be configured, so setting a sensible default like 30 minutes is pretty reasonable.
I would also expect that most HPC systems show that a job was killed by it reaching the walltime.
I would be tempted to suggest that the right thing to do in this case is to clearly feed back to the user what has been created on their behalf.
We added an _log method to all Cluster objects a while ago which prints out messages unless a quiet=True kwarg is set and stores all logged messages so they can be viewed via cluster.get_logs().
https://github.com/dask/distributed/blob/ddec3c9c7b5a5d3cb3b806ff25ee36ef640d8dda/distributed/deploy/cluster.py#L194
Perhaps it would be worth using this message to communicate job configuration to users.
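From the user's side it could look something like this; `get_logs()` is the real method mentioned above, while the message shown in the comment is purely hypothetical:

```python
from dask_jobqueue import SGECluster

cluster = SGECluster(cores=4, memory="8 GB")
# At creation time the cluster could print something like
# "Creating jobs with walltime 00:30:00" (hypothetical message).

# The same messages would remain retrievable later:
print(cluster.get_logs())
```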
> I was wondering why the default walltime for workers is set to 30 minutes instead of None?

Historically, I'd say this is because we wanted interactive use and jobs that start quickly and don't stay in the queue too long. Using a small walltime and limited resources per job often helps jobs get executed. I personally often use a walltime of 1 hour.
> If you want your users to be aware of walltime, isn't there a better solution than setting it to a random number?

Maybe, yes. In this case the goal was not to make users aware of it, but, as @jacobtomlinson said, to have a sensible default.
> Can't you make it a required variable?

I'd say no: in some batch scheduling system configurations it is not required.
> Can't the default value be something like "not-set-by-user" and complain if it's not set?

If we wanted to change it, I would be in favor of None rather than a dummy value. But I find a default value of 30 minutes or 1 hour perfectly fine. As @jacobtomlinson said,

> most HPC systems show that a job was killed by it reaching the walltime.

and we also give some hints for debugging: https://jobqueue.dask.org/en/latest/debug.html; the first thing is checking the job script, which shows the walltime used.
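For completeness, checking the job script looks roughly like this; `job_script()` is a real method on the jobqueue cluster classes, and the directive shown in the comment is just an SGE example:

```python
from dask_jobqueue import SGECluster

cluster = SGECluster(cores=4, memory="8 GB")
# The generated submission script contains the walltime directive
# (on SGE something like "#$ -l h_rt=00:30:00"), so it is the first place to look.
print(cluster.job_script())
```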
> Perhaps it would be worth using this message to communicate job configuration to users.

I'm not 100% sure; this could lead to a bit too much information (like printing the job_script).
@183amir I understand your frustration at spending two days on this, but this is the first time it has been mentioned here, and I also think that, when coding, this kind of thing does happen.
In the end, I'm more in favor of closing this issue, but I'm all ears if other users or maintainers of dask-jobqueue have something to say.
> Most (if not all) HPC systems require walltime to be configured, so setting a sensible default like 30 minutes is pretty reasonable.

On our HPC, the queues have a maximum time set by default, so there is no need to set the walltime. I imagine the admins of other HPC systems also make sure that jobs don't run indefinitely. If an HPC system requires the walltime to be set AND the default walltime is None, users will figure this out the first time they use this library.
> I would also expect that most HPC systems show that a job was killed by it reaching the walltime.

Yes, but the whole idea of using Dask is not to have to deal with and learn the syntax of the HPC system. This is what I got from SGE:
failed 37 : qmaster enforced h_rt, h_cpu, or h_vmem limit
exit_status 137 (Killed)
Now, what is h_rt? And which of h_rt, h_cpu, or h_vmem was actually enforced?
> Using a small walltime and limited resources per job often helps jobs get executed.
How does that help? I don't understand. Do you mean that setting a small walltime would increase the priority of the submitted job somehow?
> Do you mean that setting a small walltime would increase the priority of the submitted job somehow?
In my experience, shorter walltimes (e.g. 30 minutes) allow a job to start more quickly than if the maximum walltime is always requested.
> In my experience, shorter walltimes (e.g. 30 minutes) allow a job to start more quickly than if the maximum walltime is always requested.

Would you know why? Could you please test that on your grid objectively, or ask your grid admin whether that's actually true? That does not make sense to me. Usually, putting fewer requirements on your job allows it to be matched with a wider selection of the machines available in the grid.

> In my experience, shorter walltimes (e.g. 30 minutes) allow a job to start more quickly than if the maximum walltime is always requested.
>
> Would you know why? Could you please test that on your grid objectively, or ask your grid admin whether that's actually true? That does not make sense to me. Usually, putting fewer requirements on your job allows it to be matched with a wider selection of the machines available in the grid.
I think you just need to consider how a queueing system works. It's a bit like tables in a restaurant: tables are reserved at particular times, but sometimes people finish their meals early, so a table is free until the next reservation arrives. If you're willing to have a short meal, you can fit into the empty slot and skip the queue.
If you want to understand more about job schedulers (queueing systems) you can take a look at https://carpentries-incubator.github.io/hpc-intro/13-scheduler/index.html
It is also worth considering that some job schedulers have fixed points in the future where a high-priority job has already been allocated. In my experience of running weather and climate models on HPC systems, there are hourly, daily, monthly, etc. jobs that are already scheduled into the system.
So if you can submit short jobs they may fit into gaps that long jobs will not.
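A toy illustration of that backfilling idea (all the numbers are made up):

```python
from datetime import datetime, timedelta

# A node is idle now, but a large reserved job is due to start at 14:00.
now = datetime(2021, 6, 1, 13, 0)
next_reservation = datetime(2021, 6, 1, 14, 0)
gap = next_reservation - now  # one hour of otherwise wasted node time

def can_backfill(requested_walltime: timedelta) -> bool:
    # The scheduler will only slot a job into the gap if its requested
    # walltime guarantees it finishes before the reservation starts.
    return requested_walltime <= gap

print(can_backfill(timedelta(minutes=30)))  # True: a short job skips the queue
print(can_backfill(timedelta(hours=12)))    # False: it has to wait its turn
```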
Would you agree to a change similar to https://github.com/dask/dask-jobqueue/pull/501? Because right now it's not possible to set walltime to None without changing the config files.
So in https://github.com/dask/dask-jobqueue/pull/501#issuecomment-865959711 we converged towards an agreed solution.
@andersy005 @jacobtomlinson @lesteve does it sound good to you to change the default walltime in the config file and set it to null?
This should be accompanied by an indication somewhere in the documentation saying that this will be required in most setups (this might break things for some people, but I expect most of our user base is aware of all this), and that for interactive workflows it should be set to a low value like one hour or below.
> This should be accompanied by an indication somewhere in the documentation saying that this will be required in most setups
I am apprehensive about making a change like this to support a minority of users. If something is required by most users then it should have a sensible default, as we do today.
> If something is required by most users then it should have a sensible default, as we do today.
I think there are two issues here:
- One person's sensible default is not necessarily the same as another's
- A sensible default should be well documented
On point 1, I don't think there's any magical number that can be chosen as the "most sensible" default, so keeping it as is (to not interfere with existing users) seems reasonable. On point 2, I honestly didn't even know that there was a default walltime until this ticket was opened. From what I can tell this isn't mentioned anywhere in the documentation (although there are a few references to changing the walltime), so I think the right course of action is to document it.
I'm definitely in favour of documenting it.
I am also in favour of logging output to the user when we create cluster managers for them. In dask-cloudprovider we log some information about the cluster object when it gets instantiated and that information is also available at cluster.get_logs(). We log things like VM type, docker image, region, etc. I think other cluster managers should start implementing this too.
OK, what you say makes complete sense!
So to fix this issue, we should do both:
- update the documentation to mention the default setting (and explain it a bit),
- log some output with the main jobqueue cluster settings by implementing the _log method of the Cluster object.
@jacobtomlinson could you provide a bit more information on why this _log method was implemented instead of just using the logging package?
I see it is also indicated:

> For use in subclasses where initialisation may take a while and it would be beneficial to feed back to the user
So we should log both the job configuration/directives and events like job submission?
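Something along these lines is what I would imagine, as a rough sketch only: the subclass, its kwargs and the _submit_job hook are hypothetical, and only Cluster._log / cluster.get_logs() come from distributed:

```python
from distributed.deploy.spec import SpecCluster

class LoggingJobQueueCluster(SpecCluster):
    """Hypothetical sketch of surfacing job settings via Cluster._log."""

    def __init__(self, *args, walltime=None, queue=None, **kwargs):
        super().__init__(*args, **kwargs)
        # 1. Job configuration/directives, logged once at creation time
        self._log(f"Job configuration: walltime={walltime!r}, queue={queue!r}")

    def _submit_job(self, script):
        # 2. Events, logged as they happen (e.g. each job submission)
        self._log("Submitting job script to the batch scheduler")
```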