[enhancement] Time info, like time taken, within Job objects

Open mennowitteveen opened this issue 2 years ago • 2 comments

I have been using submitit for a while now and like it a bunch : ) so thanks for making it!

A nice new feature might be to get some extra info (like time taken for execution of a job) from the get_info() method.

Perhaps something like this already exists somewhere else in the code? Also, if someone could give me some pointers on how to go about it, I would be happy to give it a try myself : )

mennowitteveen avatar Apr 14 '22 15:04 mennowitteveen

So this information is available in sacct.

You can look at how we are querying sacct: https://github.com/facebookincubator/submitit/blob/1253faa603f024151fa3b900c08fe39ac9671da1/submitit/slurm/slurm.py#L50-L50

For now we only ask the status and the list of nodes, but we could ask more information. sacct has a lot of different fields, and I don't know if we should query all of them every time we want the status.

List of fields: https://slurm.schedmd.com/sacct.html#OPT_helpformat

As a quick hack, just run subprocess.check_output(["sacct", "-j", job.job_id, "--format=Elapsed"]). But it would be better to add a parameter to get_info to request more columns, so that you get batching and caching for free.
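A minimal sketch of that quick hack, assuming sacct is on the PATH and that Elapsed is printed in Slurm's usual [DD-[HH:]]MM:SS form. The helper names parse_elapsed and job_elapsed are hypothetical and not part of submitit:

```python
import subprocess
from datetime import timedelta


def parse_elapsed(text: str) -> timedelta:
    """Parse a Slurm duration such as '01:02:03', '02:03' or '1-01:02:03'."""
    # Split off an optional leading day count ("1-01:02:03" -> "1", "01:02:03").
    days, _, clock = text.strip().rpartition("-")
    parts = [int(p) for p in clock.split(":")]
    # Left-pad with zeros so that 'MM:SS' becomes [0, MM, SS].
    hours, minutes, seconds = [0] * (3 - len(parts)) + parts
    return timedelta(days=int(days or 0), hours=hours, minutes=minutes, seconds=seconds)


def job_elapsed(job_id: str) -> timedelta:
    """Ask sacct for the elapsed time of a job (hypothetical helper)."""
    out = subprocess.check_output(
        [
            "sacct",
            "-j", str(job_id),
            "--format=Elapsed",
            "--noheader",     # no column header line
            "--parsable2",    # no padding, '|' between fields
            "--allocations",  # one line per job instead of one per step
        ],
        text=True,
    )
    lines = out.splitlines()
    if not lines:
        raise ValueError(f"sacct returned no record for job {job_id}")
    return parse_elapsed(lines[0])
```

With a submitit job this would be called as job_elapsed(job.job_id). Note that this spawns one sacct process per call, so it bypasses the batching and caching that get_info provides.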

Do you think you could contribute that?

gwenzek avatar May 03 '22 09:05 gwenzek

+1. We are using submitit and this would be a good feature to add

cattabiani avatar Apr 11 '23 09:04 cattabiani