stubl
sqstat logic inconsistent
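When summarizing what is currently in use, sqstat reports the wrong number of running jobs and the wrong total number of jobs, although the core usage and node usage figures are correct. In the example below, sqstat shows 1 job (0 running, 1 queued) for part1 while also showing 29 cores and 1 node in use, and squeue lists 2 jobs for the same partition (1 running, 1 pending). Example: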
Partition : Summary of current jobs
======================================================
part1 : 1 jobs ( 0 running , 1 queued )
Partition : Summary of current core usage
===============================================================
part1 : 56 cores ( 29 in use, 27 idle, 0 other )
Partition : Summary of current node usage
===================================================
part1 : 1 nodes ( 1 in use, 0 idle/down )
Slurm reports:
$ squeue -M faculty -p part1
CLUSTER: faculty
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
13959167 part1 Analysis user1 PD 0:00 1 (Resources)
13948810 part1 EMMA mod user2 R 3-01:49:43 1 cpn-m24-13
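For comparison, here is a minimal Python sketch of how the job summary could be cross-checked against squeue. It is not sqstat's actual logic. It assumes the input comes from something like `squeue -h -M faculty -p part1 -o "%P %T"` (partition and long-form state per line), and the function name `summarize_jobs` is hypothetical.

```python
from collections import Counter


def summarize_jobs(squeue_text):
    """Count total, running, and queued jobs per partition.

    Assumes each job line has exactly two fields: PARTITION STATE,
    as produced by the squeue format string "%P %T".
    """
    summary = {}
    for line in squeue_text.splitlines():
        # Multi-cluster queries (-M) may print a "CLUSTER: name" line; skip it.
        if line.startswith("CLUSTER:"):
            continue
        fields = line.split()
        if len(fields) != 2:
            continue
        partition, state = fields
        counts = summary.setdefault(partition, Counter())
        counts["total"] += 1
        if state == "RUNNING":
            counts["running"] += 1
        elif state == "PENDING":
            counts["queued"] += 1
    return summary


# Sample input mirroring the two jobs in the squeue output above.
sample = "CLUSTER: faculty\npart1 PENDING\npart1 RUNNING\n"
for partition, c in summarize_jobs(sample).items():
    print(f"{partition} : {c['total']} jobs ( {c['running']} running , {c['queued']} queued )")
```

With the two jobs above, this counts 2 jobs for part1 (1 running, 1 queued), which is what the sqstat job summary would be expected to show.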