Negative complete jobs in qstat output

Open s-andrews opened this issue 8 years ago • 13 comments

======================================================================
 Cluster Flow Pipeline: samtools_sort_index
 Submitted:             20 minutes, 5 seconds ago
 Working Directory:     /bi/group/bioinf/Rachael_Huntly/Cufflinks_Analysis/Rachel_0_vs_8_hour
 Cluster Flow ID:       samtools_sort_index_1485260139
 Submitted Jobs:        17
 Running Jobs:          8
 Queued Jobs:           11 (resources)
 Completed Jobs:        -2 (-11%)
======================================================================

 - samtools_sort_index                             [4 cores]
      - email_run_complete
      - email_run_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete
           - email_pipeline_complete

 - samtools_sort_index                             [4 cores]
      - email_run_complete
      - email_run_complete

s-andrews avatar Jan 24 '17 12:01 s-andrews

Is this always the case? Or only occasionally?

The code that does this parses how many jobs were submitted from the initial log file, then subtracts the number of running / pending jobs etc. I guess I could easily add a check that this number is ≥ 0 (and make it 0 if not), but it would be better to figure out why it's able to get a negative number..
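
To make the arithmetic concrete, here is a rough Python sketch of the calculation described above. It is not the actual Cluster Flow code; the function name `completed_jobs` and the `clamp` flag are made up for illustration, and truncating the percentage toward zero is an assumption chosen because it reproduces the figures shown in the status output.

```python
def completed_jobs(submitted, running, queued, clamp=False):
    """Return (completed, percent) derived from the other three counts.

    Hypothetical helper: completed jobs are never counted directly,
    only inferred as submitted - running - queued.
    """
    completed = submitted - running - queued
    if clamp:
        # The "easy" check: never report fewer than zero completed jobs.
        completed = max(0, completed)
    # Assumption: the percentage is truncated toward zero, not rounded.
    percent = int(100 * completed / submitted) if submitted else 0
    return completed, percent


# Figures from the report at the top of this issue.
print(completed_jobs(17, 8, 11))
print(completed_jobs(17, 8, 11, clamp=True))
```

With 17 submitted, 8 running and 11 queued, running plus queued (19) already exceeds the submitted count, so the result can only go negative. Clamping would hide the symptom, but one of the three input counts would still be wrong.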

Phil

ewels avatar Mar 01 '17 12:03 ewels

@s-andrews / @FelixKrueger - if one of you could send me the CF submission log for a run where this has happened I'll take a look. I think it must be a case that the number of jobs submitted isn't being counted properly.

ewels avatar Mar 03 '17 12:03 ewels

submission log:

Cluster Flow Pipeline: bismark_singlecell
Submitted:             7 minutes, 2 seconds ago
Working Directory:     /path/to/dir
Cluster Flow ID:       bismark_singlecell_1488545447
Submitted Jobs:        902
Running Jobs:          75
Queued Jobs:           1102 (resources)
Completed Jobs:        -275 (-30%)

Hmm, strange. I agree that it looks like there were 902 jobs submitted there. So it must be over-counting the queued jobs somehow.

Ok, next up - could you do a cf --qstat to get the above log followed by a normal qstat so that I can try to figure out why it thinks that there are so many pipeline jobs queued please..

ewels avatar Mar 03 '17 13:03 ewels

Also - I didn't actually explicitly say this myself, but it works fine for me 😁 That's why I'm asking you guys to do stuff.

Two more questions:

  1. Does it always do this, or only sometimes?
  2. Why are you running v0.4_dev? v0.4 is the latest released version and v0.5_dev is the most recent development version 😉

Phil

ewels avatar Mar 03 '17 13:03 ewels

Are you sure you want this? ^^ CF_qstat.txt qstat.txt

FelixKrueger avatar Mar 03 '17 13:03 FelixKrueger

Ah, no good - everything is fine in CF_qstat.txt, looks like the correct number of running and queued jobs, no negative Completed Jobs number..

ewels avatar Mar 03 '17 14:03 ewels

..spoke too soon, there are a lot of different pipeline runs in this file it would seem...!!!

ewels avatar Mar 03 '17 14:03 ewels

I see this:

 Cluster Flow ID:       bismark_singlecell_1488545447
 Submitted Jobs:        902
 Running Jobs:          77
 Queued Jobs:           1095 (resources)
 Completed Jobs:        -270 (-29%)
======================================================================

 - bismark_align                                   [4 cores]  [queued, priority 0]
      - bismark_deduplicate
           - bismark_methXtract
                - bismark_report

FelixKrueger avatar Mar 03 '17 14:03 FelixKrueger

yes sorry, it's not like I have nothing to do... :)

FelixKrueger avatar Mar 03 '17 14:03 FelixKrueger

Ah, I need longer qstat output though. The default trims the full job name, I forgot that. Can you instead do qstat -pri -r -xml please?
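
For anyone who wants to cross-check the counts themselves, something like the following Python sketch could tally jobs per state from that XML output. It is only an illustration, not how Cluster Flow does it: the function name `count_jobs_by_state` is made up, and the element and attribute names (`job_list`, its `state` attribute, `full_job_name`, `JB_name`) are assumptions about the Grid Engine XML layout that should be checked against the real file. It also assumes the Cluster Flow ID appears somewhere in the job name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_jobs_by_state(qstat_xml, cf_id):
    """Count jobs whose name contains cf_id, grouped by job state.

    Assumes each job is a <job_list state="..."> element with a
    <full_job_name> (or, failing that, <JB_name>) child.
    """
    root = ET.fromstring(qstat_xml)
    counts = Counter()
    for job in root.iter("job_list"):
        # Prefer the untrimmed name; fall back to the short one.
        name = job.findtext("full_job_name") or job.findtext("JB_name") or ""
        if cf_id in name:
            counts[job.get("state", "unknown")] += 1
    return counts
```

Comparing these totals against the Running Jobs and Queued Jobs lines for the same Cluster Flow ID would show which of the two counts is inflated.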

ewels avatar Mar 03 '17 14:03 ewels

Here you go: qstat.txt

FelixKrueger avatar Mar 03 '17 14:03 FelixKrueger

Yay, 75114 lines of xml for me to read through. Such a lucky boy! 🥇

ewels avatar Mar 03 '17 14:03 ewels