flux-core
job-list: support queue specific stats
Per #4604, get queue-specific job stats. In flux-jobs, the --queue option can now also select which queue --stats reports on.
re-pushed, fixed up && chain in tests, valgrind memleak find, and python linting
so something's a little off there.
hmmm. maybe a bug related to restarting of a flux instance, which I may have not added a test for. Thanks!
Edit: no not an errant bug ... I completely missed it!! Need to do it and cover with a test
re-pushed, fixing up the issues @garlick found above
fixed queue specific stats on reload of the job-list module and added some tests & extra coverage for it
However, I was not able to figure out how
0 running, -1 completed, 1 failed, 0 pending
occurred though. It's perplexing. Basically in the python code the number of completed jobs is:
self.successful = self.inactive - self.failed
suggesting the inactive job count was less than the failed job count to get the -1 output. I can't see how that could happen.
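To make the arithmetic concrete, here is a minimal sketch of that derivation. `StatsSketch` is a hypothetical stand-in, not the real class in the Python bindings; only the subtraction quoted above is modeled, and the counter values are made up to reproduce the -1.

```python
from dataclasses import dataclass


@dataclass
class StatsSketch:
    """Hypothetical stand-in for the stats object; only the derived count is modeled."""

    inactive: int  # total inactive jobs as reported by job-list
    failed: int    # failed jobs as reported by job-list

    @property
    def successful(self) -> int:
        # Same derivation as the line quoted above.
        return self.inactive - self.failed


# Consistent counters: the single inactive job is the failed one.
print(StatsSketch(inactive=1, failed=1).successful)

# Inactive under-counted by one relative to failed: the result goes negative.
print(StatsSketch(inactive=0, failed=1).successful)
```

The second case is the only shape of input that yields "-1 completed, 1 failed" under this formula, so the negative value points at the two counters disagreeing rather than at the subtraction itself.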
My only guess is it could have been a counting issue with a job that was active when the job-list module was reloaded and one of the counters may have been under-counted. But I don't think that's what happened here.
This might be unrelated to this PR. Watch this:
garlick@picl0:~$ flux jobs --stats-only
0 running, 87 completed, 13 failed, 0 pending
garlick@picl0:~$ flux mini run -q debug /bin/false
flux-job: task(s) exited with exit code 1
garlick@picl0:~$ flux jobs --stats-only
0 running, 86 completed, 14 failed, 0 pending
Somehow the completed count dropped. This is on master right after #4687 was merged.
Edit: also reproduces on current master:
garlick@picl0:~$ flux version
commands: 0.44.0-113-ge549e5afd
libflux-core: 0.44.0-113-ge549e5afd
libflux-security: 0.8.0-2-g8da4e73
build-options: +systemd+hwloc==2.4.0+zmq==4.3.4
garlick@picl0:~$ flux jobs --stats-only
0 running, 86 completed, 14 failed, 0 pending
garlick@picl0:~$ flux mini run -q debug /bin/false
flux-job: task(s) exited with exit code 1
garlick@picl0:~$ flux jobs --stats-only
0 running, 85 completed, 15 failed, 0 pending
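As a rough cross-check of the outputs above: assuming the completed figure is derived client-side as inactive minus failed, adding completed and failed back together recovers the inactive count the client saw. The sketch below does that for the four `--stats-only` lines in this comment; the regex and the helper name `implied_inactive` are illustrative only.

```python
import re

# Lines copied from the `flux jobs --stats-only` output above.
SAMPLES = [
    "0 running, 87 completed, 13 failed, 0 pending",
    "0 running, 86 completed, 14 failed, 0 pending",
    "0 running, 86 completed, 14 failed, 0 pending",
    "0 running, 85 completed, 15 failed, 0 pending",
]

PATTERN = re.compile(
    r"(\d+) running, (-?\d+) completed, (\d+) failed, (\d+) pending"
)


def implied_inactive(line: str) -> int:
    """Return completed + failed, i.e. the inactive count implied by the line."""
    match = PATTERN.fullmatch(line)
    if match is None:
        raise ValueError(f"unrecognized stats line: {line!r}")
    _running, completed, failed, _pending = map(int, match.groups())
    return completed + failed


totals = [implied_inactive(line) for line in SAMPLES]
print(totals)
```

Every line sums to 100, which would be consistent with the failed counter advancing on each new failed job while the inactive count stays put. The transcript alone does not say why that would happen.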
Codecov Report
Merging #4684 (ad1193a) into master (cb4d814) will decrease coverage by 0.02%. The diff coverage is n/a.
:exclamation: Current head ad1193a differs from pull request most recent head b0a6919. Consider uploading reports for the commit b0a6919 to get more accurate results
@@ Coverage Diff @@
## master #4684 +/- ##
==========================================
- Coverage 83.37% 83.35% -0.03%
==========================================
Files 413 413
Lines 69781 69664 -117
==========================================
- Hits 58182 58070 -112
+ Misses 11599 11594 -5
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/common/libutil/ipaddr.c | 64.70% <0.00%> (-4.82%) | :arrow_down: |
| src/bindings/python/flux/uri/resolvers/lsf.py | 87.50% <0.00%> (-2.09%) | :arrow_down: |
| src/cmd/flux-jobs.py | 95.52% <0.00%> (-1.46%) | :arrow_down: |
| src/modules/job-info/guest_watch.c | 76.75% <0.00%> (-0.55%) | :arrow_down: |
| src/common/libsubprocess/server.c | 60.54% <0.00%> (-0.55%) | :arrow_down: |
| src/broker/overlay.c | 85.58% <0.00%> (-0.41%) | :arrow_down: |
| src/bindings/python/flux/util.py | 94.40% <0.00%> (-0.39%) | :arrow_down: |
| src/modules/job-manager/submit.c | 81.37% <0.00%> (-0.19%) | :arrow_down: |
| src/bindings/python/flux/resource/Rlist.py | 94.68% <0.00%> (-0.17%) | :arrow_down: |
| src/bindings/python/flux/job/Jobspec.py | 83.98% <0.00%> (-0.07%) | :arrow_down: |
| ... and 10 more | | |