flux-core
t2492-shell-lost.sh: job gets SIGINT too early
Saw this one in CI. It appears that SIGINT is being sent to the job while at least one task is still importing modules: the tracebacks in the log end in KeyboardInterrupt inside "import flux".
I'm not sure how that happens, since there's a barrier call in there, so something unexpected must be happening.
2024-03-15T21:46:18.6648215Z t2492: Sending SIGINT to fgVXsPm. Job should now exit
2024-03-15T21:46:18.6648712Z 0.000s: job.submit {"userid":1001,"urgency":16,"flags":0,"version":1}
2024-03-15T21:46:18.6649136Z 0.012s: job.validate
2024-03-15T21:46:18.6649385Z 0.025s: job.depend
2024-03-15T21:46:18.6649638Z 0.025s: job.priority {"priority":16}
2024-03-15T21:46:18.6650190Z 0.031s: job.alloc {"annotations":{"sched":{"resource_summary":"rank[0-3]/core0"}}}
2024-03-15T21:46:18.6650663Z 0.061s: job.start
2024-03-15T21:46:18.6650985Z flux-job: task(s) exited with exit code 130
2024-03-15T21:46:18.6651334Z 0.482s: job.finish {"status":33280}
2024-03-15T21:46:18.6651630Z 0.032s: exec.init
2024-03-15T21:46:18.6651864Z 0.034s: exec.starting
2024-03-15T21:46:18.6652353Z 0.157s: exec.shell.init {"service":"1001-shell-fgVXsPm","leader-rank":0,"size":4}
2024-03-15T21:46:18.6652962Z 0.176s: exec.shell.start {"taskmap":{"version":1,"map":[[0,4,1,1]]}}
2024-03-15T21:46:18.6653808Z 0.474s: exec.shell.task-exit {"localid":0,"rank":1,"state":"Exited","pid":242340,"wait_status":2,"signaled":2,"exitcode":130}
2024-03-15T21:46:18.6654454Z 0.482s: exec.complete {"status":33280}
2024-03-15T21:46:18.6654766Z 0.482s: exec.done
2024-03-15T21:46:18.6655372Z 0.395s: flux-shell[0]: WARN: exception: exception.c:49: shell rank 3 (on fv-az1492-456): Killed
2024-03-15T21:46:18.6655932Z Traceback (most recent call last):
2024-03-15T21:46:18.6656472Z File "/tmp/flux-TKzLPm/jobtmp-0-fgVXsPm/critical.py", line 4, in <module>
2024-03-15T21:46:18.6656939Z import flux
2024-03-15T21:46:18.6657343Z File "/usr/src/src/bindings/python/flux/__init__.py", line 14, in <module>
2024-03-15T21:46:18.6657835Z import flux.core.handle
2024-03-15T21:46:18.6658312Z File "/usr/src/src/bindings/python/flux/core/handle.py", line 19, in <module>
2024-03-15T21:46:18.6658834Z from flux.future import Future
2024-03-15T21:46:18.6659301Z File "/usr/src/src/bindings/python/flux/future.py", line 16, in <module>
2024-03-15T21:46:18.6659865Z from flux.util import check_future_error, interruptible
2024-03-15T21:46:18.6660418Z File "/usr/src/src/bindings/python/flux/util.py", line 43, in <module>
2024-03-15T21:46:18.6660946Z from flux.utils.parsedatetime import Calendar
2024-03-15T21:46:18.6661587Z File "/usr/src/src/bindings/python/flux/utils/parsedatetime/__init__.py", line 69, in <module>
2024-03-15T21:46:18.6662280Z pdtLocales = dict([(x, load_locale(x)) for x in _locales])
2024-03-15T21:46:18.6662722Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-15T21:46:18.6663367Z File "/usr/src/src/bindings/python/flux/utils/parsedatetime/__init__.py", line 69, in <listcomp>
2024-03-15T21:46:18.6664058Z pdtLocales = dict([(x, load_locale(x)) for x in _locales])
2024-03-15T21:46:18.6664502Z ^^^^^^^^^^^^^^
2024-03-15T21:46:18.6665874Z File "/usr/src/src/bindings/python/flux/utils/parsedatetime/pdt_locales/__init__.py", line 28, in load_locale
2024-03-15T21:46:18.6667082Z mod = __import__(__name__, fromlist=[locale], level=0)
2024-03-15T21:46:18.6667755Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-15T21:46:18.6668306Z KeyboardInterrupt
2024-03-15T21:46:18.6668735Z Traceback (most recent call last):
2024-03-15T21:46:18.6669699Z File "/tmp/flux-TKzLPm/jobtmp-1-fgVXsPm/critical.py", line 4, in <module>
2024-03-15T21:46:18.6670485Z import flux
2024-03-15T21:46:18.6671167Z File "/usr/src/src/bindings/python/flux/__init__.py", line 14, in <module>
2024-03-15T21:46:18.6671994Z import flux.core.handle
2024-03-15T21:46:18.6672772Z File "/usr/src/src/bindings/python/flux/core/handle.py", line 19, in <module>
2024-03-15T21:46:18.6673451Z from flux.future import Future
2024-03-15T21:46:18.6673928Z File "/usr/src/src/bindings/python/flux/future.py", line 16, in <module>
2024-03-15T21:46:18.6674505Z from flux.util import check_future_error, interruptible
2024-03-15T21:46:18.6675064Z File "/usr/src/src/bindings/python/flux/util.py", line 43, in <module>
2024-03-15T21:46:18.6675583Z from flux.utils.parsedatetime import Calendar
2024-03-15T21:46:18.6676228Z File "/usr/src/src/bindings/python/flux/utils/parsedatetime/__init__.py", line 69, in <module>
2024-03-15T21:46:18.6676917Z pdtLocales = dict([(x, load_locale(x)) for x in _locales])
2024-03-15T21:46:18.6677351Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-15T21:46:18.6677981Z File "/usr/src/src/bindings/python/flux/utils/parsedatetime/__init__.py", line 69, in <listcomp>
2024-03-15T21:46:18.6678670Z pdtLocales = dict([(x, load_locale(x)) for x in _locales])
2024-03-15T21:46:18.6679077Z ^^^^^^^^^^^^^^
2024-03-15T21:46:18.6679760Z File "/usr/src/src/bindings/python/flux/utils/parsedatetime/pdt_locales/__init__.py", line 28, in load_locale
2024-03-15T21:46:18.6680506Z mod = __import__(__name__, fromlist=[locale], level=0)
2024-03-15T21:46:18.6680912Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-15T21:46:18.6681387Z File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
2024-03-15T21:46:18.6681973Z File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
2024-03-15T21:46:18.6682548Z File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
2024-03-15T21:46:18.6683261Z File "<frozen importlib._bootstrap_external>", line 936, in exec_module
2024-03-15T21:46:18.6683849Z File "<frozen importlib._bootstrap_external>", line 1032, in get_code
2024-03-15T21:46:18.6684466Z File "<frozen importlib._bootstrap_external>", line 1131, in get_data
2024-03-15T21:46:18.6684889Z KeyboardInterrupt
2024-03-15T21:46:18.6685205Z t2492: Job exited with rc=130 (expecting 137 (128+9))
2024-03-15T21:46:18.6685596Z t2492: Unexpected job exit code 130
2024-03-15T21:46:18.6686289Z Mar 15 21:45:09.222754 broker.err[0]: rc2.0: /usr/src/t/issues/t2492-shell-lost.sh Exited (rc=1) 0.8s
2024-03-15T21:46:18.6686924Z flux-start: 0 (pid 235881) exited with rc=1
2024-03-15T21:46:18.6687583Z Mar 15 21:45:10.892181 broker.err[0]: rc2.0: /usr/src/t/issues/t2492-shell-lost.sh Exited (rc=1) 4.2s
2024-03-15T21:46:18.6688179Z not ok 32 - t2492-shell-lost
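For reference when reading the status fields in the log above, here is a minimal sketch of how the raw values decode, assuming they are ordinary POSIX wait statuses as seen on Linux. The helper name `describe` is made up for illustration and is not part of flux-core.

```python
import os
import signal


def describe(wait_status: int) -> str:
    """Decode a POSIX wait status into a short human-readable string."""
    if os.WIFSIGNALED(wait_status):
        sig = os.WTERMSIG(wait_status)
        # Shells conventionally report death-by-signal as 128 + signal number.
        return f"killed by {signal.Signals(sig).name}, shell-style rc={128 + sig}"
    if os.WIFEXITED(wait_status):
        return f"exited with rc={os.WEXITSTATUS(wait_status)}"
    return "neither exited nor signaled"


# Values copied from the log: "wait_status":2 and {"status":33280}.
print(describe(2))
print(describe(33280))
# What the test expected instead: 128 + SIGKILL.
print(128 + signal.SIGKILL)
```

Under that assumption, the task's wait_status of 2 is death by SIGINT (rc 130), the job status 33280 is a normal exit with code 130, and the 137 the test was waiting for would correspond to SIGKILL.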