HigherHRNet-Human-Pose-Estimation

Exception: process 1 terminated with signal SIGKILL

Open mdk19015 opened this issue 5 years ago • 0 comments

I was training using the w32_512....yaml file when this error showed up. Can anyone help me understand why it occurred and how to solve it? I ran valid.py using the model saved from the checkpoint, and it ran without any error.

Error:

Epoch: [19][300/7922] Time: 0.630s (0.682s) Speed: 12.7 samples/s Data: 0.000s (0.173s) Stage0-heatmaps: 1.333e-04 (2.700e-04) Stage1-heatmaps: 5.821e-05 (1.006e-04) Stage0-push: 1.250e-04 (2.166e-04) Stage1-push: 0.000e+00 (0.000e+00) Stage0-pull: 5.478e-10 (1.137e-09) Stage1-pull: 0.000e+00 (0.000e+00)
Traceback (most recent call last):
  File "tools/dist_train.py", line 425, in <module>
    main()
  File "tools/dist_train.py", line 114, in main
    args=(ngpus_per_node, args, final_output_dir, tb_log_dir)
  File "/home/hiperdyne/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/hiperdyne/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 107, in join
    (error_index, name)
Exception: process 1 terminated with signal SIGKILL
/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 20 leaked semaphores to clean up at shutdown
  len(cache))
/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 20 leaked semaphores to clean up at shutdown
  len(cache))
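For context on the last line of the traceback: the message says that worker process 1 was ended from outside by a signal, which is why no Python exception from the worker itself appears. On Linux, a common source of SIGKILL is the kernel's out-of-memory killer, though nothing in this log confirms that. Below is a minimal, stdlib-only sketch of how a parent process can tell that a child died from a signal. It is not the PyTorch code, it assumes a POSIX system, and `describe_exit` and `kill_child_and_describe` are hypothetical names used only for illustration.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a child's return code into a short human-readable message."""
    if returncode < 0:
        # On POSIX, a negative return code -N means the child was ended by signal N.
        return "terminated with signal " + signal.Signals(-returncode).name
    if returncode == 0:
        return "exited normally"
    return "exited with code %d" % returncode


def kill_child_and_describe():
    """Start a sleeping child, send it SIGKILL, and describe how it ended."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.kill()  # on POSIX this sends SIGKILL, which the child cannot catch
    child.wait()
    return describe_exit(child.returncode)
```

Calling `kill_child_and_describe()` reproduces the same kind of message as in the traceback, with the parent reporting only the signal and not the reason it was sent.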

mdk19015 Jun 18 '20 04:06