TransformerSum

After extractive training, a process on one GPU won't terminate automatically.

Open PolarisRisingWar opened this issue 2 years ago • 0 comments

Extractive training has finished, and I got the checkpoint whose name contains tmp_end. However, one leftover process still occupies one of my GPUs and keeps running without producing any output, so I have to kill it manually.

I found that this process was launched by the following command: `python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=24, pipe_handle=317) --multiprocessing-fork`

What could be causing this?
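For anyone hitting the same thing, here is a minimal sketch of a possible workaround, not a confirmed fix. It assumes the leftover process is a `multiprocessing` child of the training script (the `spawn_main` command line suggests this, but I have not verified it) and that the cleanup is called at the very end of the script, before the parent exits. The function name `terminate_leftover_children` is hypothetical and is not part of TransformerSum.

```python
import multiprocessing as mp


def terminate_leftover_children(timeout=5.0):
    """Stop any multiprocessing children of this process that are still alive.

    Hypothetical helper (not part of TransformerSum). It only sees children
    that the current process started through the multiprocessing module.
    Returns the PIDs of the children it tried to stop.
    """
    stopped = []
    for child in mp.active_children():
        stopped.append(child.pid)
        child.terminate()          # polite request first (SIGTERM on Unix)
        child.join(timeout)        # wait up to `timeout` seconds for exit
        if child.is_alive():
            child.kill()           # force-kill if it ignored the request
            child.join()
    return stopped


# Intended usage, at the very end of the training script:
#     leftover = terminate_leftover_children()
#     print("stopped leftover child PIDs:", leftover)
```

This only hides the symptom. It does not explain why the child stays alive after training ends, which is what I would still like to understand.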

PolarisRisingWar, Mar 02 '22 15:03