
Local run fails: does this code have to be run on the High-Flyer (幻方) cluster?

Open jialiangZ opened this issue 1 year ago • 0 comments

$ python train_fourcastnet.py --pretrain-epochs 10 --fintune-epochs 4 --batch-size 1

Error:

非集群环境
非集群环境
Traceback (most recent call last):
  File "train_fourcastnet.py", line 207, in <module>
    hfai.multiprocessing.spawn(main, args=(
  File "/home/pineapple/mambaforge/envs/OpenCast/lib/python3.8/site-packages/hfai/multiprocessing/spawn.py", line 66, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn', bind_numa=bind_numa)
  File "/home/pineapple/mambaforge/envs/OpenCast/lib/python3.8/site-packages/hfai/multiprocessing/spawn.py", line 37, in start_processes
    while not context.join():
  File "/home/pineapple/mambaforge/envs/OpenCast/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 130, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGSEGV

jialiangZ, Oct 20 '23 11:10