
RuntimeError: open(/process_group_sync.lock): Permission denied

Open • bullshit123123 opened this issue 2 years ago • 1 comment

First of all, thank you for your work! When I ran the training code on my own data, I hit the error below and have not been able to resolve it. I would appreciate any suggestions. The command and its full output are as follows:

CUDA_VISIBLE_DEVICES=2 python train_con.py --curriculum=carla --output_dir=PATH_TO_OUTPUT --dataset_dir=dataset/exp0/train/.png --encoder_type='CCS' --recon_lambda=5 --ssim_lambda=1 --vgg_lambda=1 --pos_lambda_gen=15 --lambda_e_latent=1 --lambda_e_pos=1 --cond_lambda=1 --load_encoder=1

Namespace(cond_lambda=1.0, curriculum='carla', dataset_dir='dataset/exp0/train/.png', ema=1, encoder_type='CCS', eval_freq=5000, lambda_e_latent=1.0, lambda_e_pos=1.0, load_dir='', load_encoder=1, model_save_interval=200, n_epochs=3000, output_dir='PATH_TO_OUTPUT', port='12354', pos_lambda_gen=15.0, pretrained_dir='', recon_lambda=5.0, sample_interval=1000, set_step=None, sn=0, ssim_lambda=1.0, vgg_lambda=1.0, wandb_entity='', wandb_name='', wandb_project='')
Lock not found
terminate called after throwing an instance of 'std::system_error'
  what(): open(/process_group_sync.lock): Permission denied
Traceback (most recent call last):
  File "train_con.py", line 686, in <module>
    mp.spawn(train, args=(num_gpus, opt), nprocs=num_gpus, join=True)
  File "/home/nn/anaconda3/envs/nerf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/nn/anaconda3/envs/nerf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
    while not context.join():
  File "/home/nn/anaconda3/envs/nerf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/nn/anaconda3/envs/nerf/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/nn/code/Pix2NeRF/train_con.py", line 87, in train
    setup(rank, world_size, opt.port, opt.output_dir)
  File "/home/nn/code/Pix2NeRF/train_con.py", line 48, in setup
    dist.init_process_group('gloo', init_method=file_lock, rank=rank, world_size=world_size)
  File "/home/nn/anaconda3/envs/nerf/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 433, in init_process_group
    timeout=timeout)
  File "/home/nn/anaconda3/envs/nerf/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 508, in _new_process_group_helper
    timeout=timeout)
RuntimeError: open(/process_group_sync.lock): Permission denied

bullshit123123 • Feb 17 '23 14:02
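
The path in the error, /process_group_sync.lock, sits at the filesystem root, which an ordinary user normally cannot write to. One possible reading is that the directory part of the lock path ended up empty when `file_lock` was built in `setup`. The sketch below is not the repository's code. It only illustrates one way to build a file-based `init_method` URL from a directory that is checked to be writable. The helper name `make_file_init_method` and its parameters are hypothetical, and the fallback to the system temp directory is an assumption.

```python
import os
import tempfile
from pathlib import Path


def make_file_init_method(sync_dir=None, name="process_group_sync.lock"):
    """Build a file:// URL for file-based rendezvous in a writable directory.

    Hypothetical helper: `sync_dir` and `name` are illustrative parameters,
    not options defined by Pix2NeRF or PyTorch.
    """
    # Fall back to the system temp directory when no directory is given,
    # so the lock file never resolves to the filesystem root.
    base = Path(sync_dir) if sync_dir else Path(tempfile.gettempdir())
    base = base.expanduser().resolve()
    base.mkdir(parents=True, exist_ok=True)

    # Fail early with a readable message instead of a C++ std::system_error.
    if not os.access(base, os.W_OK):
        raise PermissionError(f"{base} is not writable by the current user")

    # as_uri() needs an absolute path, which resolve() guarantees.
    return (base / name).as_uri()


if __name__ == "__main__":
    print(make_file_init_method())
```

The returned string could then be passed as `init_method` in the `dist.init_process_group('gloo', init_method=..., rank=rank, world_size=world_size)` call shown in the traceback. Every rank must receive the same URL, so it is safest to compute it once in the parent process before `mp.spawn`.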