lightning-hydra-template

cannot use "trainer=ddp"

Open a4152684 opened this issue 1 year ago • 0 comments

Training works fine with "trainer=gpu", but when I switch to "trainer=ddp" it fails with the traceback below. Could you please help me?

Traceback (most recent call last):
  File "/home/lcbryant/cz_nerf/p_nerf/src/utils/utils.py", line 38, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "/home/lcbryant/cz_nerf/p_nerf/train.py", line 78, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=cfg.get("ckpt_path"))
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 582, in fit
    call._call_and_handle_interrupt(
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/site-packages/pytorch_lightning/strategies/launchers/multiprocessing.py", line 113, in launch
    mp.start_processes(
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 189, in start_processes
    process.start()
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/home/lcbryant/anaconda3/envs/cz_nerf/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_embedder.<locals>.<lambda>'

a4152684, Apr 17 '23 06:04
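A possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the frames go through `launchers/multiprocessing.py` and `popen_spawn_posix.py`, so the DDP run appears to start its workers with the `spawn` method. With `spawn`, the object handed to each worker is pickled, and pickle cannot serialize a function or lambda defined inside another function, which is what the final `AttributeError` about a local object in `get_embedder` reports. The single-GPU run never pickles anything, which would explain why "trainer=gpu" works. The example below reproduces the failure with the standard library only and shows one common workaround. All names in it (`get_embedder_local`, `_embed`, `get_embedder_picklable`, `is_picklable`) are hypothetical stand-ins, not the code from the issue.

```python
import pickle
from functools import partial


def get_embedder_local(scale):
    # Hypothetical stand-in for a get_embedder that returns a lambda
    # defined inside the function. Pickle stores functions by qualified
    # name, and a local lambda has no importable name, so it cannot be
    # pickled.
    return lambda x: x * scale


def _embed(x, scale):
    # Module-level function: pickle can reference it by name.
    return x * scale


def get_embedder_picklable(scale):
    # A functools.partial over a module-level function is picklable,
    # so it survives being sent to a spawned worker process.
    return partial(_embed, scale=scale)


def is_picklable(obj):
    # Returns True if pickle.dumps succeeds. The exception type for
    # local objects differs between Python versions, so several are caught.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(is_picklable(get_embedder_local(2.0)))
    print(is_picklable(get_embedder_picklable(2.0)))
```

If this is indeed the cause, two directions may be worth trying: move the lambda inside `get_embedder` to a module-level function or a small class with a `__call__` method, or check whether the trainer config selects a spawn-based strategy such as `ddp_spawn` and switch it to `strategy: ddp`, which launches workers as separate script processes and does not pickle the model.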