nerf-factory
Error when running: possible version mismatch with pytorch_lightning
Hi,
I am having trouble running this project. I suspect it's a version issue with pytorch_lightning.
Here's the output:
> python3 -m run --ginc configs/nerf/blender.gin
Traceback (most recent call last):
File "!/anaconda3/envs/nerf_factory/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "~/anaconda3/envs/nerf_factory/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "~/nerf-factory/run.py", line 23, in <module>
from pytorch_lightning.plugins import DDPPlugin
ImportError: cannot import name 'DDPPlugin' from 'pytorch_lightning.plugins' (~/anaconda3/envs/nerf_factory/lib/python3.8/site-packages/pytorch_lightning/plugins/__init__.py)
System:
I'm running on a Mac Studio with macOS Ventura 13.3.1.
Here are the versions installed by pip install -r requirements.txt:
> pip3 install -r requirements.txt
...
Successfully built torch-scatter torch-efficient-distloss pathtools
Installing collected packages: multidict, frozenlist, yarl, smmap, attrs, async-timeout, aiosignal, soupsieve, packaging, gitdb, fsspec, aiohttp, tqdm, torchmetrics, tifffile, setproctitle, sentry-sdk, scipy, PyYAML, PyWavelets, psutil, protobuf, pathtools, networkx, lightning-utilities, lazy-loader, imageio, GitPython, filelock, docker-pycreds, Click, beautifulsoup4, appdirs, wandb, torch-scatter, torch-efficient-distloss, scikit-image, pytorch-lightning, piqa, opencv-python, ninja, imageio-ffmpeg, gin-config, gdown, functorch, configargparse
Successfully installed Click-8.1.3 GitPython-3.1.31 PyWavelets-1.4.1 PyYAML-6.0 aiohttp-3.8.4 aiosignal-1.3.1 appdirs-1.4.4 async-timeout-4.0.2 attrs-22.2.0 beautifulsoup4-4.12.2 configargparse-1.5.3 docker-pycreds-0.4.0 filelock-3.11.0 frozenlist-1.3.3 fsspec-2023.4.0 functorch-0.1.1 gdown-4.7.1 gin-config-0.5.0 gitdb-4.0.10 imageio-2.27.0 imageio-ffmpeg-0.4.8 lazy-loader-0.2 lightning-utilities-0.8.0 multidict-6.0.4 networkx-3.1 ninja-1.11.1 opencv-python-4.7.0.72 packaging-23.1 pathtools-0.1.2 piqa-1.2.2 protobuf-4.22.3 psutil-5.9.4 pytorch-lightning-2.0.1.post0 scikit-image-0.20.0 scipy-1.9.1 sentry-sdk-1.19.1 setproctitle-1.3.2 smmap-5.0.0 soupsieve-2.4 tifffile-2023.4.12 torch-efficient-distloss-0.1.3 torch-scatter-2.1.1 torchmetrics-0.11.4 tqdm-4.65.0 wandb-0.14.2 yarl-1.8.2
A similar problem is reported in:
https://github.com/Lightning-AI/lightning/issues/17191
That issue points to this migration guide:
https://lightning.ai/docs/pytorch/stable/upgrade/migration_guide.html
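For anyone who wants to stay on pytorch_lightning 2.x instead of pinning: per the migration guide, DDPPlugin was renamed to DDPStrategy and moved to pytorch_lightning.strategies. A version-tolerant sketch for the import in run.py (assuming the alias is all the script needs; note the Trainer API changed too, e.g. plugins= became strategy=, so this alone may not be enough):

```python
# Sketch: alias the renamed class under the old name so the rest of run.py
# can stay unchanged. DDPStrategy lives in pytorch_lightning.strategies on
# recent releases; the old plugin import still works on 1.x.
try:
    from pytorch_lightning.strategies import DDPStrategy as DDPPlugin
except ImportError:  # older pytorch_lightning, where only the plugin name exists
    from pytorch_lightning.plugins import DDPPlugin
```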
Possible temporary solution
Pin pytorch_lightning==1.9.5 in requirements.txt.
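To apply the pin to an already-built environment without reinstalling everything (assuming pip manages the env):

> pip3 install pytorch_lightning==1.9.5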
I have the same issue!
After I changed to pytorch_lightning==1.9.5, a 'BatchSampler' error occurred. The last part of the error message is as follows.
dataloader = _update_dataloader(dataloader, sampler, mode=mode)
File "/compuworks/anaconda3/envs/nerf_factory/lib/python3.9/site-packages/pytorch_lightning/utilities/data.py", line 157, in _update_dataloader
dl_args, dl_kwargs = _get_dataloader_init_args_and_kwargs(dataloader, sampler, mode)
File "/compuworks/anaconda3/envs/nerf_factory/lib/python3.9/site-packages/pytorch_lightning/utilities/data.py", line 218, in _get_dataloader_init_args_and_kwargs
dl_kwargs.update(_dataloader_init_kwargs_resolve_sampler(dataloader, sampler, mode, disallow_batch_sampler))
File "/compuworks/anaconda3/envs/nerf_factory/lib/python3.9/site-packages/pytorch_lightning/utilities/data.py", line 342, in _dataloader_init_kwargs_resolve_sampler
raise MisconfigurationException(
lightning_fabric.utilities.exceptions.MisconfigurationException: We tried to re-instantiate your custom batch sampler and failed. To mitigate this, either follow the API of `BatchSampler` or instantiate your custom batch sampler inside `*_dataloader` hooks of your module.
In call to configurable 'run' (<function run at 0x7f66310e29d0>)
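In case it helps: the first mitigation in that message means the batch sampler's constructor must match torch's BatchSampler signature (sampler, batch_size, drop_last), because Lightning re-instantiates the batch sampler to inject its own distributed sampler. A minimal sketch with a hypothetical class, not nerf-factory's actual sampler:

```python
from torch.utils.data import BatchSampler, Sampler

# Hypothetical example: keeping the (sampler, batch_size, drop_last) signature
# is what lets pytorch_lightning re-instantiate the batch sampler.
class CompatibleBatchSampler(BatchSampler):
    def __init__(self, sampler: Sampler, batch_size: int, drop_last: bool):
        super().__init__(sampler, batch_size, drop_last)

    def __iter__(self):
        # Custom batching logic would go here; this placeholder just yields
        # the parent's batches unchanged.
        yield from super().__iter__()
```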
Did you solve it?
I encountered the same problem before; after I installed pytorch_lightning==1.6.0, the issue was resolved.
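For anyone trying that route, a one-line check that the failing import from run.py resolves after the downgrade:

> python3 -c "from pytorch_lightning.plugins import DDPPlugin; print('ok')"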