[BUG]: ChunkManager.__init__() takes from 2 to 3 positional arguments but 5 were given
🐛 Describe the bug
Hi, I'm trying to fine-tune Stable Diffusion using the example script in the repo. ChunkManager.__init__() is being called with more positional arguments than it accepts; the call comes from the PyTorch Lightning ColossalAI strategy file (pytorch_lightning/strategies/colossalai.py).
Traceback (most recent call last):
File "main.py", line 808, in <module>
trainer.fit(model, data)
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in fit
call._call_and_handle_interrupt(
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 90, in launch
return function(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 621, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1039, in _run
self.strategy.setup(self)
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/strategies/colossalai.py", line 333, in setup
self.setup_precision_plugin()
File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/strategies/colossalai.py", line 275, in setup_precision_plugin
chunk_manager = ChunkManager(
TypeError: __init__() takes from 2 to 3 positional arguments but 5 were given
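For readers unfamiliar with this error shape, here is a minimal sketch of how it arises when a caller was written against an older constructor signature. FakeChunkManager and its parameter names are hypothetical stand-ins chosen only to reproduce the message; they are not the real ColossalAI signature.

```python
# Hypothetical stand-in, NOT the real colossalai ChunkManager.
# The signature is an assumption: one required and one optional parameter,
# so Python counts "from 2 to 3 positional arguments" including self.
class FakeChunkManager:
    def __init__(self, required_arg, optional_arg=None):
        self.required_arg = required_arg
        self.optional_arg = optional_arg


def call_with_stale_arguments():
    """Call the constructor the way a caller built for an older API might."""
    try:
        # Four explicit positional arguments plus self makes five.
        FakeChunkManager(1024, None, True, "cuda")
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_stale_arguments()
print(message)
```

The message text matches the one in the traceback, which is consistent with the strategy file and the installed ColossalAI build expecting different constructor signatures.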
Environment
CUDA: 11.8
PyTorch: 1.13.0
ColossalAI: built from source
Hi @salmanshah1d,
I believe this issue is the same as #1872.
Hi @1SAA, thanks so much for your response. #1872 suggests installing via pip install colossalai==0.1.10+torch1.11cu11.3 -f https://release.colossalai.org. Does my environment need to match PyTorch 1.11 and CUDA 11.3?
Those versions are fairly old, so I would ideally like to use the latest ones (which I think are PyTorch 1.13 and CUDA 11.8).
Do you have any advice for setting up / resolving those requirements? I currently use the NVIDIA NGC Docker images, but do you have any other suggestions?
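In case it helps anyone compare environments, this is a small sketch for listing the installed versions of the packages involved. It uses only the standard library; the distribution names are the usual PyPI names and are an assumption about how the packages were installed (a source build may report a different version string).

```python
from importlib import metadata


def installed_versions(packages):
    """Return a mapping of distribution name to version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed distribution names; adjust if your install uses different ones.
report = installed_versions(["torch", "pytorch-lightning", "colossalai"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the traceback makes it easier to tell whether the Lightning strategy and the ColossalAI build come from compatible releases.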
I have the same problem, and my CUDA version is 11.6.
@1SAA I have the same problem, and my CUDA version is 11.6.
We have updated a lot. This issue was closed due to inactivity. Thanks.