
RuntimeError: DistributedDataParallel is not needed when a module doesn't have any parameter that requires a gradient.

ThereforeGames opened this issue 3 years ago · 6 comments

Hello,

Very excited to see that a Dreambooth implementation is already available for SD! Great work.

I'm trying to run the training script on a 3090, so I switched "unfreeze_model" from True to False in v1-finetune_unfrozen.yaml (I don't think this GPU has enough VRAM for the unfrozen model). Unfortunately, this yields the following error:

Summoning checkpoint.

Traceback (most recent call last):
  File "main.py", line 831, in <module>
    trainer.fit(model, data)
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1188, in _run
    self._pre_dispatch()
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1223, in _pre_dispatch
    self.accelerator.pre_dispatch(self)
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 136, in pre_dispatch
    self.training_type_plugin.pre_dispatch()
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 394, in pre_dispatch
    self.configure_ddp()
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 371, in configure_ddp
    self._model = self._setup_model(LightningDistributedModule(self.model))
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 189, in _setup_model
    return DistributedDataParallel(module=model, device_ids=self.determine_ddp_device_ids(), **self._ddp_kwargs)
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\torch\nn\parallel\distributed.py", line 477, in __init__
    self._log_and_throw(
  File "T:\programs\anaconda3\envs\textualinversion\lib\site-packages\torch\nn\parallel\distributed.py", line 604, in _log_and_throw
    raise err_type(err_msg)
RuntimeError: DistributedDataParallel is not needed when a module doesn't have any parameter that requires a gradient.

Any advice? Thank you!

ThereforeGames · Sep 06 '22 11:09

Hi,

We need to set unfreeze_model to True because we are optimizing the model itself, unlike in textual inversion where we only optimize the embedding. If you set it to False, then as the error message suggests, there is no parameter to optimize.
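To illustrate the failure outside this repo's code, here is a minimal sketch of the same check in plain PyTorch. The toy model is a generic stand-in, not Dreambooth itself; the comment referencing unfreeze_model just names the config flag from this repo's YAML. DDP refuses at construction time to wrap a module in which no parameter requires a gradient:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel

# Single-process "distributed" setup so DDP can be constructed at all.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

model = torch.nn.Linear(4, 4)  # toy stand-in for the diffusion model
for p in model.parameters():
    p.requires_grad = False  # roughly what unfreeze_model: False amounts to

# Raises: RuntimeError: DistributedDataParallel is not needed when a module
# doesn't have any parameter that requires a gradient.
ddp_model = DistributedDataParallel(model)
```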

However, I admit that this will take more memory than TI. I have two V100 GPUs, each with 32 GB of memory (which is the setting in the TI paper), and on each GPU it seems to take ~26-27 GB, so I am afraid that a single 3090 won't be enough. Unfortunately I have no idea how to optimize that. Maybe you can try switching the training to mixed precision, or use a GPU cloud with at least one GPU with 32 GB of memory.
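As a rough sketch of the mixed-precision route, assuming PyTorch Lightning 1.x (consistent with the traceback above): in this repo the Trainer is assembled inside main.py from the YAML config, so the hand-built Trainer below is a simplified, hypothetical stand-in rather than the repo's actual setup; the practical change would be adding precision: 16 to the trainer settings.

```python
import pytorch_lightning as pl

# Simplified stand-in for what main.py builds from the config.
# precision=16 enables PyTorch's native AMP (autocast + GradScaler), which
# can cut activation memory roughly in half; whether that is enough to fit
# Dreambooth training on a 24 GB 3090 is untested here.
trainer = pl.Trainer(
    gpus=1,
    precision=16,
    max_steps=800,  # illustrative value, not taken from the repo's config
)
# trainer.fit(model, data)  # as in main.py
```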

XavierXiao · Sep 06 '22 16:09

Can't this work like inversion, via embedding files, and require less RAM? What's the advantage here? Hard to tell without any results on subjects that give regular inversion a hard time. For example, a subject like RoboCop, which has a lot of details to learn and then resynthesize: inversion falls flat on it. How would this code fare? Any ideas?

1blackbar · Sep 07 '22 01:09

> We need to set unfreeze_model to True because we are optimizing the model itself, unlike in textual inversion where we only optimize the embedding. If you set it to False, then as the error message suggests, there is no parameter to optimize.
>
> However, I admit that this will take more memory than TI. I have two V100 GPUs, each with 32 GB of memory (which is the setting in the TI paper), and on each GPU it seems to take ~26-27 GB, so I am afraid that a single 3090 won't be enough. Unfortunately I have no idea how to optimize that. Maybe you can try switching the training to mixed precision, or use a GPU cloud with at least one GPU with 32 GB of memory.

I see, thanks for the detailed response! I'll try the mixed precision idea and report back.

> For example, a subject like RoboCop, which has a lot of details to learn and then resynthesize: inversion falls flat on it. How would this code fare? Any ideas?

Check the official DreamBooth page for examples: https://dreambooth.github.io/

I presume this implementation would yield similar results. DreamBooth appears to be better than Textual Inversion at capturing characters, with more variety in pose and facial expression. Textual Inversion, on the other hand, is probably better suited to style transfer.

ThereforeGames · Sep 07 '22 09:09

Well, I read that paper the first day it was released; I even posted on their repo, because it's very impressive that it retains the identity and details of subjects, like fur patterns. So it needs at least 27 GB of VRAM, right? Colab gives 25 GB on Pro; can this repo be edited to squeeze training into 25? For now I'm training embeddings and doing pretty well. Examples with and without the embedding on Van Damme: [images]. I can't do RoboCop though. It works well with faces, and maybe this one would handle more complex subjects, but a trash container is not really the kind of result that would convince me to spend money renting extra GPU RAM, and even that result is not as consistent as the actual DreamBooth results. So... I'd wait for some results from people who can run it.

1blackbar · Sep 07 '22 10:09

@1blackbar is there a link to this colab notebook with dreambooth?

Marcus-Arcadius · Sep 09 '22 03:09

> @1blackbar is there a link to this colab notebook with dreambooth?

Doesn't exist.

TemporalLabsLLC-SOL · Sep 27 '22 07:09