
can't pickle local object in train_dreambooth.py

Open vibber opened this issue 2 years ago • 3 comments

Describe the bug

When trying DreamBooth training, I get this error:

Reproduction

accelerate launch train_dreambooth.py --mixed_precision="fp16" --pretrained_model_name_or_path="stabilityai/stable-diffusion-2-1-base" --train_text_encoder --instance_data_dir="C:\Users\VERTIGO\Documents\my_concept" --class_data_dir="C:\Users\VERTIGO\Documents\classdata" --output_dir="C:\Users\VERTIGO\dreambooth-concept" --with_prior_preservation --prior_loss_weight=1.0 --instance_prompt="a graphical illustration by oritoordream artist" --class_prompt="a graphical illustration by an artist" --resolution=512 --train_batch_size=1 --gradient_accumulation_steps=1 --learning_rate=1e-6 --lr_scheduler="constant" --lr_warmup_steps=0 --num_class_images=200 --max_train_steps=2000

Logs

Steps:   0%|                                                                                  | 0/2000 [00:00<?, ?it/s]╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ <string>:1 in <module>                                                                           │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\spawn.py:116 in spawn_main   │
│                                                                                                  │
│   113 │   │   resource_tracker._resource_tracker._fd = tracker_fd                                │
│   114 │   │   fd = pipe_handle                                                                   │
│   115 │   │   parent_sentinel = os.dup(pipe_handle)                                              │
│ ❱ 116 │   exitcode = _main(fd, parent_sentinel)                                                  │
│   117 │   sys.exit(exitcode)                                                                     │
│   118                                                                                            │
│   119                                                                                            │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\spawn.py:126 in _main        │
│                                                                                                  │
│   123 │   │   try:                                                                               │
│   124 │   │   │   preparation_data = reduction.pickle.load(from_parent)                          │
│   125 │   │   │   prepare(preparation_data)                                                      │
│ ❱ 126 │   │   │   self = reduction.pickle.load(from_parent)                                      │
│   127 │   │   finally:                                                                           │
│   128 │   │   │   del process.current_process()._inheriting                                      │
│   129 │   return self._bootstrap(parent_sentinel)                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
EOFError: Ran out of input
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\VERTIGO\diffusers\examples\dreambooth\train_dreambooth.py:770 in <module>               │
│                                                                                                  │
│   767                                                                                            │
│   768 if __name__ == "__main__":                                                                 │
│   769 │   args = parse_args()                                                                    │
│ ❱ 770 │   main(args)                                                                             │
│   771                                                                                            │
│                                                                                                  │
│ C:\Users\VERTIGO\diffusers\examples\dreambooth\train_dreambooth.py:667 in main                   │
│                                                                                                  │
│   664 │   │   unet.train()                                                                       │
│   665 │   │   if args.train_text_encoder:                                                        │
│   666 │   │   │   text_encoder.train()                                                           │
│ ❱ 667 │   │   for step, batch in enumerate(train_dataloader):                                    │
│   668 │   │   │   # Skip steps until we reach the resumed step                                   │
│   669 │   │   │   if args.resume_from_checkpoint and epoch == first_epoch and step < resume_st   │
│   670 │   │   │   │   if step % args.gradient_accumulation_steps == 0:                           │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\accelerate\data_loader.py:372  │
│ in __iter__                                                                                      │
│                                                                                                  │
│   369 │   │   with suppress(Exception):                                                          │
│   370 │   │   │   length = getattr(self.dataset, "total_dataset_length", len(self.dataset))      │
│   371 │   │   │   self.gradient_state._set_remainder(length % self.total_batch_size)             │
│ ❱ 372 │   │   dataloader_iter = super().__iter__()                                               │
│   373 │   │   # We iterate one batch ahead to check when we are at the end                       │
│   374 │   │   try:                                                                               │
│   375 │   │   │   current_batch = next(dataloader_iter)                                          │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\torch\utils\data\dataloader.py │
│ :444 in __iter__                                                                                 │
│                                                                                                  │
│    441 │   │   │   │   self._iterator._reset(self)                                               │
│    442 │   │   │   return self._iterator                                                         │
│    443 │   │   else:                                                                             │
│ ❱  444 │   │   │   return self._get_iterator()                                                   │
│    445 │                                                                                         │
│    446 │   @property                                                                             │
│    447 │   def _auto_collation(self):                                                            │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\torch\utils\data\dataloader.py │
│ :390 in _get_iterator                                                                            │
│                                                                                                  │
│    387 │   │   │   return _SingleProcessDataLoaderIter(self)                                     │
│    388 │   │   else:                                                                             │
│    389 │   │   │   self.check_worker_number_rationality()                                        │
│ ❱  390 │   │   │   return _MultiProcessingDataLoaderIter(self)                                   │
│    391 │                                                                                         │
│    392 │   @property                                                                             │
│    393 │   def multiprocessing_context(self):                                                    │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\torch\utils\data\dataloader.py │
│ :1077 in __init__                                                                                │
│                                                                                                  │
│   1074 │   │   │   #     it started, so that we do not call .join() if program dies              │
│   1075 │   │   │   #     before it starts, and __del__ tries to join but will get:               │
│   1076 │   │   │   #     AssertionError: can only join a started process.                        │
│ ❱ 1077 │   │   │   w.start()                                                                     │
│   1078 │   │   │   self._index_queues.append(index_queue)                                        │
│   1079 │   │   │   self._workers.append(w)                                                       │
│   1080                                                                                           │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\process.py:121 in start      │
│                                                                                                  │
│   118 │   │   assert not _current_process._config.get('daemon'), \                               │
│   119 │   │   │      'daemonic processes are not allowed to have children'                       │
│   120 │   │   _cleanup()                                                                         │
│ ❱ 121 │   │   self._popen = self._Popen(self)                                                    │
│   122 │   │   self._sentinel = self._popen.sentinel                                              │
│   123 │   │   # Avoid a refcycle if the target function holds an indirect                        │
│   124 │   │   # reference to the process object (see bpo-30775)                                  │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\context.py:224 in _Popen     │
│                                                                                                  │
│   221 │   _start_method = None                                                                   │
│   222 │   @staticmethod                                                                          │
│   223 │   def _Popen(process_obj):                                                               │
│ ❱ 224 │   │   return _default_context.get_context().Process._Popen(process_obj)                  │
│   225                                                                                            │
│   226 class DefaultContext(BaseContext):                                                         │
│   227 │   Process = Process                                                                      │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\context.py:327 in _Popen     │
│                                                                                                  │
│   324 │   │   @staticmethod                                                                      │
│   325 │   │   def _Popen(process_obj):                                                           │
│   326 │   │   │   from .popen_spawn_win32 import Popen                                           │
│ ❱ 327 │   │   │   return Popen(process_obj)                                                      │
│   328 │                                                                                          │
│   329 │   class SpawnContext(BaseContext):                                                       │
│   330 │   │   _name = 'spawn'                                                                    │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\popen_spawn_win32.py:93 in   │
│ __init__                                                                                         │
│                                                                                                  │
│    90 │   │   │   set_spawning_popen(self)                                                       │
│    91 │   │   │   try:                                                                           │
│    92 │   │   │   │   reduction.dump(prep_data, to_child)                                        │
│ ❱  93 │   │   │   │   reduction.dump(process_obj, to_child)                                      │
│    94 │   │   │   finally:                                                                       │
│    95 │   │   │   │   set_spawning_popen(None)                                                   │
│    96                                                                                            │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\multiprocessing\reduction.py:60 in dump      │
│                                                                                                  │
│    57                                                                                            │
│    58 def dump(obj, file, protocol=None):                                                        │
│    59 │   '''Replacement for pickle.dump() using ForkingPickler.'''                              │
│ ❱  60 │   ForkingPickler(file, protocol).dump(obj)                                               │
│    61                                                                                            │
│    62 #                                                                                          │
│    63 # Platform specific definitions                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: Can't pickle local object 'main.<locals>.<lambda>'
Steps:   0%|                                                                                  | 0/2000 [00:02<?, ?it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\runpy.py:194 in _run_module_as_main          │
│                                                                                                  │
│   191 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   192 │   if alter_argv:                                                                         │
│   193 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 194 │   return _run_code(code, main_globals, None,                                             │
│   195 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   196                                                                                            │
│   197 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\runpy.py:87 in _run_code                     │
│                                                                                                  │
│    84 │   │   │   │   │      __loader__ = loader,                                                │
│    85 │   │   │   │   │      __package__ = pkg_name,                                             │
│    86 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  87 │   exec(code, run_globals)                                                                │
│    88 │   return run_globals                                                                     │
│    89                                                                                            │
│    90 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ C:\Users\VERTIGO\anaconda3\envs\dreamboothdepth\Scripts\accelerate.exe\__main__.py:7 in <module> │
│                                                                                                  │
│ [Errno 2] No such file or directory:                                                             │
│ 'C:\\Users\\VERTIGO\\anaconda3\\envs\\dreamboothdepth\\Scripts\\accelerate.exe\\__main__.py'     │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\accelerate\commands\accelerate │
│ _cli.py:45 in main                                                                               │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\accelerate\commands\launch.py: │
│ 1104 in launch_command                                                                           │
│                                                                                                  │
│   1101 │   elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA  │
│   1102 │   │   sagemaker_launcher(defaults, args)                                                │
│   1103 │   else:                                                                                 │
│ ❱ 1104 │   │   simple_launcher(args)                                                             │
│   1105                                                                                           │
│   1106                                                                                           │
│   1107 def main():                                                                               │
│                                                                                                  │
│ c:\users\vertigo\anaconda3\envs\dreamboothdepth\lib\site-packages\accelerate\commands\launch.py: │
│ 567 in simple_launcher                                                                           │
│                                                                                                  │
│    564 │   process = subprocess.Popen(cmd, env=current_env)                                      │
│    565 │   process.wait()                                                                        │
│    566 │   if process.returncode != 0:                                                           │
│ ❱  567 │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)       │
│    568                                                                                           │
│    569                                                                                           │
│    570 def multi_gpu_launcher(args):                                                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯

System Info

  • diffusers version: 0.11.0.dev0
  • Platform: Windows-10-10.0.19041-SP0
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.12.1 (True)
  • Huggingface_hub version: 0.11.1
  • Transformers version: 4.26.0.dev0
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

vibber avatar Dec 17 '22 14:12 vibber

`accelerate test` runs fine, btw.

vibber avatar Dec 17 '22 14:12 vibber

@vibber see the "workaround" section in the related issue.

`collate_fn` is now declared at module level, but a lambda function is passed as the argument instead. Lambdas can't be pickled either :(

Dragollla avatar Dec 18 '22 17:12 Dragollla
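For anyone hitting this: on Windows, `DataLoader` workers are started with the `spawn` method, so every object handed to the `DataLoader` (including `collate_fn`) must be picklable. A minimal sketch of the difference, with a trivial stand-in collate function (the real one in `train_dreambooth.py` does prompt/pixel batching and prior-preservation handling):

```python
import pickle

# A lambda (or any function defined inside main()) cannot be pickled by
# reference, which is exactly what spawn-based DataLoader workers need.
bad_collate = lambda examples: examples  # noqa: E731

lambda_picklable = True
try:
    pickle.dumps(bad_collate)
except Exception:
    # Raises pickle.PicklingError: the lambda has no importable name.
    lambda_picklable = False


# Workaround: declare the collate function at module level so pickle can
# find it by its qualified name and re-import it in the worker process.
def collate_fn(examples):
    """Top-level functions pickle by reference, so spawn workers can use them."""
    return examples


pickle.dumps(collate_fn)  # succeeds
print("lambda picklable:", lambda_picklable)
print("module-level collate_fn picklable: True")
```

An alternative, if restructuring the script isn't an option, is passing `--num_workers=0` (or constructing the `DataLoader` with `num_workers=0`) so no worker processes are spawned at all.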

Sorry, just to better understand: are you running on multiple GPUs / multiple processes?

patrickvonplaten avatar Dec 19 '22 23:12 patrickvonplaten

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jan 16 '23 15:01 github-actions[bot]