diffusers

Model train is not effective

Open HarshkumarDegamadiya opened this issue 2 years ago • 12 comments

Describe the bug

When I trained the model on my images, the sample grid showed very poor results, and when I generated images, the model only reproduced my training image. No matter what prompt I use, it outputs the same portrait image.

Reproduction

No response

Logs

No response

System Info

Colab

HarshkumarDegamadiya avatar Jan 05 '23 19:01 HarshkumarDegamadiya

Well, something major changed on the diffusers side, and it broke DreamBooth training entirely, from what I've seen today. It was working fine 2 days ago.

2blackbar avatar Jan 05 '23 22:01 2blackbar

Even with https://github.com/ShivamShrirao/diffusers/pull/178 ?

ShivamShrirao avatar Jan 05 '23 22:01 ShivamShrirao

Yes, even with #178.

HarshkumarDegamadiya avatar Jan 05 '23 23:01 HarshkumarDegamadiya

Yes, mine just crashed when it tried to generate images at 500 steps. We have to wait for a fix. Even my saved Colab notebooks didn't help, because the code is downloaded from here anyway. That's why I don't like Python: one dependency gets "updated" and everything else falls down like a house of cards. IMO, when setting up dependencies, people should always use specific version numbers for everything so it won't break that easily.

Steps: 20% 500/2500 [07:16<28:25, 1.17it/s, loss=0.281, lr=1.2e-6]
Traceback (most recent call last):
  File "train_dreambooth.py", line 822, in <module>
    main(args)
  File "train_dreambooth.py", line 805, in main
    save_weights(global_step)
  File "train_dreambooth.py", line 682, in save_weights
    text_enc_model = accelerator.unwrap_model(text_encoder, keep_fp32_wrapper=True)
TypeError: unwrap_model() got an unexpected keyword argument 'keep_fp32_wrapper'
Steps: 20% 500/2500 [07:16<29:07, 1.14it/s, loss=0.281, lr=1.2e-6]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

2blackbar avatar Jan 05 '23 23:01 2blackbar

I think this is the error:

/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:105: UserWarning: /usr/lib64-nvidia did not contain libcudart.so as expected! Searching further paths...
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('--listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-2vzb2yx032dzy --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('6000,"kernelManagerProxyHost"'), PosixPath('{"kernelManagerProxyPort"'), PosixPath('"172.28.0.12","jupyterArgs"'), PosixPath('["--ip=172.28.0.12","--transport=ipc"],"debugAdapterMultiplexerPath"'), PosixPath('true}'), PosixPath('"/usr/local/bin/dap_multiplexer","enableLsp"')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//ipykernel.pylab.backend_inline')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
  warn(
/usr/local/lib/python3.8/dist-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
  warn(
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda112.so...
/usr/local/lib/python3.8/dist-packages/diffusers/utils/deprecation_utils.py:35: FutureWarning: It is deprecated to pass a pretrained model name or path to from_config.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_ddpm.DDPMScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  warnings.warn(warning + message, FutureWarning)
Downloading: 100% 308/308 [00:00<00:00, 251kB/s]
Caching latents: 100% 50/50 [00:11<00:00, 4.38it/s]
Steps: 100% 800/800 [11:49<00:00, 1.15it/s, loss=0.278, lr=1e-6]
Traceback (most recent call last):
  File "train_dreambooth.py", line 822, in <module>
    main(args)
  File "train_dreambooth.py", line 815, in main
    save_weights(global_step)
  File "train_dreambooth.py", line 682, in save_weights
    text_enc_model = accelerator.unwrap_model(text_encoder, keep_fp32_wrapper=True)
TypeError: unwrap_model() got an unexpected keyword argument 'keep_fp32_wrapper'
Steps: 100% 800/800 [11:49<00:00, 1.13it/s, loss=0.278, lr=1e-6]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=/content/stable_diffusion_weights/harsh23', '--revision=fp16', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=1337', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--mixed_precision=fp16', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=1e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=50', '--sample_batch_size=4', '--max_train_steps=800', '--save_interval=10000', '--save_sample_prompt=photo of harsh23 person', '--concepts_list=concepts_list.json']' returned non-zero exit status 1.

HarshkumarDegamadiya avatar Jan 05 '23 23:01 HarshkumarDegamadiya

Yep I am having the same issue. Was working fine yesterday but today the results are terrible. Getting the same CUDA_SETUP error, then it still trains but it isn't working properly.

tchesket avatar Jan 06 '23 03:01 tchesket

The error message says "TypeError: unwrap_model() got an unexpected keyword argument 'keep_fp32_wrapper'", so I removed the keep_fp32_wrapper argument from the unwrap_model call and it worked. See #181.
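For anyone who wants the script to run on both older and newer library versions instead of deleting the argument outright, here is a rough sketch of one way to do it: inspect the callee's signature and pass only the keyword arguments it accepts. The helper name `call_with_supported_kwargs` and the two stand-in functions are made up for illustration; they are not part of accelerate or of train_dreambooth.py.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments that its signature does not accept."""
    params = inspect.signature(func).parameters
    # If the callee declares **kwargs, it accepts any keyword, so keep them all.
    accepts_any = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_any:
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **kwargs)


# Hypothetical stand-ins for an older and a newer unwrap_model (illustration only).
def unwrap_model_old(model):
    return ("unwrapped", model, None)


def unwrap_model_new(model, keep_fp32_wrapper=False):
    return ("unwrapped", model, keep_fp32_wrapper)


if __name__ == "__main__":
    # The unsupported keyword is silently dropped for the old signature.
    print(call_with_supported_kwargs(unwrap_model_old, "text_encoder", keep_fp32_wrapper=True))
    # The keyword is forwarded when the signature supports it.
    print(call_with_supported_kwargs(unwrap_model_new, "text_encoder", keep_fp32_wrapper=True))
```

In the training script, the failing line would presumably become `call_with_supported_kwargs(accelerator.unwrap_model, text_encoder, keep_fp32_wrapper=True)`. Note that dropping the argument changes behavior on older versions, so this only avoids the TypeError; it does not guarantee identical saved weights.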

KitaharaMugiro avatar Jan 06 '23 08:01 KitaharaMugiro

I closed #181 because the issue is solved by updating the 'accelerate' library.

KitaharaMugiro avatar Jan 07 '23 04:01 KitaharaMugiro

I imagine this is related to why a derivative dreambooth notebook I'm using is failing. "train_dreambooth.py" is just not saving the final output directory for me anymore (I'm expecting a "0" folder and a "800" folder because I'm using 800 steps, but there's only a "0" directory after it's done running).
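A quick way to confirm this symptom after a run is to check which step folders actually exist under the output directory. This is only a sketch: the function name `missing_checkpoints` and the example path are hypothetical, and it assumes the script writes one subfolder per saved step, named after the step number, as described above.

```python
from pathlib import Path


def missing_checkpoints(output_dir, expected_steps):
    """Return the expected step numbers that have no matching subfolder."""
    root = Path(output_dir)
    return [step for step in expected_steps if not (root / str(step)).is_dir()]


if __name__ == "__main__":
    # Hypothetical output path; replace with the --output_dir used for training.
    missing = missing_checkpoints("/content/stable_diffusion_weights/my_model", [0, 800])
    if missing:
        print("No folder was saved for steps:", missing)
    else:
        print("All expected step folders are present.")
```

If the final step is reported missing, the save at the end of training most likely failed, which matches the traceback ending in save_weights earlier in this thread.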

maxdaneau avatar Jan 08 '23 17:01 maxdaneau

I am having the same issue, I'm able to train in my colab notebook but the prompted outputs do not look any different from the un-prompted generation of the fine-tuned model. This is after I removed the "keep_fp32_wrapper" parameter.

I am also seeing the bitsandbytes error, so I'm guessing it's related to that as well.

enoreyes avatar Jan 15 '23 02:01 enoreyes

@enoreyes it's related to bitsandbytes. Uninstall your mistakes and install the correct version.

ShivamShrirao avatar Jan 15 '23 03:01 ShivamShrirao

By "correct version", you mean 0.35.4, correct? I was able to get it working using that version.
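To check that a notebook is really running the pinned versions before training starts, something like the following could go in an early cell. It is a minimal sketch using the standard library's `importlib.metadata`; the function name `check_pins` is made up, and the only pin taken from this thread is bitsandbytes 0.35.4. Any other entries you add are your own assumptions.

```python
from importlib import metadata


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for each unsatisfied pin."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # The bitsandbytes pin comes from this thread; add other pins as needed.
    PINS = {"bitsandbytes": "0.35.4"}
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package matches. Installing with an exact version specifier (for example `pip install bitsandbytes==0.35.4`) is what keeps the environment from drifting in the first place; the check only reports drift after the fact.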

enoreyes avatar Jan 15 '23 04:01 enoreyes