raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

Open Amei11111 opened this issue 2 years ago • 8 comments

I am new to DreamBooth, so I am following this video, but after I press the Train button I get the error message "raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)".

Load CSS...
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Folder 3_SATONO: 30 steps
max_train_steps = 30
stop_text_encoder_training = 0
lr_warmup_steps = 3
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --v2 --enable_bucket --pretrained_model_name_or_path="C:/AIIIIIIIIIIIIIIIIIIIIIIII/models/Stable-diffusion/chilloutmix_NiPrunedFp32Fix.safetensors" --train_data_dir="C:/Users/q0989/Desktop/1/model" --resolution=512,512 --output_dir="C:/Users/q0989/Desktop/1/destination" --logging_dir="" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="3" --train_batch_size="1" --max_train_steps="30" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --seed="1234" --cache_latents --bucket_reso_steps=64 --xformers --use_8bit_adam --bucket_no_upscale
prepare tokenizer
Use DreamBooth method.
prepare train images.
found directory 3_SATONO contains 10 image files
30 train images with repeating.
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 1999.95it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (512, 512), count: 30
mean ar error (without repeats): 0.0
prepare accelerator
Using accelerator 0.15.0 or above.
load StableDiffusion checkpoint
Traceback (most recent call last):
  File "C:\Users\q0989\Desktop\1\kohya_ss\train_network.py", line 573, in <module>
    train(args)
  File "C:\Users\q0989\Desktop\1\kohya_ss\train_network.py", line 158, in train
    text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype)
  File "C:\Users\q0989\Desktop\1\kohya_ss\library\train_util.py", line 1584, in load_target_model
    text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(args.v2, args.pretrained_model_name_or_path)
  File "C:\Users\q0989\Desktop\1\kohya_ss\library\model_util.py", line 880, in load_models_from_stable_diffusion_checkpoint
    info = unet.load_state_dict(converted_unet_checkpoint)
  File "C:\Users\q0989\Desktop\1\kohya_ss\venv\lib\site-packages\torch\nn\modules\module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UNet2DConditionModel:
    size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([640, 1024]).
    size mismatch for up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([320, 1024]).
    size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
    size mismatch for mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([1280, 1024]).
Traceback (most recent call last):
  File "C:\Users\q0989\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\q0989\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\q0989\Desktop\1\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Users\q0989\Desktop\1\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "C:\Users\q0989\Desktop\1\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "C:\Users\q0989\Desktop\1\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\q0989\Desktop\1\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--v2', '--enable_bucket', '--pretrained_model_name_or_path=C:/AIIIIIIIIIIIIIIIIIIIIIIII/models/Stable-diffusion/chilloutmix_NiPrunedFp32Fix.safetensors', '--train_data_dir=C:/Users/q0989/Desktop/1/model', '--resolution=512,512', '--output_dir=C:/Users/q0989/Desktop/1/destination', '--logging_dir=', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=3', '--train_batch_size=1', '--max_train_steps=30', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.
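
A note on reading this log: the final subprocess.CalledProcessError only reports that the child process (train_network.py) exited with a non-zero status. The failure itself is the RuntimeError printed before it. The following is a minimal Python sketch of that relationship, not accelerate's actual launcher code. The helper name run_child is made up for illustration, and subprocess.run(check=True) stands in for the launcher's own return-code check.

```python
import subprocess
import sys


def run_child(code):
    """Run a snippet in a child Python process and report how it ended.

    Hypothetical helper for illustration: it returns (returncode, stderr).
    """
    cmd = [sys.executable, "-c", code]
    try:
        # check=True raises CalledProcessError when the child exits non-zero.
        subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        # The exception carries only the exit status and the command;
        # the child's own traceback is in its stderr output.
        return err.returncode, err.stderr
    return 0, ""


if __name__ == "__main__":
    status, child_stderr = run_child("raise RuntimeError('size mismatch')")
    print("exit status:", status)
    print("child traceback:")
    print(child_stderr)
```

In other words, when this error appears, the lines above it in the console are usually the place to look.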

Feb 19 '23 05:02 Amei11111

Same here

Feb 19 '23 08:02 vivexx

same

Feb 19 '23 18:02 AmeliaJaneMurphy

Fixed for me by rolling back to the previous release, v20.7.2 (commit a008c62).

For newbies: go into the "kohya_ss" folder in Explorer, right-click on an empty area and choose "Git Bash Here", then paste git checkout a008c62

(To get back to the master branch later, once fixes and updates arrive, use "git checkout master".)

Feb 19 '23 19:02 vivexx

For newbies: go into the "kohya_ss" folder in Explorer, right-click on an empty area and choose "Git Bash Here", then paste git checkout a008c62

After this I had to run

git pull

.\venv\Scripts\activate

pip install --use-pep517 --upgrade -r requirements.txt

and restart the GUI (including a hard refresh of the GUI in the browser).

Posting this in case another newbie like me still can't get it to run after checking out a008c62.
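
The steps above can be sketched as a small Python script, for anyone who prefers to run them in one go. This is only a sketch under assumptions: the function names rollback_commands and run_all are made up, the venv is assumed to live in venv\Scripts inside the repository as in the commands above, and the venv's python.exe is called directly instead of activating the venv first. git pull is left optional because right after checking out a commit hash the repository is on a detached HEAD, where a plain git pull generally has no branch to update.

```python
import subprocess
from pathlib import Path


def rollback_commands(repo_dir, commit="a008c62", include_pull=False):
    """Build the command list for the rollback described in this thread."""
    repo = Path(repo_dir).resolve()
    # Assumed venv location (Windows layout, as in the commands above).
    venv_python = repo / "venv" / "Scripts" / "python.exe"
    commands = [["git", "checkout", commit]]
    if include_pull:
        commands.append(["git", "pull"])
    commands.append([
        str(venv_python), "-m", "pip", "install",
        "--use-pep517", "--upgrade", "-r", "requirements.txt",
    ])
    return commands


def run_all(commands, cwd, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run: only prints what would be executed.
    run_all(rollback_commands("kohya_ss"), cwd="kohya_ss", dry_run=True)
```

With dry_run=True nothing is changed on disk; the GUI restart and browser hard refresh still have to be done by hand.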

Feb 20 '23 05:02 starpause

None of this worked for me, still getting the same error

Feb 22 '23 03:02 AmeliaJaneMurphy

Maybe you can try this. After replacing the file, it should work immediately.

First, delete the original train_util.py file located in kohya_ss/library. Then download the current train_util.py from this repository: https://github.com/kohya-ss/sd-scripts/tree/main/library and put it in place of the deleted file.
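
A slightly more cautious variant of this file swap, sketched in Python: instead of deleting the original outright, it keeps a .bak copy so the change can be undone. The function name replace_with_backup and the example paths are made up for illustration, and the new file is assumed to have been downloaded by hand already.

```python
import shutil
from pathlib import Path


def replace_with_backup(new_file, target):
    """Copy new_file over target, keeping the old target as <name>.bak.

    Returns the backup path, or None if there was nothing to back up.
    """
    new_file, target = Path(new_file), Path(target)
    backup = None
    if target.exists():
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)  # keep the original for rollback
    shutil.copy2(new_file, target)    # put the downloaded file in place
    return backup


if __name__ == "__main__":
    # Hypothetical paths; adjust to where the file was downloaded.
    downloaded = Path("Downloads/train_util.py")
    if downloaded.exists():
        replace_with_backup(downloaded, "kohya_ss/library/train_util.py")
```

To undo the change, copy train_util.py.bak back over train_util.py.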

Feb 22 '23 03:02 Amei11111

Problem solved after several steps:
Replace all 3 copies of train_util.py with the new version from https://github.com/kohya-ss/sd-scripts, as mentioned in issue https://github.com/bmaltais/kohya_ss/issues/192.
Untick the box 'Don't upscale bucket resolution', which is ticked by default.
Choose v_parameterization instead of v2.
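
Regarding the v2 setting: the original traceback in this thread shows checkpoint weights of shape [*, 768] being loaded into a model that expects [*, 1024]. A width of 768 is what Stable Diffusion 1.x checkpoints use for cross-attention and 1024 is what 2.x checkpoints use, so this kind of mismatch usually suggests that --v2 is set for a 1.x-based model. Below is a rough Python sketch for checking a .safetensors file before training. It reads only the JSON header of the file, following the safetensors layout (an 8-byte little-endian header length, then the JSON header). The function names are made up, and matching on tensor names ending in "attn2.to_k.weight" is an assumption about how the checkpoint names its keys.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def cross_attention_width(path):
    """Last dimension of the first cross-attention key projection, or None."""
    header = read_safetensors_header(path)
    for name, info in header.items():
        # Assumed key suffix; "__metadata__" never matches it.
        if name.endswith("attn2.to_k.weight"):
            return info["shape"][-1]
    return None


def guess_family(path):
    """Map the width to 'v1' (768), 'v2' (1024) or 'unknown'."""
    return {768: "v1", 1024: "v2"}.get(cross_attention_width(path), "unknown")


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(guess_family(sys.argv[1]))
```

If this prints v1 for the base model, leaving the v2 box unticked would be the first thing to try.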

Feb 25 '23 17:02 Pb-207

For newbies: go into the "kohya_ss" folder in Explorer, right-click on an empty area and choose "Git Bash Here", then paste git checkout a008c62

After this I had to run

git pull

.\venv\Scripts\activate

pip install --use-pep517 --upgrade -r requirements.txt

and restart the GUI (including hard refresh of GUI in browser)

In case another newbie like me still couldn't get it to run after checking out a008c62

After doing this I got a new error:

TypeError: 'NoneType' object is not subscriptable
Traceback (most recent call last):
  File "C:\Users\tejat\Kohya\kohya_ss\venv\lib\site-packages\gradio\routes.py", line 384, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\tejat\Kohya\kohya_ss\venv\lib\site-packages\gradio\blocks.py", line 1027, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "C:\Users\tejat\Kohya\kohya_ss\venv\lib\site-packages\gradio\blocks.py", line 939, in postprocess_data
    if predictions[i] is components._Keywords.FINISHED_ITERATING:

What might be the cause of this?

Mar 22 '23 11:03 AniMoster