
I get an error when I train LoRA

Open Crimsonfart opened this issue 2 years ago • 31 comments

Can someone help me? I get the following error when I train a LoRA. In a Discord channel I saw that others got the exact same error.

Load CSS... Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch(). Captioning files in C:/Users/...../Documents/LORA Training Data/Test/image/100_test... .\venv\Scripts\python.exe "finetune/make_captions.py" --batch_size="1" --num_beams="1" --top_p="0.9" --max_length="75" --min_length="5" --beam_search --caption_extension=".txt" "C:/Users/...../Documents/LORA Training Data/Test/image/100_test" --caption_weights="https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth" Current Working Directory is: C:\Users.....\Documents\kohya\kohya_ss load images from C:\Users.....\Documents\LORA Training Data\Test\image\100_test found 11 images. loading BLIP caption: https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth Downloading (…)solve/main/vocab.txt: 100%|███████████████████████████████████████████| 232k/232k [00:00<00:00, 408kB/s] Downloading (…)okenizer_config.json: 100%|██████████████████████████████████████████| 28.0/28.0 [00:00<00:00, 28.5kB/s] Downloading (…)lve/main/config.json: 100%|█████████████████████████████████████████████| 570/570 [00:00<00:00, 286kB/s] 100%|█████████████████████████████████████████████████████████████████████████████| 1.66G/1.66G [02:23<00:00, 12.5MB/s] load checkpoint from https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth BLIP loaded 100%|██████████████████████████████████████████████████████████████████████████████████| 11/11 [00:06<00:00, 1.74it/s] done! ...captioning done Folder 100_test: 1100 steps max_train_steps = 550 stop_text_encoder_training = 0 lr_warmup_steps = 55 accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="C:/Apps/AI/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13VAEIncluded.safetensors" --train_data_dir="C:/Users/basil/Documents/LORA Training Data/Test/image" --resolution=768,768 --output_dir="C:/Users/basil/Documents/LORA Training Data/Test/model" --logging_dir="C:/Users/basil/Documents/LORA Training Data/Test/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="55" --train_batch_size="2" --max_train_steps="550" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --seed="1234" --cache_latents --bucket_reso_steps=64 --xformers --use_8bit_adam --bucket_no_upscale prepare tokenizer Use DreamBooth method. prepare train images. found directory 100_test contains 11 image files 1100 train images with repeating. loading image sizes. 100%|█████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 923.71it/s] make buckets min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む) bucket 0: resolution (768, 768), count: 1100 mean ar error (without repeats): 0.0 prepare accelerator Using accelerator 0.15.0 or above. 
load StableDiffusion checkpoint loading u-net: <All keys matched successfully> loading vae: <All keys matched successfully> Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.20.self_attn.v_proj.bias', 'vision_model.encoder.layers.9.mlp.fc2.weight', 'vision_model.encoder.layers.23.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.mlp.fc2.weight', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.encoder.layers.2.self_attn.out_proj.weight', 'vision_model.encoder.layers.16.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.layer_norm2.weight', 'vision_model.encoder.layers.6.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.layer_norm1.bias', 'vision_model.encoder.layers.11.mlp.fc1.bias', 'vision_model.encoder.layers.5.self_attn.q_proj.bias', 'vision_model.encoder.layers.11.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.out_proj.bias', 'vision_model.encoder.layers.3.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.q_proj.bias', 'vision_model.encoder.layers.21.layer_norm1.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.weight', 'vision_model.encoder.layers.15.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.mlp.fc1.bias', 'vision_model.encoder.layers.2.mlp.fc2.bias', 'vision_model.encoder.layers.12.mlp.fc2.weight', 'vision_model.encoder.layers.13.self_attn.out_proj.bias', 'vision_model.encoder.layers.12.self_attn.out_proj.bias', 'vision_model.encoder.layers.7.mlp.fc1.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.k_proj.bias', 'vision_model.encoder.layers.13.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.self_attn.out_proj.bias', 'vision_model.encoder.layers.21.mlp.fc1.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.mlp.fc2.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.bias', 'vision_model.encoder.layers.20.self_attn.q_proj.weight', 'vision_model.encoder.layers.3.layer_norm2.bias', 'vision_model.encoder.layers.19.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.weight', 'vision_model.encoder.layers.3.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.self_attn.q_proj.weight', 'vision_model.encoder.layers.12.layer_norm1.weight', 'vision_model.encoder.layers.1.mlp.fc1.bias', 'vision_model.encoder.layers.23.layer_norm2.weight', 'vision_model.encoder.layers.18.layer_norm2.bias', 'vision_model.encoder.layers.16.layer_norm2.weight', 'vision_model.encoder.layers.12.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.self_attn.out_proj.weight', 'visual_projection.weight', 'vision_model.encoder.layers.8.mlp.fc2.bias', 'vision_model.encoder.layers.4.layer_norm1.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.weight', 'vision_model.encoder.layers.19.mlp.fc2.weight', 
'vision_model.encoder.layers.23.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.weight', 'vision_model.encoder.layers.15.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.self_attn.q_proj.bias', 'vision_model.encoder.layers.23.self_attn.out_proj.bias', 'vision_model.encoder.layers.3.self_attn.out_proj.weight', 'vision_model.encoder.layers.15.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.mlp.fc1.bias', 'vision_model.encoder.layers.9.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.mlp.fc1.bias', 'vision_model.encoder.layers.19.mlp.fc1.weight', 'vision_model.encoder.layers.2.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.layer_norm2.bias', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.12.layer_norm2.bias', 'vision_model.encoder.layers.9.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.layer_norm2.bias', 'vision_model.encoder.layers.2.layer_norm1.bias', 'vision_model.encoder.layers.5.mlp.fc1.weight', 'vision_model.encoder.layers.16.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.mlp.fc1.weight', 'vision_model.encoder.layers.17.layer_norm1.weight', 'vision_model.encoder.layers.12.mlp.fc1.bias', 'vision_model.encoder.layers.10.mlp.fc2.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.self_attn.v_proj.weight', 'vision_model.encoder.layers.14.self_attn.q_proj.weight', 'vision_model.encoder.layers.12.mlp.fc1.weight', 'vision_model.encoder.layers.7.mlp.fc2.weight', 'vision_model.encoder.layers.13.mlp.fc2.bias', 'vision_model.encoder.layers.5.mlp.fc2.weight', 'vision_model.encoder.layers.18.self_attn.v_proj.weight', 'vision_model.encoder.layers.13.self_attn.v_proj.weight', 'vision_model.encoder.layers.20.layer_norm2.weight', 'vision_model.encoder.layers.1.mlp.fc2.weight', 'vision_model.encoder.layers.10.mlp.fc1.weight', 'vision_model.encoder.layers.3.self_attn.out_proj.bias', 'vision_model.encoder.layers.8.self_attn.q_proj.weight', 'vision_model.encoder.layers.4.self_attn.v_proj.weight', 'vision_model.encoder.layers.5.mlp.fc2.bias', 'vision_model.encoder.layers.0.mlp.fc1.weight', 'vision_model.encoder.layers.1.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.v_proj.bias', 'vision_model.encoder.layers.21.mlp.fc2.bias', 'vision_model.encoder.layers.4.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.self_attn.q_proj.weight', 'vision_model.encoder.layers.13.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.layer_norm1.weight', 'vision_model.encoder.layers.22.self_attn.k_proj.bias', 'vision_model.encoder.layers.22.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.layer_norm1.bias', 'vision_model.encoder.layers.13.mlp.fc2.weight', 'vision_model.encoder.layers.4.mlp.fc2.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.bias', 'vision_model.encoder.layers.11.self_attn.v_proj.bias', 'vision_model.encoder.layers.11.self_attn.k_proj.bias', 'vision_model.encoder.layers.16.layer_norm1.weight', 'vision_model.encoder.layers.21.mlp.fc1.bias', 'vision_model.encoder.layers.15.mlp.fc2.weight', 'vision_model.encoder.layers.18.layer_norm2.weight', 
'vision_model.encoder.layers.18.self_attn.v_proj.bias', 'vision_model.encoder.layers.18.mlp.fc1.bias', 'vision_model.encoder.layers.9.self_attn.k_proj.bias', 'vision_model.encoder.layers.8.self_attn.v_proj.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.mlp.fc1.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.weight', 'vision_model.encoder.layers.4.mlp.fc2.bias', 'vision_model.encoder.layers.18.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.layer_norm1.weight', 'vision_model.encoder.layers.21.mlp.fc2.weight', 'vision_model.encoder.layers.20.mlp.fc2.bias', 'vision_model.encoder.layers.12.mlp.fc2.bias', 'vision_model.encoder.layers.21.self_attn.out_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.weight', 'vision_model.encoder.layers.13.layer_norm2.bias', 'vision_model.encoder.layers.18.mlp.fc2.bias', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.15.self_attn.q_proj.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.mlp.fc1.weight', 'vision_model.encoder.layers.0.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.layer_norm2.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.weight', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.encoder.layers.8.layer_norm1.bias', 'vision_model.encoder.layers.11.layer_norm1.weight', 'vision_model.encoder.layers.11.mlp.fc1.weight', 'vision_model.encoder.layers.6.layer_norm1.bias', 'vision_model.encoder.layers.19.layer_norm2.weight', 'vision_model.encoder.layers.2.self_attn.k_proj.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.bias', 'vision_model.encoder.layers.16.mlp.fc1.weight', 'vision_model.encoder.layers.5.layer_norm2.bias', 'vision_model.encoder.layers.23.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.mlp.fc1.weight', 'vision_model.encoder.layers.19.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.mlp.fc2.bias', 'vision_model.encoder.layers.18.mlp.fc2.weight', 'vision_model.encoder.layers.10.layer_norm2.weight', 'vision_model.pre_layrnorm.weight', 'vision_model.encoder.layers.6.layer_norm1.weight', 'vision_model.encoder.layers.1.self_attn.k_proj.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.weight', 'vision_model.encoder.layers.10.mlp.fc1.bias', 'vision_model.encoder.layers.15.mlp.fc1.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.layer_norm2.bias', 'vision_model.encoder.layers.5.self_attn.out_proj.bias', 'vision_model.encoder.layers.20.mlp.fc1.bias', 'vision_model.encoder.layers.11.self_attn.out_proj.bias', 'vision_model.pre_layrnorm.bias', 'vision_model.encoder.layers.19.self_attn.out_proj.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.bias', 'vision_model.encoder.layers.5.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.weight', 'vision_model.encoder.layers.22.layer_norm1.bias', 'vision_model.encoder.layers.8.mlp.fc1.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.bias', 'vision_model.encoder.layers.12.self_attn.k_proj.weight', 'vision_model.encoder.layers.10.self_attn.q_proj.weight', 'vision_model.post_layernorm.bias', 'vision_model.encoder.layers.14.layer_norm1.bias', 
'vision_model.encoder.layers.3.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.self_attn.out_proj.weight', 'vision_model.encoder.layers.3.layer_norm2.weight', 'vision_model.encoder.layers.17.mlp.fc2.bias', 'vision_model.encoder.layers.2.mlp.fc2.weight', 'vision_model.encoder.layers.11.self_attn.k_proj.weight', 'vision_model.encoder.layers.3.self_attn.v_proj.bias', 'vision_model.encoder.layers.0.mlp.fc2.weight', 'vision_model.encoder.layers.13.mlp.fc1.weight', 'vision_model.encoder.layers.21.layer_norm2.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.self_attn.k_proj.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.v_proj.weight', 'vision_model.encoder.layers.12.layer_norm1.bias', 'vision_model.encoder.layers.9.layer_norm2.bias', 'vision_model.encoder.layers.7.layer_norm1.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.layer_norm1.weight', 'vision_model.encoder.layers.9.layer_norm1.bias', 'vision_model.encoder.layers.1.self_attn.q_proj.bias', 'vision_model.encoder.layers.2.layer_norm2.bias', 'vision_model.encoder.layers.22.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.mlp.fc1.weight', 'vision_model.post_layernorm.weight', 'vision_model.encoder.layers.9.mlp.fc1.weight', 'vision_model.encoder.layers.17.self_attn.k_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.weight', 'logit_scale', 'vision_model.encoder.layers.9.self_attn.k_proj.weight', 'vision_model.encoder.layers.18.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.weight', 'vision_model.encoder.layers.23.self_attn.v_proj.bias', 'vision_model.encoder.layers.2.self_attn.q_proj.weight', 'vision_model.encoder.layers.2.self_attn.v_proj.bias', 'vision_model.encoder.layers.11.mlp.fc2.bias', 'vision_model.encoder.layers.9.self_attn.q_proj.weight', 'vision_model.encoder.layers.16.self_attn.q_proj.bias', 'vision_model.encoder.layers.22.layer_norm2.weight', 'vision_model.encoder.layers.6.mlp.fc1.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.layer_norm1.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.weight', 'vision_model.encoder.layers.7.mlp.fc2.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.layer_norm1.weight', 'vision_model.encoder.layers.7.layer_norm2.bias', 'vision_model.encoder.layers.12.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.layer_norm1.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.layer_norm1.bias', 'vision_model.encoder.layers.3.layer_norm1.weight', 'vision_model.encoder.layers.5.layer_norm2.weight', 'vision_model.encoder.layers.14.layer_norm2.weight', 'vision_model.encoder.layers.4.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.v_proj.weight', 'vision_model.encoder.layers.17.self_attn.v_proj.bias', 'vision_model.encoder.layers.0.mlp.fc2.bias', 'vision_model.encoder.layers.3.self_attn.k_proj.bias', 
'vision_model.encoder.layers.17.self_attn.q_proj.weight', 'vision_model.encoder.layers.15.layer_norm1.bias', 'vision_model.encoder.layers.9.mlp.fc1.bias', 'vision_model.encoder.layers.3.mlp.fc2.bias', 'vision_model.encoder.layers.3.mlp.fc2.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.22.layer_norm2.bias', 'vision_model.encoder.layers.4.self_attn.q_proj.bias', 'vision_model.encoder.layers.4.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.bias', 'vision_model.encoder.layers.23.mlp.fc2.bias', 'vision_model.embeddings.position_ids', 'vision_model.encoder.layers.19.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.bias', 'vision_model.encoder.layers.15.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.q_proj.weight', 'vision_model.encoder.layers.22.mlp.fc1.weight', 'vision_model.encoder.layers.13.layer_norm2.weight', 'vision_model.encoder.layers.2.mlp.fc1.weight', 'vision_model.encoder.layers.15.self_attn.q_proj.bias', 'vision_model.encoder.layers.5.mlp.fc1.bias', 'vision_model.encoder.layers.13.layer_norm1.weight', 'vision_model.encoder.layers.14.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.self_attn.k_proj.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.bias', 'vision_model.encoder.layers.14.mlp.fc1.bias', 'vision_model.encoder.layers.17.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.layer_norm2.weight', 'vision_model.encoder.layers.7.self_attn.out_proj.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.bias', 'vision_model.encoder.layers.23.layer_norm2.bias', 'vision_model.encoder.layers.22.self_attn.v_proj.weight', 'vision_model.encoder.layers.2.self_attn.out_proj.bias', 'vision_model.embeddings.class_embedding', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.18.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.layer_norm1.weight', 'vision_model.encoder.layers.6.mlp.fc2.bias', 'vision_model.encoder.layers.21.layer_norm1.bias', 'vision_model.encoder.layers.1.self_attn.out_proj.weight', 'vision_model.encoder.layers.8.layer_norm2.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.bias', 'vision_model.encoder.layers.18.layer_norm1.weight', 'vision_model.encoder.layers.21.self_attn.out_proj.bias', 'vision_model.encoder.layers.23.layer_norm1.bias', 'vision_model.encoder.layers.11.mlp.fc2.weight', 'vision_model.encoder.layers.12.layer_norm2.weight', 'vision_model.encoder.layers.9.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.layer_norm2.bias', 'vision_model.encoder.layers.2.layer_norm2.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.layer_norm1.weight', 'vision_model.encoder.layers.20.mlp.fc2.weight', 'vision_model.encoder.layers.17.mlp.fc1.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.weight', 'vision_model.encoder.layers.22.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.self_attn.k_proj.bias', 'vision_model.encoder.layers.16.layer_norm2.bias', 'vision_model.encoder.layers.12.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.layer_norm1.weight', 'vision_model.encoder.layers.22.self_attn.q_proj.weight', 
'vision_model.encoder.layers.7.mlp.fc1.weight', 'vision_model.encoder.layers.19.mlp.fc1.bias', 'vision_model.encoder.layers.22.mlp.fc1.bias', 'vision_model.encoder.layers.6.layer_norm2.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.mlp.fc1.weight', 'vision_model.encoder.layers.10.layer_norm2.bias', 'vision_model.encoder.layers.21.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.layer_norm1.weight', 'vision_model.encoder.layers.18.mlp.fc1.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.mlp.fc2.bias', 'vision_model.encoder.layers.22.mlp.fc2.weight', 'vision_model.encoder.layers.23.mlp.fc2.weight', 'vision_model.encoder.layers.5.self_attn.q_proj.weight', 'vision_model.encoder.layers.16.self_attn.q_proj.weight', 'vision_model.encoder.layers.1.layer_norm2.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.self_attn.k_proj.bias', 'vision_model.encoder.layers.3.mlp.fc1.bias', 'vision_model.encoder.layers.15.layer_norm2.weight', 'vision_model.encoder.layers.17.layer_norm2.weight', 'vision_model.encoder.layers.6.mlp.fc2.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.layer_norm1.bias', 'vision_model.encoder.layers.8.layer_norm1.weight', 'vision_model.encoder.layers.9.layer_norm2.weight', 'vision_model.encoder.layers.22.layer_norm1.weight', 'vision_model.encoder.layers.6.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.layer_norm2.weight', 'vision_model.encoder.layers.3.self_attn.q_proj.bias', 'vision_model.encoder.layers.4.layer_norm1.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.bias', 'vision_model.encoder.layers.8.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.mlp.fc1.bias', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.weight', 'vision_model.encoder.layers.18.layer_norm1.bias', 'vision_model.encoder.layers.18.self_attn.k_proj.bias', 'vision_model.encoder.layers.15.mlp.fc1.bias', 'vision_model.encoder.layers.10.layer_norm1.bias', 'vision_model.encoder.layers.22.mlp.fc2.bias', 'vision_model.encoder.layers.19.layer_norm1.bias', 'vision_model.encoder.layers.8.layer_norm2.bias', 'text_projection.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.layer_norm1.weight', 'vision_model.encoder.layers.14.mlp.fc2.weight', 'vision_model.encoder.layers.17.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.weight', 'vision_model.encoder.layers.10.mlp.fc2.bias', 'vision_model.encoder.layers.23.self_attn.v_proj.weight', 'vision_model.encoder.layers.6.mlp.fc1.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.layer_norm1.weight', 'vision_model.encoder.layers.20.layer_norm1.bias', 'vision_model.encoder.layers.7.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.layer_norm1.bias', 'vision_model.encoder.layers.6.layer_norm2.bias', 'vision_model.encoder.layers.0.layer_norm2.bias']

  • This IS expected if you are initializing CLIPTextModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing CLIPTextModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). loading text encoder: <All keys matched successfully> Replace CrossAttention.forward to use xformers caching latents. 100%|██████████████████████████████████████████████████████████████████████████████████| 11/11 [00:05<00:00, 1.84it/s] import network module: networks.lora create LoRA for Text Encoder: 72 modules. create LoRA for U-Net: 192 modules. enable LoRA for text encoder enable LoRA for U-Net prepare optimizer, data loader etc.

===================================BUG REPORT=================================== Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link

CUDA SETUP: Loading binary C:\Users.....\Documents\kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll... use 8-bit Adam optimizer running training / 学習開始 num train images * repeats / 学習画像の数×繰り返し回数: 1100 num reg images / 正則化画像の数: 0 num batches per epoch / 1epochのバッチ数: 550 num epochs / epoch数: 1 batch size per device / バッチサイズ: 2 total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 2 gradient accumulation steps / 勾配を合計するステップ数 = 1 total optimization steps / 学習ステップ数: 550 Traceback (most recent call last): File "C:\Users.....\Documents\kohya\kohya_ss\train_network.py", line 573, in train(args) File "C:\Users.....\Documents\kohya\kohya_ss\train_network.py", line 356, in train "ss_noise_offset": args.noise_offset, AttributeError: 'Namespace' object has no attribute 'noise_offset' Traceback (most recent call last): File "C:\Users.....\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\basil\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code exec(code, run_globals) File "C:\Users.....\Documents\kohya\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in File "C:\Users.....\Documents\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users.....l\Documents\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\basil\Documents\kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\basil\Documents\kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=C:/Apps/AI/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13VAEIncluded.safetensors', '--train_data_dir=C:/Users/basil/Documents/LORA Training Data/Test/image', '--resolution=768,768', '--output_dir=C:/Users/basil/Documents/LORA Training Data/Test/model', '--logging_dir=C:/Users/basil/Documents/LORA Training Data/Test/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=55', '--train_batch_size=2', '--max_train_steps=550', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.

Crimsonfart avatar Feb 18 '23 06:02 Crimsonfart
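
For context, the traceback above fails on "ss_noise_offset": args.noise_offset because the older library/train_util.py in this checkout apparently never registers a --noise_offset argument, so the parsed Namespace has no such attribute. Below is a minimal sketch of the mismatch with a hypothetical getattr guard; it is not the upstream fix (which is updating train_util.py), just an illustration.

```python
import argparse

# The older train_util.py builds the argument parser without --noise_offset,
# so parse_args() returns a Namespace that lacks that attribute.
parser = argparse.ArgumentParser()
parser.add_argument("--learning_rate", type=float, default=1e-4)
args = parser.parse_args([])

# train_network.py then does roughly this while writing training metadata:
#   metadata["ss_noise_offset"] = args.noise_offset   # AttributeError

# Hypothetical local workaround; the real fix is updating train_util.py so
# the argument is registered again.
metadata = {"ss_noise_offset": getattr(args, "noise_offset", None)}
print(metadata)  # {'ss_noise_offset': None}
```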

Yeah, if you update (git pull) today, you can no longer train models. You will receive error logs like this: "line 45, in main args.func(args)", "line 1104, in launch_command", "line 567, in simple_launcher".

Rika-Mipa avatar Feb 18 '23 07:02 Rika-Mipa

Just revert to the previous commit.

Nyaster avatar Feb 18 '23 07:02 Nyaster

Alternatively, just replace library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

martianunlimited avatar Feb 18 '23 09:02 martianunlimited
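
If you go that route, here is a small sketch of the replacement step. It assumes you run it from the kohya_ss folder and that the raw-file URL mirrors the page linked above; back up the original first.

```python
# Sketch: download the updated train_util.py from sd-scripts and replace the
# local copy, keeping a backup of the old file.
import shutil
import urllib.request
from pathlib import Path

url = "https://raw.githubusercontent.com/kohya-ss/sd-scripts/main/library/train_util.py"
target = Path("library/train_util.py")

if target.exists():
    shutil.copy(target, target.with_name(target.name + ".bak"))  # backup
urllib.request.urlretrieve(url, target)
print(f"replaced {target}")
```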

Alternatively, just replace library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

The software works after replacing train_util.py. Thank you for your help.

Rika-Mipa avatar Feb 18 '23 10:02 Rika-Mipa

I was able to resolve this issue by merging the dev branch into the master branch. This commit 641a168e55f429c79f9114bcdb123a13bc9b2167 resolved it for me and was probably forgotten.

Thund3rPat avatar Feb 18 '23 17:02 Thund3rPat

Alternatively, just replace library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

fixed for me too, ty

reduxo1 avatar Feb 18 '23 18:02 reduxo1

AttributeError: 'Namespace' object has no attribute 'noise_offset'

Same problem here, and replacing train_util.py is not a solution.

MalpoDeMalpis avatar Feb 19 '23 11:02 MalpoDeMalpis

It didn't fix the problem at all :( Does someone have another idea, please?

I replaced library/train_util.py with kohya's new version https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py

CUDA SETUP: Loading binary C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll... use 8-bit Adam optimizer running training / 学習開始 num train images * repeats / 学習画像の数×繰り返し回数: 1700 num reg images / 正則化画像の数: 0 num batches per epoch / 1epochのバッチ数: 850 num epochs / epoch数: 1 batch size per device / バッチサイズ: 2 total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 2 gradient accumulation steps / 勾配を合計するステップ数 = 1 total optimization steps / 学習ステップ数: 850 steps: 0%| | 0/850 [00:00<?, ?it/s]epoch 1/1 C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn("None of the inputs have requires_grad=True. Gradients will be None") Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu Traceback (most recent call last): File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Python310\lib\runpy.py", line 86, in run_code exec(code, run_globals) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=//UTILISATEUR-PC/Users/Utilisateur/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13.safetensors', '--train_data_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\image', '--resolution=512,512', '--output_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\model', '--logging_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=85', '--train_batch_size=2', '--max_train_steps=850', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.

Maranpani avatar Feb 19 '23 14:02 Maranpani

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

Replace them with what? There is only one instance of train_util, and that's in the library folder.

tpcdaz avatar Feb 20 '23 00:02 tpcdaz

Could you do a tutorial video or ask someone to do one? ^^

Could you explain exactly what you mean by "replace all 3 occurrences of train_util.py in the kohya_ss folder"?

Maranpani avatar Feb 20 '23 03:02 Maranpani

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

Replace them with what? There is only one instance of train_util, and that's in the library folder.

\kohya_ss\library\train_util.py
\kohya_ss\venv\Lib\site-packages\library\train_util.py
\kohya_ss\build\lib\library\train_util.py

starpause avatar Feb 20 '23 04:02 starpause
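
A small sketch of that step, assuming the updated file is already in library/ and the other two copies sit at the paths listed above (relative to the kohya_ss folder):

```python
# Sketch: copy the updated library/train_util.py over the other two copies
# mentioned above, skipping any that are not present.
import shutil
from pathlib import Path

src = Path("library/train_util.py")
for dst in (
    Path("venv/Lib/site-packages/library/train_util.py"),
    Path("build/lib/library/train_util.py"),
):
    if dst.exists():
        shutil.copy(src, dst)
        print(f"updated {dst}")
    else:
        print(f"skipped {dst} (not found)")
```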

tried all of the above, nothing worked. Then I changed optimizer from Adam to Lion, and now it's working. I have no idea what that changes in terms of quality, etc. But hey at least I'm unstuck

noobNerd1 avatar Feb 20 '23 10:02 noobNerd1

I was able to resolve this issue by merging the dev branch into the master branch. This commit 641a168e55f429c79f9114bcdb123a13bc9b2167 resolved it for me and was probably forgotten.

And could you explain to a newbie what that means and how to do it, step by step, please?

Maranpani avatar Feb 20 '23 11:02 Maranpani

@Maranpani With the last update it is now merged. No need to do it yourself anymore.

Thund3rPat avatar Feb 20 '23 11:02 Thund3rPat

@Maranpani With the last update it is now merged. No need to do it yourself anymore.

Hello, could you be more specific?

Which update? Update of what? How do I do the update? Thanks.

Maranpani avatar Feb 20 '23 11:02 Maranpani

Hi, of course. To update to the newest release, open the terminal and go to the kohya_ss folder. Execute the upgrade script with:

.\upgrade.ps1

I hope this helps.

Thund3rPat avatar Feb 20 '23 12:02 Thund3rPat

I think you need to replace all 3 occurrences of train_util.py in the kohya_ss folder. Then it worked for me, even after the update.

Replace them with what? There is only one instance of train_util, and that's in the library folder.

\kohya_ss\library\train_util.py
\kohya_ss\venv\Lib\site-packages\library\train_util.py
\kohya_ss\build\lib\library\train_util.py

**It's not working for me. Always the same CUDA error:**

CUDA SETUP: Loading binary C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll... use 8-bit Adam optimizer running training / 学習開始 num train images * repeats / 学習画像の数×繰り返し回数: 1700 num reg images / 正則化画像の数: 0 num batches per epoch / 1epochのバッチ数: 850 num epochs / epoch数: 1 batch size per device / バッチサイズ: 2 total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 2 gradient accumulation steps / 勾配を合計するステップ数 = 1 total optimization steps / 学習ステップ数: 850 steps: 0%| | 0/850 [00:00<?, ?it/s]epoch 1/1 C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn("None of the inputs have requires_grad=True. Gradients will be None") Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu Traceback (most recent call last): File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Python310\lib\runpy.py", line 86, in run_code exec(code, run_globals) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\Utilisateur\Documents\Kohya\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=//UTILISATEUR-PC/Users/Utilisateur/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV13_v13.safetensors', '--train_data_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\image', '--resolution=512,512', '--output_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\model', '--logging_dir=C:\Users\Utilisateur\Documents\Lora TRaining DAta\test\log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=85', '--train_batch_size=2', '--max_train_steps=850', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--cache_latents', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.

Maranpani avatar Feb 20 '23 12:02 Maranpani

@Maranpani Can you deactivate 8bit adam and try again?

Thund3rPat avatar Feb 20 '23 12:02 Thund3rPat

When I click to execute, a window appears, everything runs automatically, and then it closes. I don't get the chance to type anything. Did I misunderstand something?

For information, I followed the tutorial by Olivier Sarikas on YouTube, which said to put it in the "upgrade" folder. So for now, this is what is written in my file:

git pull
.\venv\Scripts\activate
pip install --upgrade -r requirements.txt

What should I replace, all of it? What do you mean by execute? Just write something and save?

Maranpani avatar Feb 20 '23 12:02 Maranpani

@Maranpani Can you deactivate 8bit adam and try again? thanks

It WORKS 👍 when we remove Use 8bit adam

So if it's working without 8bit adam, a few questions:

  1. Are you able to fix the issue at its origin, so everyone can keep using 8bit adam?
  2. What exactly is "8bit adam"? What happens if we don't use it, versus if we use it?
  3. For information, here are my current advanced parameter settings (without 8bit adam, as you told me): Gradient checkpointing ON, Shuffle caption OFF, Persistent data loader OFF, Memory efficient attention ON, Use 8bit adam OFF, Use xformers ON, Color augmentation OFF, Flip augmentation OFF, Don't upscale bucket resolution OFF.

Maranpani avatar Feb 20 '23 15:02 Maranpani
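
Regarding question 2: roughly speaking, the "Use 8bit adam" toggle selects bitsandbytes' 8-bit AdamW, which keeps optimizer state in 8 bits to save VRAM but needs CUDA kernels built for your GPU (the "no kernel image is available" error earlier in the thread is bitsandbytes failing that requirement). Turning it off falls back to a regular optimizer. A minimal sketch of the difference, assuming the GUI maps the toggle this way:

```python
# Sketch of what the "Use 8bit adam" toggle amounts to (assumption: it switches
# between bitsandbytes' AdamW8bit and a standard optimizer such as torch AdamW).
import torch

params = [torch.nn.Parameter(torch.zeros(4, 4))]
use_8bit_adam = False  # toggle off, as suggested in this thread

if use_8bit_adam:
    import bitsandbytes as bnb
    # 8-bit optimizer states: less VRAM, but requires a bitsandbytes CUDA
    # kernel that supports your GPU ("no kernel image" means it does not).
    optimizer = bnb.optim.AdamW8bit(params, lr=1e-4)
else:
    # Full-precision optimizer states: more VRAM, no bitsandbytes dependency.
    optimizer = torch.optim.AdamW(params, lr=1e-4)
print(type(optimizer).__name__)
```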

train_network.py: error: unrecognized arguments: 768 Traceback (most recent call last): File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 86, in run_code exec(code, run_globals) File "G:\LoRA\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in File "G:\LoRA\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "G:\LoRA\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "G:\LoRA\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['G:\LoRA\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=G:/novelai-webui-aki/models/Stable-diffusion/Anything3.ckpt', '--train_data_dir=G:/LoRA/99a', '--resolution=640,', '768', '--output_dir=G:/LoRA/99a', '--logging_dir=', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=1.5e-5', '--unet_lr=1.5e-4', '--network_dim=128', '--output_name=99A', '--lr_scheduler_num_cycles=5', '--learning_rate=0.0001', '--lr_scheduler=constant_with_warmup', '--lr_warmup_steps=13', '--train_batch_size=3', '--max_train_steps=269', '--save_every_n_epochs=5', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=31337', '--cache_latents', '--use_lion_optimizer', '--clip_skip=2', '--bucket_reso_steps=64', '--shuffle_caption', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 2.

Error still reported after replacing train_util.py.

FlareWaterLily avatar Feb 21 '23 10:02 FlareWaterLily

@FlareWaterLily The error message says that you have set the resolution parameter wrong. Can you check again and set it like this:

640,768

Thund3rPat avatar Feb 21 '23 10:02 Thund3rPat
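
For reference, a short sketch of why the space matters: the shell splits "640, 768" into two tokens, so argparse sees a stray positional argument and aborts with "unrecognized arguments: 768".

```python
# Sketch: reproducing the resolution parsing error outside the trainer.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--resolution", type=str)

print(parser.parse_args(["--resolution=640,768"]))   # OK: Namespace(resolution='640,768')
# parser.parse_args(["--resolution=640,", "768"])    # error: unrecognized arguments: 768
```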

I have the same error. I unticked "Use 8bit adam" and chose an optimizer other than AdamW8bit, and then it started to work.

kithungsam avatar Mar 01 '23 08:03 kithungsam

Nothing worked, I tried everything. I still get the same error :(

dokkkku avatar Mar 02 '23 19:03 dokkkku

Nothing worked, I tried everything. I still get the same error :(

Deactivate 8bit adam.

Maranpani avatar Mar 02 '23 19:03 Maranpani

Still didn't work.

dokkkku avatar Mar 03 '23 09:03 dokkkku

i tried everything:( Traceback (most recent call last): File "F:\lora-scripts-0.2.0\sd-scripts\train_network.py", line 642, in train(args) File "F:\lora-scripts-0.2.0\sd-scripts\train_network.py", line 114, in train text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype) File "F:\lora-scripts-0.2.0\sd-scripts\library\train_util.py", line 2016, in load_target_model text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(args.v2, name_or_path) File "F:\lora-scripts-0.2.0\sd-scripts\library\model_util.py", line 877, in load_models_from_stable_diffusion_checkpoint converted_unet_checkpoint = convert_ldm_unet_checkpoint(v2, state_dict, unet_config) File "F:\lora-scripts-0.2.0\sd-scripts\library\model_util.py", line 234, in convert_ldm_unet_checkpoint new_checkpoint["time_embedding.linear_1.weight"] = unet_state_dict["time_embed.0.weight"] KeyError: 'time_embed.0.weight' Traceback (most recent call last): File "C:\Users\keina\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\keina\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code exec(code, run_globals) File "F:\lora-scripts-0.2.0\venv\Scripts\accelerate.exe_main.py", line 7, in File "F:\lora-scripts-0.2.0\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "F:\lora-scripts-0.2.0\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "F:\lora-scripts-0.2.0\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['F:\lora-scripts-0.2.0\venv\Scripts\python.exe', './sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=./sd-models/before.safetensors', '--train_data_dir=./train', '--output_dir=./output', '--logging_dir=./logs', '--resolution=512,768', '--network_module=networks.lora', '--max_train_epochs=20', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=64', '--network_alpha=32', '--output_name=after', '--train_batch_size=3', '--save_every_n_epochs=2', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1337', '--cache_latents', '--clip_skip=2', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1024', '--xformers', '--shuffle_caption', '--reg_data_dir=./train/reg', '--use_8bit_adam', '--use_lion_optimizer']' returned non-zero exit status 1. Train finished

NakiriKajiya avatar Mar 09 '23 18:03 NakiriKajiya

Hi, I know how to fix it: just lower the training batch size and it's fine. My 3060 Ti can go up to 4 at most.

youcanyoubing avatar Mar 10 '23 20:03 youcanyoubing

Nothing worked, I tried everything. I still get the same error :(

Deactivate 8bit adam.

What do you mean by deactivate 8bit adam? Like selecting Adam instead of 8bit Adam, or deleting it from the kohya_ss folder? If it's the latter, then please provide the folder directory where I have to delete it from.

Currently, after replacing all 3 instances of train_util.py, I'm getting this error:

Traceback (most recent call last): File "C:\Users\name\Kohya\kohya_ss\train_network.py", line 16, in <module> import library.train_util as train_util File "C:\Users\name\Kohya\kohya_ss\library\train_util.py", line 59, in <module> from library.lpw_stable_diffusion import StableDiffusionLongPromptWeightingPipeline ModuleNotFoundError: No module named 'library.lpw_stable_diffusion' Traceback (most recent call last): File "C:\Users\name\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\name\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\name\Kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module> File "C:\Users\name\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main args.func(args) File "C:\Users\name\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command simple_launcher(args) File "C:\Users\name\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\\Users\\name\\Kohya\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=C:/Users/name/stable-diffusion-webui/models/Stable-diffusion/chilloutmix_NiPrunedFp32Fix.safetensors', '--train_data_dir=C:/Users/name/Kohya/LoRA/img', '--resolution=512,512', '--output_dir=C:/Users/name/Kohya/LoRA/model', '--logging_dir=C:/Users/name/Kohya/LoRA/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=Ayane Sakura', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=4700', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers']' returned non-zero exit status 1.

AniMoster avatar Mar 23 '23 03:03 AniMoster
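
The ModuleNotFoundError above suggests the replacement train_util.py also needs library/lpw_stable_diffusion.py, which older checkouts do not ship. A sketch extending the download approach from earlier in the thread, assuming that module sits next to train_util.py in the sd-scripts repo:

```python
# Sketch: fetch train_util.py together with the companion module it imports.
# Assumes it runs from the kohya_ss folder and that the raw URLs mirror the
# sd-scripts library/ folder linked earlier in this thread.
import urllib.request
from pathlib import Path

base = "https://raw.githubusercontent.com/kohya-ss/sd-scripts/main/library/"
for name in ("train_util.py", "lpw_stable_diffusion.py"):
    urllib.request.urlretrieve(base + name, Path("library") / name)
    print(f"fetched {name}")
```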