returned non-zero exit status 1.
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['F:\Engineering\Python\Python310\python.exe', 'train_network.py', '--pretrained_model_name_or_path=F:/Engineering/AI_Painting/stable-diffusion-webui/models/Stable-diffusion/pastelmix-better-vae-fp16.ckpt', '--train_data_dir=F:/Engineering/AI_Painting/LORA_training/train_data', '--resolution=512,512', '--output_dir=F:/Engineering/AI_Painting/LORA_training/output_models', '--logging_dir=', '--network_alpha=128', '--save_model_as=ckpt', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=Addams', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=750', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
I need the traceback section. This part of the error only reports that the training script exited with status 1, which is not enough information to troubleshoot.
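To illustrate why the CalledProcessError alone is not enough, here is a minimal sketch. It is not the kohya_ss launcher code: `run_and_capture` is a hypothetical helper, and the failing child command is a stand-in for train_network.py. The exception carries only the command and the exit status, while the actual reason is in the child's own traceback on stderr.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return (exit_status, stderr_text) without raising."""
    try:
        done = subprocess.run(cmd, check=True, capture_output=True, text=True)
        return done.returncode, done.stderr
    except subprocess.CalledProcessError as err:
        # The exception message only says "returned non-zero exit status N".
        # The child's traceback is in err.stderr (set because of capture_output).
        return err.returncode, err.stderr


if __name__ == "__main__":
    # Hypothetical failing child, standing in for the real training script.
    child = [sys.executable, "-c", "raise ValueError('stand-in failure')"]
    status, stderr_text = run_and_capture(child)
    print("exit status:", status)
    print("child stderr:")
    print(stderr_text)
```

The text printed under "child stderr" is the part worth posting, since that is where the real exception type and message appear.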
Traceback (most recent call last):
File "F:\Engineering\AI_Painting\kohya_ss\train_network.py", line 699, in
The error message indicates that the ValueError occurred because fp16 mixed precision requires a GPU. What type of NVIDIA card do you have in your system?
My GPU is a 3080 Ti. Could it be that CUDA or cuDNN is not installed correctly?
2023-03-25 13:45:22.622970: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2023-03-25 13:45:22.623110: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-03-25 13:45:25.906128: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2023-03-25 13:45:25.906238: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 1.13.1+cu117 with CUDA 1107 (you have 1.13.1+cpu)
Python 3.10.9 (you have 3.10.9)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
prepare tokenizer
Use DreamBooth method.
prepare images.
found directory F:\Engineering\AI_Painting\LORA_training\train_data\150_flowers contains 30 image files
4500 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 2
resolution: (512, 512)
enable_bucket: False
[Subset 0 of Dataset 0]
image_dir: "F:\Engineering\AI_Painting\LORA_training\train_data\150_flowers"
image_count: 30
num_repeats: 150
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
is_reg: False
class_tokens: flowers
caption_extension: .txt
Do these messages help to solve the problem?
Same problem, need help.
The first two warnings show that TensorFlow tried to load the CUDA runtime library (cudart64_110.dll) and could not find it, so TensorFlow cannot use the GPU. If you don't have a GPU and are doing your computation on the CPU, you can safely ignore these warnings.
The third message says that xFormers cannot load its C++/CUDA extensions: it was built for PyTorch 1.13.1+cu117, but the PyTorch you have installed is 1.13.1+cpu, which is a CPU-only build. As a result, xFormers features such as memory-efficient attention, SwiGLU, and sparse operations are unavailable. A CPU-only PyTorch would also explain the fp16 error, since that build cannot use your GPU.
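If you want to confirm which PyTorch build is installed, a rough sketch like the following may help. It assumes you run it with the same Python environment that kohya_ss uses; `diagnose_torch_build` is a hypothetical helper written for this post, while `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()` are standard PyTorch attributes.

```python
def diagnose_torch_build(version, cuda_available):
    """Classify a PyTorch install from its version string and CUDA check."""
    # The part after "+" is the build tag, e.g. "cpu" or "cu117".
    build_tag = version.partition("+")[2]
    if build_tag == "cpu":
        return "cpu-only build"
    if not cuda_available:
        return "no usable CUDA device"
    return "cuda available"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
    else:
        print("torch version:", torch.__version__)
        # torch.version.cuda is None for CPU-only builds.
        print("built with CUDA:", torch.version.cuda)
        print("diagnosis:", diagnose_torch_build(
            torch.__version__, torch.cuda.is_available()))
```

A result of "cpu-only build" would match the "(you have 1.13.1+cpu)" note in the xFormers warning.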
Based on this, I think you should delete the whole kohya_ss folder and redo the installation from scratch.