bitsandbytes
LoRA training fails even though `python -m bitsandbytes` reports success
### System Info
Distributor ID: Pop
Description: Pop!_OS 22.04 LTS
Release: 22.04
Codename: jammy
Running in a Proxmox VM with RTX 4070 Ti passthrough.
### Reproduction
Trying to train a LoRA; the run always ends here:
```
Warning: LD_LIBRARY_PATH environment variable is not set.
Certain functionalities may not work correctly.
Please ensure that the required libraries are properly configured.
If you use WSL2 you may want to: export LD_LIBRARY_PATH=/usr/lib/wsl/lib/
09:41:58-147913 INFO Version: v22.5.0
09:41:58-150370 INFO nVidia toolkit detected
09:41:59-283380 INFO Torch 2.0.1+cu118
09:41:59-290575 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
09:41:59-300734 INFO Torch detected GPU: NVIDIA GeForce RTX 4070 Ti VRAM
12010 Arch (8, 9) Cores 60
09:41:59-301351 INFO Verifying modules installation status from
/home/ai/kohya/requirements_linux.txt...
09:41:59-302849 INFO Verifying modules installation status from
requirements.txt...
09:42:00-810398 INFO headless: False
09:42:00-812071 INFO Load CSS...
Running on local URL: http://10.10.1.5:7861
To create a public link, set share=True in launch().
09:42:25-109421 INFO Loading config...
09:42:36-427946 INFO Start training LoRA Standard ...
09:42:36-428482 INFO Checking for duplicate image filenames in training data
directory...
09:42:36-440140 INFO Valid image folder names found in:
/home/ai/kohya/TRAINING/images
09:42:36-441540 INFO Valid image folder names found in:
/home/ai/kohya/TRAINING/regularization
09:42:36-443066 INFO Folder 100_catalyst: 617 images found
09:42:36-443723 INFO Folder 100_catalyst: 61700 steps
09:42:36-444252 WARNING Regularisation images are used... Will double the
number of steps required...
09:42:36-444714 INFO Total steps: 61700
09:42:36-445040 INFO Train batch size: 2
09:42:36-445354 INFO Gradient accumulation steps: 1
09:42:36-445680 INFO Epoch: 1
09:42:36-445989 INFO Regulatization factor: 2
09:42:36-446302 INFO max_train_steps (61700 / 2 / 1 * 1 * 2) = 61700
09:42:36-446686 INFO stop_text_encoder_training = 0
09:42:36-447021 INFO lr_warmup_steps = 0
09:42:36-447388 INFO Saving training config to
/home/ai/kohya/TRAINING/model/catalyst_0.1_20240122-094236.json...
09:42:36-448452 INFO accelerate launch --num_cpu_threads_per_process=2
"./train_network.py"
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5"
--train_data_dir="/home/ai/kohya/TRAINING/images"
--reg_data_dir="/home/ai/kohya/TRAINING/regularization"
--resolution="768,768"
--output_dir="/home/ai/kohya/TRAINING/model"
--logging_dir="/home/ai/kohya/TRAINING/log"
--network_alpha="1" --save_model_as=safetensors
--network_module=networks.lora --network_dim=8
--output_name="catalyst_0.1"
--lr_scheduler_num_cycles="1" --learning_rate="0.0001"
--lr_scheduler="constant" --train_batch_size="2"
--max_train_steps="61700" --save_every_n_epochs="1"
--mixed_precision="bf16" --save_precision="bf16"
--seed="1234" --caption_extension=".txt"
--cache_latents --optimizer_type="AdamW8bit"
--max_grad_norm="1" --max_data_loader_n_workers="1"
--clip_skip=2 --bucket_reso_steps=64 --xformers
--bucket_no_upscale --noise_offset=0.0
The following values were not passed to accelerate launch and had defaults used instead:
--num_processes was set to a value of 1
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
2024-01-22 09:42:38.744675: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-01-22 09:42:38.908373: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-22 09:42:38.908402: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-22 09:42:38.909467: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-22 09:42:38.983436: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-22 09:42:39.660678: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
prepare tokenizer
Using DreamBooth method.
prepare images.
found directory /home/ai/kohya/TRAINING/images/100_catalyst contains 617 image files
found directory /home/ai/kohya/TRAINING/regularization/1_illustration style contains 1000 image files
No caption file found for 1000 images. Training will continue without captions for these images. If class token exists, it will be used. / 1000枚の画像にキャプションファイルが見つかりませんでした。これらの画像についてはキャプションなしで学習を続行します。class tokenが存在する場合はそれを使います。
/home/ai/kohya/TRAINING/regularization/1_illustration style/00000-172513325-illustration style.png
/home/ai/kohya/TRAINING/regularization/1_illustration style/00001-172513326-illustration style.png
/home/ai/kohya/TRAINING/regularization/1_illustration style/00002-172513327-illustration style.png
/home/ai/kohya/TRAINING/regularization/1_illustration style/00003-172513328-illustration style.png
/home/ai/kohya/TRAINING/regularization/1_illustration style/00004-172513329-illustration style.png
/home/ai/kohya/TRAINING/regularization/1_illustration style/00005-172513330-illustration style.png... and 995 more
61700 train images with repeating.
1000 reg images.
[Dataset 0]
batch_size: 2
resolution: (768, 768)
enable_bucket: False
[Subset 0 of Dataset 0]
image_dir: "/home/ai/kohya/TRAINING/images/100_catalyst"
image_count: 617
num_repeats: 100
shuffle_caption: False
keep_tokens: 0
keep_tokens_separator:
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: catalyst
caption_extension: .txt
[Subset 1 of Dataset 0]
image_dir: "/home/ai/kohya/TRAINING/regularization/1_illustration style"
image_count: 1000
num_repeats: 1
shuffle_caption: False
keep_tokens: 0
keep_tokens_separator:
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: True
class_tokens: illustration style
caption_extension: .txt
[Dataset 0]
loading image sizes.
100%|█████████████████████████████████████| 1617/1617 [00:00<00:00, 3574.95it/s]
prepare dataset
preparing accelerator
loading model for process 0/1
load Diffusers pretrained models: runwayml/stable-diffusion-v1-5
Loading pipeline components...: 100%|█████████████| 5/5 [00:00<00:00, 11.09it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
UNet2DConditionModel: 64, 8, 768, False, False
U-Net converted to original U-Net
Enable xformers for U-Net
import network module: networks.lora
[Dataset 0]
caching latents.
checking cache validity...
100%|██████████████████████████████████| 1617/1617 [00:00<00:00, 3259101.19it/s]
caching latents...
100%|███████████████████████████████████████| 1617/1617 [02:43<00:00, 9.91it/s]
create LoRA network. base dim (rank): 8, alpha: 1.0
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder:
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.
False
===================================BUG REPORT===================================
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
Traceback (most recent call last):
File "/home/ai/kohya/venv/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/ai/kohya/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/home/ai/kohya/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
simple_launcher(args)
File "/home/ai/kohya/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ai/kohya/venv/bin/python', './train_network.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=/home/ai/kohya/TRAINING/images', '--reg_data_dir=/home/ai/kohya/TRAINING/regularization', '--resolution=768,768', '--output_dir=/home/ai/kohya/TRAINING/model', '--logging_dir=/home/ai/kohya/TRAINING/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--network_dim=8', '--output_name=catalyst_0.1', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=61700', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_grad_norm=1', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1.
```
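The traceback in the log comes from the accelerate launcher, which only reports that `train_network.py` exited with status 1; the child's own error is not visible in it. One way to narrow this down might be to rerun the child command directly and keep the tail of its stderr. The sketch below is only an illustration: `run_and_tail` is a hypothetical helper name, and the demo command is a stand-in that fails on purpose, not the real training command.

```python
import subprocess
import sys


def run_and_tail(cmd, tail_lines=20):
    """Run cmd and return (exit status, last `tail_lines` lines of its stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stderr.splitlines()[-tail_lines:]


if __name__ == "__main__":
    # Stand-in command that fails on purpose. For a real run, replace it with
    # the argument list printed in the CalledProcessError message.
    demo_cmd = [sys.executable, "-c", "import sys; sys.exit('simulated failure')"]
    status, tail = run_and_tail(demo_cmd)
    print(f"exit status: {status}")
    print("\n".join(tail))
```

For the real case, the command would be the list starting with `/home/ai/kohya/venv/bin/python` and `./train_network.py`, run from the kohya directory. Note that output is buffered until the process exits, so this suits a run that fails early, as this one does.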
Output of the `python -m bitsandbytes` command:
```
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ /usr/local CUDA PATHS +++++++++++++++++++
/usr/local/cuda-12.3/targets/x86_64-linux/lib/stubs/libcuda.so
/usr/local/cuda-12.3/targets/x86_64-linux/lib/libcudart.so
+++++++++++++++ WORKING DIRECTORY CUDA PATHS +++++++++++++++
/home/ai/kohya/venv/lib/python3.10/site-packages/onnxruntime/capi/libonnxruntime_providers_cuda.so
/home/ai/kohya/venv/lib/python3.10/site-packages/torch/lib/libtorch_cuda_linalg.so
/home/ai/kohya/venv/lib/python3.10/site-packages/torch/lib/libc10_cuda.so
/home/ai/kohya/venv/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda114_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda111.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda111_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda120_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda122.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda122_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda115_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda110_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda120.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda114.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118_nocublaslt.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda121.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda110.so
/home/ai/kohya/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda115.so
++++++++++++++++++ LD_LIBRARY CUDA PATHS +++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = True
COMPUTE_CAPABILITIES_PER_GPU = ['8.9']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Running a quick check that:
+ library is importable
+ CUDA function is callable
WARNING: Please be sure to sanitize sensible info from any such env vars!
SUCCESS!
Installation was successful!
```
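One thing that stands out in the output is the version mismatch: PyTorch reports CUDA 11.8, while the toolkit under `/usr/local` is 12.3. The sketch below checks which bundled bitsandbytes binary each version would map to. It rests on assumptions: the naming rule `libbitsandbytes_cuda<major><minor>[_nocublaslt].so` is inferred from the filenames in the listing, the compute capability 7.5 threshold for the `_nocublaslt` variant is assumed rather than taken from this report, and `expected_binary` is a hypothetical helper, not a bitsandbytes API.

```python
# CUDA version tags of the binaries listed under site-packages/bitsandbytes
# in the report; each tag appears with and without the _nocublaslt suffix.
CUDA_TAGS = ["110", "111", "114", "115", "117", "118", "120", "121", "122"]
AVAILABLE = {
    f"libbitsandbytes_cuda{tag}{suffix}.so"
    for tag in CUDA_TAGS
    for suffix in ("", "_nocublaslt")
}


def expected_binary(cuda_version, compute_capability):
    """Guess the binary name for a CUDA version and GPU compute capability."""
    major, minor = cuda_version.split(".")[:2]
    cc = tuple(int(part) for part in compute_capability.split("."))
    # Assumption: GPUs below compute capability 7.5 use the _nocublaslt build.
    suffix = "" if cc >= (7, 5) else "_nocublaslt"
    return f"libbitsandbytes_cuda{major}{minor}{suffix}.so"


for label, version in (("PyTorch build", "11.8"), ("system toolkit", "12.3")):
    name = expected_binary(version, "8.9")
    status = "present" if name in AVAILABLE else "missing"
    print(f"{label} (CUDA {version}): {name} is {status}")
```

Under these assumptions a matching binary exists for 11.8 but not for 12.3. Whether bitsandbytes picks the version from PyTorch or from the system toolkit depends on the installed release, which this report does not state.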
### Expected behavior
LoRA training should start and run to completion.
I'm running into the same error.