
NameError: name 'str2optimizer8bit_blockwise' is not defined

Open andirsun opened this issue 1 year ago • 7 comments

Hi, I am trying to train my model, but I get this error.

I am using an Ubuntu server with an NVIDIA GPU.

NameError: name 'str2optimizer8bit_blockwise' is not defined

Logs

Folder 100_brooklyn: 2000 steps
max_train_steps = 2000
stop_text_encoder_training = 0
lr_warmup_steps = 200
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="/home/ubuntu/datasets/lora/brooklyn/image" --resolution=512,512 --output_dir="/home/ubuntu/datasets/lora/brooklyn/model" --logging_dir="/home/ubuntu/datasets/lora/brooklyn/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="200" --train_batch_size="1" --max_train_steps="2000" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --bucket_reso_steps=64 --xformers --bucket_no_upscale
2023-03-08 15:58:31.662450: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-08 15:58:31.801941: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-03-08 15:58:31.838703: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-03-08 15:58:32.395583: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-03-08 15:58:32.395648: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-03-08 15:58:32.395659: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-03-08 15:58:34.018512: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-08 15:58:34.161332: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-03-08 15:58:34.197477: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-03-08 15:58:34.748948: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-03-08 15:58:34.749008: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-03-08 15:58:34.749020: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
prepare tokenizer
Use DreamBooth method.
prepare train images.
found directory 100_brooklyn contains 20 image files
2000 train images with repeating.
loading image sizes.
100%|████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 4137.62it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (512, 512), count: 2000
mean ar error (without repeats): 0.0
prepare accelerator
Using accelerator 0.15.0 or above.
load Diffusers pretrained models
Fetching 15 files: 100%|████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 26247.21it/s]
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Replace CrossAttention.forward to use xformers
caching latents.
100%|██████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.88it/s]
import network module: networks.lora
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
prepare optimizer, data loader etc.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link
================================================================================
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64')}
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:105: UserWarning: /home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64: did not contain libcudart.so as expected! Searching further paths...
  warn(
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
  warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary /home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:48: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn(
use 8-bit AdamW optimizer | {}
running training / 学習開始
  num train images * repeats / 学習画像の数×繰り返し回数: 2000
  num reg images / 正則化画像の数: 0
  num batches per epoch / 1epochのバッチ数: 2000
  num epochs / epoch数: 1
  batch size per device / バッチサイズ: 1
  total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 1
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 2000
steps:   0%|                                                                              | 0/2000 [00:00<?, ?it/s]epoch 1/1
Traceback (most recent call last):
  File "/home/ubuntu/kohya_ss/train_network.py", line 507, in <module>
    train(args)
  File "/home/ubuntu/kohya_ss/train_network.py", line 394, in train
    optimizer.step()
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 338, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 285, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/optim/optimizer.py", line 113, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 265, in step
    self.update_step(group, p, gindex, pindex)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 506, in update_step
    F.optimizer_update_8bit_blockwise(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/functional.py", line 858, in optimizer_update_8bit_blockwise
    str2optimizer8bit_blockwise[optimizer_name][0](
NameError: name 'str2optimizer8bit_blockwise' is not defined
steps:   0%|                                                                              | 0/2000 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ubuntu/kohya_ss/venv/bin/python3', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=/home/ubuntu/datasets/lora/brooklyn/image', '--resolution=512,512', '--output_dir=/home/ubuntu/datasets/lora/brooklyn/model', '--logging_dir=/home/ubuntu/datasets/lora/brooklyn/log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=200', '--train_batch_size=1', '--max_train_steps=2000', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.

andirsun avatar Mar 08 '23 16:03 andirsun

Maybe your card does not support 8-bit Adam? What happens if you try AdamW instead of AdamW8bit?
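
A side note on the log in the original post, offered as a hedged reading rather than a diagnosis: the line "CUDA SETUP: Loading binary .../libbitsandbytes_cpu.so..." together with the warning that bitsandbytes "was compiled without GPU support" suggests that the CPU-only library was loaded, which may matter more than the card model. A minimal Python sketch that scans a saved training log for that marker is shown here. The function name `loaded_cpu_only_bitsandbytes` is hypothetical and is not part of kohya_ss or bitsandbytes.

```python
def loaded_cpu_only_bitsandbytes(log_text):
    """Return True if the log shows bitsandbytes loading its CPU-only binary.

    Looks for the "CUDA SETUP: Loading binary <path>..." line that appears in
    the logs quoted in this thread. Returns False when no such line is found.
    """
    marker = "CUDA SETUP: Loading binary "
    for line in log_text.splitlines():
        if marker in line:
            # Drop surrounding whitespace and the trailing "..." of the log line.
            path = line.split(marker, 1)[1].strip().rstrip(".")
            return path.endswith("libbitsandbytes_cpu.so")
    return False
```

If this returns True for a training log, the 8-bit optimizer kernels are most likely unavailable regardless of which GPU is installed.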

bmaltais avatar Mar 08 '23 16:03 bmaltais

Nvidia Card:

ubuntu@ip-172-31-2-139:~/datasets/lora/brooklyn$ nvidia-smi --query-gpu=name --format=csv,noheader
Tesla T4

I get another error after changing to AdamW:

Traceback (most recent call last):
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/gradio/routes.py", line 384, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1024, in process_api
    result = await self.call_function(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/gradio/blocks.py", line 836, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/ubuntu/kohya_ss/textual_inversion_gui.py", line 308, in train_model
    msgbox('Image folder path is missing')
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/easygui/boxes/derived_boxes.py", line 230, in msgbox
    return buttonbox(msg=msg,
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/easygui/boxes/button_box.py", line 95, in buttonbox
    bb = ButtonBox(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/easygui/boxes/button_box.py", line 147, in __init__
    self.ui = GUItk(msg, title, choices, images, default_choice, cancel_choice, self.callback_ui)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/easygui/boxes/button_box.py", line 263, in __init__
    self.boxRoot = tk.Tk()
  File "/usr/lib/python3.10/tkinter/__init__.py", line 2299, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

andirsun avatar Mar 08 '23 19:03 andirsun

I wish I could help, but a user contributed the PR that supports the GUI on Linux and I do not use Linux myself, so I can't help with this issue. Based on the last line, though, it appears to be related to a missing environment configuration.

Maybe Linux does not install the tk module by default?
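
For reference, "no display name and no $DISPLAY environment variable" is the message Tk raises when it cannot find an X display, which is typical on a headless server or over SSH without X forwarding. A minimal sketch of a guard follows. It assumes a hypothetical `notify` helper (this is not how kohya_ss implements its message boxes) that falls back to printing when no display is available.

```python
import os
import sys


def has_display(env=None):
    """Return True if an X display is advertised through $DISPLAY."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


def notify(message, env=None, stream=None):
    """Show a message box when a display exists, otherwise print the message.

    Returns "gui" or "console" so callers can tell which path was taken.
    """
    stream = sys.stdout if stream is None else stream
    if has_display(env):
        try:
            # Imported lazily so headless machines never touch Tk.
            import tkinter
            from tkinter import messagebox

            root = tkinter.Tk()
            root.withdraw()  # hide the empty root window
            messagebox.showinfo("Message", message)
            root.destroy()
            return "gui"
        except Exception:
            pass  # Tk missing or display unusable: use the console instead
    print(message, file=stream)
    return "console"
```

On a server without a display, `notify("Image folder path is missing")` would print the text to the terminal instead of raising `_tkinter.TclError`.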

bmaltais avatar Mar 08 '23 19:03 bmaltais

I am going to test

andirsun avatar Mar 10 '23 02:03 andirsun

I had the same error with AdamW8bit. I switched to AdamW and it worked.

Also, before doing all this I had to run something like apt-get update -y && apt-get install python3-tk -y because tkinter was not being found.
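
To check whether tkinter is present before or after installing the package, a small Python sketch like the one below may help. The helper names are hypothetical. The check only locates the modules and does not open a window, so it also works on a machine without a display.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def tkinter_available():
    """Return True if both the tkinter package and its C extension exist."""
    return module_available("tkinter") and module_available("_tkinter")


if __name__ == "__main__":
    print("tkinter available:", tkinter_available())
```

Run it with the same interpreter that the virtual environment uses, since a system-wide package may not be visible from every Python installation.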

ghost avatar Mar 14 '23 15:03 ghost

@andirsun Check that you provided a correct path to your image folder. In the middle of your error message you can see this:

  File "/home/ubuntu/kohya_ss/textual_inversion_gui.py", line 308, in train_model
    msgbox('Image folder path is missing')

Make sure to use a correct absolute path, like /workspace/myloratraining/image
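
A quick way to sanity-check a folder path before pasting it into the GUI is sketched below in Python. The function name `check_image_folder` is hypothetical, and the checks are an assumption about what counts as "correct" here (non-empty, absolute, existing directory), not the validation that kohya_ss itself performs.

```python
import os


def check_image_folder(path):
    """Return a list of problems with `path`; an empty list means it looks usable."""
    problems = []
    if not path:
        problems.append("path is empty")
        return problems
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    if not os.path.isdir(path):
        problems.append("path is not an existing directory")
    return problems


if __name__ == "__main__":
    # Example path taken from the comment above; adjust it to your own setup.
    print(check_image_folder("/workspace/myloratraining/image"))
```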

ppetrucz avatar Mar 23 '23 14:03 ppetrucz

I changed to AdamW instead of AdamW8bit. That cleared the error, but training is too slow...

ohminy avatar Mar 24 '23 09:03 ohminy

    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

@andirsun I changed the model output name, and that resolved the issue.

polym avatar Apr 24 '23 07:04 polym

Traceback (most recent call last):
  File "/data/7082/venv/lib/python3.8/site-packages/gradio/routes.py", line 399, in run_predict
    output = await app.get_blocks().process_api(
  File "/data/7082/venv/lib/python3.8/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/data/7082/venv/lib/python3.8/site-packages/gradio/blocks.py", line 1022, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data/7082/venv/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/data/7082/venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/data/7082/venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/data/7082/library/common_gui.py", line 240, in get_folder_path
    root = Tk()
  File "/usr/lib/python3.8/tkinter/__init__.py", line 2270, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

I have tried almost every method I could find online, but I still cannot fix it...

24mlight avatar May 10 '23 02:05 24mlight

@bmaltais Same error, but I can't switch to AdamW because I don't have enough GPU memory. I have to rely on AdamW8bit to reduce memory use. How should I solve this problem?
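
For context on why the 8-bit optimizer matters for memory: Adam-style optimizers keep two state values per trainable parameter, stored as 32-bit floats by default, while 8-bit variants store them in roughly one byte each. A back-of-the-envelope Python sketch follows. It ignores quantization metadata and any tensors that stay in 32-bit, and the parameter count in the usage lines is an illustrative assumption, not a measurement from this thread.

```python
def adam_state_bytes(num_params, bytes_per_value):
    """Approximate memory for Adam's two per-parameter state tensors."""
    states_per_param = 2  # first and second moment estimates
    return num_params * states_per_param * bytes_per_value


def gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


if __name__ == "__main__":
    n = 860_000_000  # illustrative assumption for a full U-Net fine-tune
    print(f"32-bit states: {gib(adam_state_bytes(n, 4)):.2f} GiB")
    print(f" 8-bit states: {gib(adam_state_bytes(n, 1)):.2f} GiB")
```

The difference is largest when the whole model is trained (as with train_db.py) and much smaller for LoRA, where only the adapter weights carry optimizer state.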

dailingx avatar May 10 '23 05:05 dailingx

I am going to test

How did you solve this problem?

dailingx avatar May 10 '23 05:05 dailingx

I have solved this problem! 760

24mlight avatar May 10 '23 05:05 24mlight

My GPU is a T4. This is my error:

84 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
  batch_size: 1
  resolution: (512, 512)
  enable_bucket: False

  [Subset 0 of Dataset 0]
    image_dir: "/home/stable-diffusion/lora-scripts/train/mmc-chick2/7_ConceptChick"
    image_count: 12
    num_repeats: 7
    shuffle_caption: False
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: ConceptChick
    caption_extension: .caption


[Dataset 0]
loading image sizes.
100%|██████████| 12/12 [00:00<00:00, 2434.66it/s]
prepare dataset
prepare accelerator
Using accelerator 0.15.0 or above.
load Diffusers pretrained models
Fetching 15 files: 100%|██████████| 15/15 [00:00<00:00, 60669.78it/s]
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/transformers/modeling_utils.py:402: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(checkpoint_file, framework="pt") as f:
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage = cls(wrap_storage=untyped_storage)
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/safetensors/torch.py:98: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  with safe_open(filename, framework="pt", device=device) as f:
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Replace CrossAttention.forward to use xformers
[Dataset 0]
caching latents.
100%|██████████| 12/12 [00:02<00:00,  4.57it/s]
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64')}
  warn(msg)
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: /home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64: did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/bin/nohup /home/stable-diffusion/kohya_ss/gui.sh --listen 11.167.91.148 --server_port 7861 --inbrowser')}
  warn(msg)
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
  warn(msg)
/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
  warn(msg)
prepare optimizer, data loader etc.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
use 8-bit AdamW optimizer | {}
running training / 学習開始
  num train images * repeats / 学習画像の数×繰り返し回数: 84
  num reg images / 正則化画像の数: 0
  num batches per epoch / 1epochのバッチ数: 84
  num epochs / epoch数: 1
  batch size per device / バッチサイズ: 1
  total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ(並列学習、勾配合計含む): 1
  gradient ccumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 84
steps:   0%|          | 0/84 [00:00<?, ?it/s]/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/xformers/ops/fmha/flash.py:338: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  and inp.query.storage().data_ptr() == inp.key.storage().data_ptr()
epoch 1/1
Traceback (most recent call last):
  File "/home/stable-diffusion/kohya_ss/train_db.py", line 463, in <module>
    train(args)
  File "/home/stable-diffusion/kohya_ss/train_db.py", line 327, in train
    optimizer.step()
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/accelerate/optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 370, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 290, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 69, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/optim/optimizer.py", line 280, in wrapper
    out = func(*args, **kwargs)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 263, in step
    self.update_step(group, p, gindex, pindex)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 504, in update_step
    F.optimizer_update_8bit_blockwise(
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/functional.py", line 975, in optimizer_update_8bit_blockwise
    str2optimizer8bit_blockwise[optimizer_name][0](
NameError: name 'str2optimizer8bit_blockwise' is not defined
steps:   0%|          | 0/84 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/stable-diffusion/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/stable-diffusion/kohya_ss/venv/bin/python', 'train_db.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=/home/stable-diffusion/lora-scripts/train/mmc-chick2', '--resolution=512,512', '--output_dir=/home/stable-diffusion/kohya_ss/train/Stable-diffusion', '--logging_dir=/home/stable-diffusion/kohya_ss/logs', '--save_model_as=safetensors', '--output_name=last', '--max_data_loader_n_workers=0', '--learning_rate=1e-5', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=84', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--clip_skip=2', '--bucket_reso_steps=64', '--gradient_checkpointing', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.

dailingx avatar May 10 '23 06:05 dailingx

Oh! I fixed this problem, thanks to this piece of log information:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/stable-diffusion/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...

Look mainly at the WARNING line. I found that the wrong CUDA version was being loaded: the CUDA version I installed in the venv environment is 113, so why was the correct CUDA version not picked up at run time, and why "Searching in backup paths"?

I then read the source code at https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/cuda_setup/main.py and found that the environment variable LD_LIBRARY_PATH is used to locate CUDA. However, LD_LIBRARY_PATH was empty while my program was running, even though I had checked that the variable is set correctly on the Linux machine. So its value must have been reset at some point during the run.

Finally, I found the cause: my user on Linux is admin, and I ran gui.sh with sudo so that some runtime dependencies could be installed, but the sudo command resets the environment variables! I have fixed this issue in this PR: https://github.com/bmaltais/kohya_ss/pull/769, and this is what worked for me.
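
The check described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the bitsandbytes lookup logic and not the change made in the PR: the helper names `cudart_dirs` and `diagnose` are hypothetical, and the sketch only inspects LD_LIBRARY_PATH plus the SUDO_USER variable that sudo sets for the commands it runs.

```python
import glob
import os


def cudart_dirs(env=None):
    """List LD_LIBRARY_PATH entries that contain a libcudart.so* file."""
    env = os.environ if env is None else env
    hits = []
    for entry in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if entry and glob.glob(os.path.join(entry, "libcudart.so*")):
            hits.append(entry)
    return hits


def diagnose(env=None):
    """Return a short hint about how the CUDA runtime lookup is likely to go."""
    env = os.environ if env is None else env
    if not env.get("LD_LIBRARY_PATH"):
        if env.get("SUDO_USER"):
            return "LD_LIBRARY_PATH is empty and the process runs under sudo"
        return "LD_LIBRARY_PATH is empty"
    if not cudart_dirs(env):
        return "LD_LIBRARY_PATH is set but no entry contains libcudart.so"
    return "libcudart.so found via LD_LIBRARY_PATH"


if __name__ == "__main__":
    print(diagnose())
```

Running this once from a normal shell and once from inside the sudo-launched process should show whether the variable is being dropped. Possible workarounds include launching without sudo or setting the variable inside the elevated command. What the PR itself changes is not shown in this thread.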

dailingx avatar May 11 '23 08:05 dailingx