
M3 Max, 40-core GPU, 128 GB RAM

Open dajanaelez opened this issue 10 months ago • 8 comments

CRASH REPORT TERMINAL.pdf

I am sending a short report of the crashes I received after all the fixing and updating. I could manage torch 2.2.0, and I use Python 3.10.14 in a venv, with diffusers updated, etc.

dajanaelez avatar Mar 31 '24 11:03 dajanaelez
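
For anyone comparing setups in this thread, the versions mentioned above (Python, torch, venv) can be collected in one go. This is a minimal sketch, not part of kohya_ss; the function name `describe_environment` is made up for illustration, and torch is imported inside a try block so the script still runs where torch is not installed.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect the facts people keep asking for in this thread."""
    info = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # "arm64" on Apple Silicon
        "in_venv": sys.prefix != sys.base_prefix,
        "torch": None,
        "mps_available": False,
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; report what we have
        return info
    info["torch"] = torch.__version__
    # torch.backends.mps is missing on very old builds, so look it up defensively
    mps = getattr(torch.backends, "mps", None)
    info["mps_available"] = bool(mps is not None and mps.is_available())
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the venv activated, so that it reports the interpreter and packages kohya_ss actually uses.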

There appear to be 2 issues affecting Macs (MPS) in kohya 23.0.15:

  1. Different required torch/torchvision packages (see: https://www.reddit.com/r/StableDiffusion/comments/15izfrl/sdxl_lora_training_with_kohya_ss_on_apple_silicon/kvbas5s/)

Steps to resolve appear to be:

  • source venv/bin/activate
  • pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
  • Modify the first line of requirements_macos_arm64.txt to say "torch==2.4.0 torchvision==0.15.0" instead of the current values
  2. sd-scripts/library/train_util.py assumes CUDA exists on the system.

Steps to resolve appear to be:

  • Comment out these two lines at line 4970 of kohya_ss/sd-scripts/library/train_util.py:
    with torch.cuda.device(torch.cuda.current_device()):
        torch.cuda.empty_cache()

That enables me to get far enough to start training without an error or assert being thrown. I haven't been able to evaluate the results of the training yet.

panicsteve avatar Apr 01 '24 17:04 panicsteve
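
Instead of commenting the two lines out, the cache clear could be guarded so that it only calls into CUDA when CUDA is present. The following is a hedged sketch, not the code in train_util.py: the helper name `clear_device_cache` is hypothetical, it drops the `torch.cuda.device(...)` context from the original lines, and it assumes a torch build that ships `torch.mps.empty_cache` (looked up defensively in case it does not).

```python
try:
    import torch  # optional so the sketch also runs where torch is absent
except ImportError:
    torch = None


def clear_device_cache() -> str:
    """Release cached memory on the active backend.

    Returns the name of the backend that was cleared, or "none".
    """
    if torch is None:
        return "none"
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return "cuda"
    mps_backend = getattr(torch.backends, "mps", None)
    mps_empty_cache = getattr(getattr(torch, "mps", None), "empty_cache", None)
    if mps_backend is not None and mps_backend.is_available() and mps_empty_cache:
        mps_empty_cache()
        return "mps"
    # CPU-only build: nothing to clear
    return "none"


if __name__ == "__main__":
    print(clear_device_cache())
```

On a machine without CUDA this never touches `torch.cuda.current_device()`, which is the call that fails in the original two lines.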

hello,

Thank you so much for the scripts and the quick answer. It seems to have solved a few of the problems, but the new run now produces different problems...


-- Dajana Elez

Master of Arts in Architecture, Städelschule, Frankfurt

Architecture Engineer, Technical University, Belgrade Hamburg / Berlin / Belgrade

dajanaelez avatar Apr 02 '24 07:04 dajanaelez

crash user config_ kohya_sskohya_sskohya_ssvenvlibpython3.10… 2.pdf

I really hope you could have a quick look at the new issue I am receiving.

dajanaelez avatar Apr 09 '24 13:04 dajanaelez

Also, the main message that comes with it is below. I keep receiving this error when trying to train the model locally:

Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow should only be instantiated on the main thread!'

dajanaelez avatar Apr 10 '24 06:04 dajanaelez

Hello, did anyone by any chance find a solution? I followed the tip from @panicsteve, but unfortunately the training process stopped with the following error:

"RuntimeError: User specified an unsupported autocast device_type 'mps'"

I'm using an M1 Ultra with 128GB. Thanks a lot!


22:19:47-865199 INFO     Start training Dreambooth...
22:19:47-866936 INFO     Validating lr scheduler arguments...
22:19:47-868933 INFO     Validating optimizer arguments...
22:19:47-870879 INFO     Validating /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/log existence and writability... SUCCESS
22:19:47-872704 INFO     Validating /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model existence and writability... SUCCESS
22:19:47-874695 INFO     Validating /Users/spk/Downloads/v1-5-pruned.safetensors existence... SUCCESS
22:19:47-875971 INFO     Validating /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/images existence... SUCCESS
22:19:47-877613 INFO     Folder 20_downloads: 20 repeats found
22:19:47-879296 INFO     Folder 20_downloads: 4 images found
22:19:47-881052 INFO     Folder 20_downloads: 4 * 20 = 80 steps
22:19:47-882344 INFO     Regulatization factor: 1
22:19:47-883423 INFO     Total steps: 80
22:19:47-884603 INFO     Train batch size: 1
22:19:47-885183 INFO     Gradient accumulation steps: 1
22:19:47-885607 INFO     Epoch: 10
22:19:47-885961 INFO     Max train steps: 1600
22:19:47-886327 INFO     lr_warmup_steps = 160
22:19:47-888063 INFO     Saving training config to /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/siemens-model-v1_20240512-221947.json...
22:19:47-889541 INFO     Executing command: /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/bin/accelerate launch --dynamo_backend no --dynamo_mode
                         default --mixed_precision fp16 --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2
                         /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py --config_file
                         /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947.toml
22:19:47-902822 INFO     Command executed.
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
2024-05-12 22:19:55 WARNING  WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:                                              _cpp_lib.py:144
                                 PyTorch 2.0.0 with CUDA None (you have 2.3.0)
                                 Python  3.10.14 (you have 3.10.14)
                               Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
                               Memory-efficient attention, SwiGLU, sparse and more won't be available.
                               Set XFORMERS_MORE_DETAILS=1 for more details
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
2024-05-12 22:20:03 INFO     Loading settings from                                                                                                         train_util.py:4308
                             /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947.toml...
                    INFO     /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947               train_util.py:4327
2024-05-12 22:20:03 INFO     prepare tokenizer                                                                                                             train_util.py:4861
                    INFO     update token length: 75                                                                                                       train_util.py:4884
                    INFO     prepare images.                                                                                                               train_util.py:1848
                    INFO     found directory /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/images/20_downloads contains 4 image  train_util.py:1773
                             files
                    INFO     80 train images with repeating.                                                                                               train_util.py:1891
                    INFO     0 reg images.                                                                                                                 train_util.py:1894
                    WARNING  no regularization images / 正則化画像が見つかりませんでした                                                                   train_util.py:1901
                    INFO     [Dataset 0]                                                                                                                   config_util.py:565
                               batch_size: 1
                               resolution: (512, 512)
                               enable_bucket: True
                               network_multiplier: 1.0
                               min_bucket_reso: 256
                               max_bucket_reso: 2048
                               bucket_reso_steps: 64
                               bucket_no_upscale: True

                               [Subset 0 of Dataset 0]
                                 image_dir: "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/images/20_downloads"
                                 image_count: 4
                                 num_repeats: 20
                                 shuffle_caption: False
                                 keep_tokens: 0
                                 keep_tokens_separator:
                                 secondary_separator: None
                                 enable_wildcard: False
                                 caption_dropout_rate: 0.0
                                 caption_dropout_every_n_epoches: 0
                                 caption_tag_dropout_rate: 0.0
                                 caption_prefix: None
                                 caption_suffix: None
                                 color_aug: False
                                 flip_aug: False
                                 face_crop_aug_range: None
                                 random_crop: False
                                 token_warmup_min: 1,
                                 token_warmup_step: 0,
                                 is_reg: False
                                 class_tokens: downloads
                                 caption_extension: .txt


                    INFO     [Dataset 0]                                                                                                                   config_util.py:571
                    INFO     loading image sizes.                                                                                                           train_util.py:974
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 2165.92it/s]
                    INFO     make buckets                                                                                                                   train_util.py:980
                    WARNING  min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size      train_util.py:999
                             automatically /
                             bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視
                             されます
                    INFO     number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)                                               train_util.py:1036
                    INFO     bucket 0: resolution (576, 320), count: 20                                                                                    train_util.py:1048
                    INFO     bucket 1: resolution (640, 384), count: 40                                                                                    train_util.py:1048
                    INFO     bucket 2: resolution (768, 320), count: 20                                                                                    train_util.py:1048
                    INFO     mean ar error (without repeats): 0.1001269544878931                                                                           train_util.py:1053
                    INFO     prepare accelerator                                                                                                              train_db.py:106
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/torch/amp/grad_scaler.py:131: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn(
accelerator device: mps
                    INFO     loading model for process 0/1                                                                                                 train_util.py:5053
                    INFO     load StableDiffusion checkpoint: /Users/spk/Downloads/v1-5-pruned.safetensors                                     train_util.py:5000
                    INFO     UNet2DConditionModel: 64, 8, 768, False, False                                                                             original_unet.py:1387
2024-05-12 22:20:10 INFO     loading u-net: <All keys matched successfully>                                                                                model_util.py:1009
                    INFO     loading vae: <All keys matched successfully>                                                                                  model_util.py:1017
2024-05-12 22:20:13 INFO     loading text encoder: <All keys matched successfully>                                                                         model_util.py:1074
                    INFO     Enable xformers for U-Net                                                                                                     train_util.py:3083
                    INFO     [Dataset 0]                                                                                                                   train_util.py:2418
                    INFO     caching latents.                                                                                                              train_util.py:1120
                    INFO     checking cache validity...                                                                                                    train_util.py:1130
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 199728.76it/s]
                    INFO     caching latents...                                                                                                            train_util.py:1171
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.57it/s]
prepare optimizer, data loader etc.
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'
2024-05-12 22:20:17 INFO     use 8-bit AdamW optimizer | {}                                                                                                train_util.py:4463
Traceback (most recent call last):
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py", line 529, in <module>
    train(args)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py", line 239, in train
    unet, text_encoder, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1213, in prepare
    result = tuple(
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1214, in <genexpr>
    self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1094, in _prepare_one
    return self.prepare_model(obj, device_placement=device_placement)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1280, in prepare_model
    autocast_context = get_mixed_precision_context_manager(self.native_amp, self.autocast_handler)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1534, in get_mixed_precision_context_manager
    return torch.autocast(device_type=state.device.type, dtype=torch.float16, **autocast_kwargs)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 241, in __init__
    raise RuntimeError(
RuntimeError: User specified an unsupported autocast device_type 'mps'
Traceback (most recent call last):
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/bin/python', '/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py', '--config_file', '/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947.toml']' returned non-zero exit status 1.
22:20:18-195206 INFO     Training has ended.

spyroskotsakis avatar May 12 '24 20:05 spyroskotsakis
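
The log above shows three CUDA-oriented choices being applied on an MPS device: `--mixed_precision fp16` (which ends in the autocast error under the torch 2.3.0 build shown), the 8-bit AdamW optimizer (bitsandbytes reports no GPU support), and xFormers (which cannot load its C++/CUDA extensions). One possible direction is to swap those settings before launching. The sketch below only illustrates that idea: the function name is hypothetical, the keys `mixed_precision`, `optimizer_type` and `xformers` are assumed names modeled on the log, and whether these changes are enough to train on MPS is not confirmed anywhere in this thread.

```python
def adapt_args_for_device(args: dict, device_type: str) -> dict:
    """Return a copy of args with CUDA-only choices replaced on other devices.

    The key names are assumptions modeled on the log, not a verified schema.
    """
    adapted = dict(args)
    if device_type == "cuda":
        return adapted
    # fp16 autocast raised "unsupported autocast device_type 'mps'" in the log
    adapted["mixed_precision"] = "no"
    # bitsandbytes was built without GPU support, so avoid the 8-bit optimizer
    if str(adapted.get("optimizer_type", "")).lower() == "adamw8bit":
        adapted["optimizer_type"] = "AdamW"
    # xFormers could not load its C++/CUDA extensions
    adapted["xformers"] = False
    return adapted


if __name__ == "__main__":
    requested = {"mixed_precision": "fp16", "optimizer_type": "AdamW8bit", "xformers": True}
    print(adapt_args_for_device(requested, "mps"))
```

Training in full precision uses more memory than fp16, so batch size or resolution may need to come down as well.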

Hello,

No, I didn’t. Actually, it connects and disconnects, as if through someone’s analog work, weird … and I don’t think I will use it locally yet. I didn’t like how it affected the new hardware.



dajanaelez avatar May 12 '24 20:05 dajanaelez

Hi @dajanaelez, I'm also running an M3 Max and made a little headway. Not sure if this will help anyone else, but let me know if there are other settings you need me to check!

It now at least runs far enough to create the sample images. The output is not close to the character I was trying to create, but it runs.

Running Python: Python 3.10.14

(I did delete the venv folder and re-activate everything after making changes to things like the requirements_macos_arm64.txt file.)

requirements_macos_arm64.txt

torch>=2.4.0 torchvision>=0.15.0
bitsandbytes==0.41.1
tensorflow-macos tensorflow-metal tensorboard==2.14.1
onnxruntime==1.17.1
-r requirements.txt

requirements.txt

accelerate==0.25.0
aiofiles==23.2.1
altair==4.2.2
dadaptation==3.1
diffusers[torch]==0.25.0
easygui==0.98.3
einops==0.7.0
fairscale==0.4.13
ftfy==6.1.1
gradio==4.43.0
huggingface-hub==0.20.1
imagesize==1.4.1
invisible-watermark==0.2.0
lion-pytorch==0.0.6
lycoris_lora==2.2.0.post3
omegaconf==2.3.0
onnx==1.16.1
prodigyopt==1.0
protobuf==3.20.3
open-clip-torch==2.20.0
opencv-python==4.7.0.68
prodigyopt==1.0
pytorch-lightning==1.9.0
rich>=13.7.1
safetensors==0.4.2
scipy==1.11.4
timm==0.6.12
tk==0.1.0
toml==0.10.2
transformers==4.38.0
voluptuous==0.13.1
wandb==0.15.11
scipy==1.11.4
# for kohya_ss library
-e ./sd-scripts # no_verify leave this to specify not checking this a verification stage

Here is the saved settings file (update things like your model name and username, of course, if you use it): SettingsToShare.json

JoeyOverby avatar Sep 07 '24 20:09 JoeyOverby
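
A note for anyone copying the list above: prodigyopt and scipy each appear twice. Here both copies carry the same pin, but duplicates are easy to miss when editing versions by hand. The following is a small sketch for spotting them; the function name `duplicate_requirements` is made up for this example, and it only handles the simple `name==version` style used in this thread, not full pip requirement syntax.

```python
import re
from collections import Counter


def duplicate_requirements(lines):
    """Return package names that are listed more than once, sorted."""
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r / -e
            continue
        # the macOS file in this thread puts several packages on one line
        for token in line.split():
            name = re.split(r"[<>=!~\[]", token, maxsplit=1)[0].lower()
            if name:
                names.append(name)
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    sample = [
        "prodigyopt==1.0",
        "protobuf==3.20.3",
        "prodigyopt==1.0",
        "scipy==1.11.4",
        "scipy==1.11.4",
        "# for kohya_ss library",
        "-e ./sd-scripts # no_verify",
    ]
    print(duplicate_requirements(sample))  # ['prodigyopt', 'scipy']
```

To check a real file, pass `open("requirements.txt").read().splitlines()` instead of the sample list.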

I fixed another issue with Mac, and put the sample code into the other issue here: https://github.com/bmaltais/kohya_ss/issues/2805#issuecomment-2349652539

JoeyOverby avatar Sep 13 '24 17:09 JoeyOverby