Can it support macOS (M1)?

Open dbk1985 opened this issue 1 year ago • 8 comments

dbk1985 avatar Mar 08 '23 07:03 dbk1985

If someone with a Mac can provide a PR to make it work on M1 Macs, I could integrate it, like I did for Linux. The biggest challenge will be to make the kohya sd-scripts repo support M1 Macs... This is not something I can do. So until kohya adds M1 support, I won't be able to support it.

bmaltais avatar Mar 08 '23 11:03 bmaltais

I tried to run the script on my MacBook Pro M1. First of all, my MacBook had no Java. After I installed Java and ran it again, this was the result:

bash ubuntu_setup.sh
installing tk
Password:
The operation couldn’t be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.

Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu116
ERROR: Could not find a version that satisfies the requirement torch==1.12.1+cu116 (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1)
ERROR: No matching distribution found for torch==1.12.1+cu116

[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: pip install --upgrade pip
Processing /Users/wesleychong21/AI/kohya_ss
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting accelerate==0.15.0
  Using cached accelerate-0.15.0-py3-none-any.whl (191 kB)
Collecting albumentations==1.3.0
  Using cached albumentations-1.3.0-py3-none-any.whl (123 kB)
Collecting altair==4.2.2
  Using cached altair-4.2.2-py3-none-any.whl (813 kB)
Collecting bitsandbytes==0.35.0
  Using cached bitsandbytes-0.35.0-py3-none-any.whl (62.5 MB)
Collecting dadaptation==1.5
  Using cached dadaptation-1.5.tar.gz (8.3 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting diffusers[torch]==0.10.2
  Using cached diffusers-0.10.2-py3-none-any.whl (503 kB)
Collecting easygui==0.98.3
  Using cached easygui-0.98.3-py2.py3-none-any.whl (92 kB)
Collecting einops==0.6.0
  Using cached einops-0.6.0-py3-none-any.whl (41 kB)
Collecting ftfy==6.1.1
  Using cached ftfy-6.1.1-py3-none-any.whl (53 kB)
Collecting gradio==3.19.1
  Using cached gradio-3.19.1-py3-none-any.whl (14.2 MB)
Collecting lion-pytorch==0.0.6
  Using cached lion_pytorch-0.0.6-py3-none-any.whl (4.2 kB)
Collecting opencv-python==4.7.0.68
  Using cached opencv_python-4.7.0.68-cp37-abi3-macosx_11_0_arm64.whl (31.1 MB)
Collecting pytorch-lightning==1.9.0
  Using cached pytorch_lightning-1.9.0-py3-none-any.whl (825 kB)
Collecting safetensors==0.2.6
  Using cached safetensors-0.2.6-cp310-cp310-macosx_12_0_arm64.whl (389 kB)
Collecting tensorboard==2.10.1
  Using cached tensorboard-2.10.1-py3-none-any.whl (5.9 MB)
Collecting tk==0.1.0
  Using cached tk-0.1.0-py3-none-any.whl (3.9 kB)
Collecting transformers==4.26.0
  Using cached transformers-4.26.0-py3-none-any.whl (6.3 MB)
Collecting fairscale==0.4.13
  Using cached fairscale-0.4.13.tar.gz (266 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting requests==2.28.2
  Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Collecting timm==0.6.12
  Using cached timm-0.6.12-py3-none-any.whl (549 kB)
Collecting huggingface-hub==0.12.0
  Using cached huggingface_hub-0.12.0-py3-none-any.whl (190 kB)
ERROR: Could not find a version that satisfies the requirement tensorflow==2.10.1 (from versions: none)
ERROR: No matching distribution found for tensorflow==2.10.1

[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: pip install --upgrade pip
ERROR: xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl is not a supported wheel on this platform.

[notice] A new release of pip available: 22.3.1 -> 23.0.1
[notice] To update, run: pip install --upgrade pip
--------------------------------------------------------------------------------
In which compute environment are you running?
Please select a choice using the arrow or number keys, and selecting with enter
 ➔ This machine
    AWS (Amazon SageMaker)

wesleychong21 avatar Mar 08 '23 12:03 wesleychong21

This is my setup:

In which compute environment are you running?
This machine
--------------------------------------------------------------------------------
Which type of machine are you using?
No distributed training
Do you want to run your training on CPU only (even if a GPU is available)? [yes/NO]:no
Do you wish to optimize your script with torch dynamo?[yes/NO]:no
Do you want to use DeepSpeed? [yes/NO]: no
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]:
--------------------------------------------------------------------------------
Do you wish to use FP16 or BF16 (mixed precision)?
fp16
accelerate configuration saved at /Users/wesleychong21/.cache/huggingface/accelerate/default_config.yaml
setup finished! run ./gui.sh to start

wesleychong21 avatar Mar 08 '23 12:03 wesleychong21

After that, I ran ./gui.sh. This was the result:

Traceback (most recent call last):
  File "/Users/wesleychong21/AI/kohya_ss/kohya_gui.py", line 1, in <module>
    import gradio as gr
ModuleNotFoundError: No module named 'gradio'

Afterwards, I ran pip install gradio. The error is still the same.

wesleychong21 avatar Mar 08 '23 12:03 wesleychong21

I get the same issue. pip list shows that gradio is installed. But ./ubuntu_setup.sh does not work on macOS M1.

Ryan-Haines avatar Mar 12 '23 19:03 Ryan-Haines

I haven't had the time to deep dive this, but I did get a script going to the point that the GUI comes up on my M2 MacBook. Kohya macOS Setup script and requirements.txt. The GUI comes up; however, any button I press leads to an NSWindow error. I believe this stems from windows being created in Gradio or Tkinter outside of the main thread, which macOS does not allow. I'm not sure if that's the exact problem, but if I'm right, avoiding it could be a decent refactoring operation.

Exact error is: Python[52609:9631664] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow drag regions should only be invalidated on the Main Thread!'
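
For anyone who wants to check the main-thread theory, here is a minimal sketch. It assumes, as suggested above, that the button callbacks end up running on a worker thread; `open_dialog_guarded` is a hypothetical stand-in, not a function from kohya_ss or Gradio. It only shows how a callback can tell which thread it is on before it tries to create a window.

```python
import threading


def on_main_thread() -> bool:
    """Return True when the caller runs on the interpreter's main thread."""
    return threading.current_thread() is threading.main_thread()


def open_dialog_guarded() -> str:
    """Hypothetical stand-in for a button callback that would open a Tk dialog."""
    if not on_main_thread():
        # macOS requires windows to be created on the main thread,
        # so refuse here instead of letting the process abort.
        return "refused: not on the main thread"
    return "ok: safe to create a window here"


# Simulate a callback dispatched to a worker thread.
results = []
worker = threading.Thread(target=lambda: results.append(open_dialog_guarded()))
worker.start()
worker.join()

print(results[0])
```

A guard like this does not make the dialog work. It only turns the hard crash into a value the caller can handle, for example by falling back to typing the path manually.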

jstayco avatar Mar 24 '23 16:03 jstayco

I have the exact same problem when I want to select a directory. Hopefully we can find a fix. For now I just enter the directory path manually as text instead of clicking the "select directory" button. That seems to be working.

cannuri avatar Mar 26 '23 15:03 cannuri

'NSInternalInconsistencyException', reason: 'NSWindow drag regions should only be invalidated on the Main Thread!'

Same problem for me: if I click on "train model", the GUI crashes on M1.

nbdy-coder avatar Mar 26 '23 15:03 nbdy-coder

Seems like training on Apple Silicon is not possible. The script is looking for CUDA

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
/Users/tonic/Nerd/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
  warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!

cannuri avatar Mar 26 '23 19:03 cannuri

CUDA

Did you try to install CUDA? Is it working?

nbdy-coder avatar Mar 26 '23 19:03 nbdy-coder

There is now an implementation of macOS support. I can't test it myself, but I integrated someone else's work on this into the repo. Look at the latest release for support.

bmaltais avatar Mar 26 '23 22:03 bmaltais

@bmaltais Looking pretty good for some early support! Just throwing in there, you should be able to merge the requirements.txt without blowing up all your previous setups. That's why I threw the sys_platform catches in there, to ensure that the extra additions only apply to macOS when running pip install -r requirements.txt. You should be able to merge those free of charge, so you just have to maintain one file. Your gui.sh files look like they can be easily merged together as well using the pattern:

from sys import platform

if platform == "linux":
    pass  # Do the Linux stuff
elif platform == "darwin":
    pass  # Do the macOS stuff
elif platform == "win32" or platform == "cygwin":
    pass  # Do the Windows stuff

Highly recommend doing these merges so you only have to track 3 files: a gui.sh, requirements.txt, and setup.sh. Let me know if you don't have the time and I'll tackle that when I can.

jstayco avatar Mar 26 '23 23:03 jstayco

Oh, I also gave it a quick test spin and the buttons are still crashing on macOS per the above. I'm now 99% sure it's because Gradio is calling Tkinter and that's spawning windows outside the main thread, which macOS doesn't like. With that in mind, you may want to throw in a big warning to users like "Pressing any button that spawns a window currently crashes the server on macOS!"

jstayco avatar Mar 26 '23 23:03 jstayco

I get this error "Could not initialize NNPACK! Reason: Unsupported hardware." wanted to test it out

maybe I need MPS though

skein12 avatar Mar 27 '23 10:03 skein12

Yes, it is because Tkinter is not running in the main thread. I tested it and it really is a thread problem. But that should be an easy fix, no?

Don't forget: you can still just type in the directory path manually without using the buttons. It works fine.

cannuri avatar Mar 27 '23 17:03 cannuri

No, I have an Apple with M2 Max chip inside. CUDA is for Nvidia GPUs only.

cannuri avatar Mar 27 '23 17:03 cannuri

I tried training with the latest version yesterday, after the last commit, but I still get the error regarding CUDA. We need to find a way to use Apple's GPUs for training instead of CUDA. But I am no expert in that field...

cannuri avatar Mar 27 '23 17:03 cannuri

I'm consolidating the installation work right now to unify all the platforms (PR incoming soon). Then I plan on tackling the thread issue (or at least attempting to). If I can get to the point where the buttons work, I'll move on to the next problem I find, which I have no doubt will be that CUDA bit. At that point? I'm no expert either, so I'm not sure how much work is involved there. I'm hoping it's as simple as some platform detection and device selection. If it's deeper, then that will take time.

jstayco avatar Mar 27 '23 18:03 jstayco

Alright folks, I'm trying my best to get a file picker up and running that runs in the same thread and doesn't use TK. I tried all manner of things to get TK going without any major code changes, but nothing seemed to work. I tried multithreading and joining the threads, the same with processes, and even tried subprocessing it. TK threw the error every time. I am now trying to come up with a different solution entirely that will replace the file dialog without breaking any existing infrastructure. I'm running into challenges as I'm not the best at webdev and I've never messed with Gradio, but I'm trying. For those interested, the exact issue is in these lines of code: https://github.com/bmaltais/kohya_ss/blob/master/library/common_gui.py#L129-L146

To be clear, there is nothing wrong with the existing implementation at all. The maintainer wasn't aiming at a macOS target, so TK works fine for all other use cases. It just so happens Apple implemented tighter security restrictions on window spawning and destroying controls.

jstayco avatar Mar 28 '23 01:03 jstayco

So, an update for you all. I ended up getting a file picker working, but I was only able to do that in two ways.

  1. A traditional upload dialog using custom Gradio component + HTML/JS.
  2. A built-in Gradio File upload component.

However, both of those methods require actually uploading the file(s). While that's OK for a config file, I don't think you all want to be copying 5 GB+ of models or anything like that every time you point to a file. The TK one works because it is native to the system and has direct access to the local file system, whereas your web browser does not (by design). I'm now considering other options, since it's largely cosmetic. I'm wondering if I can detect the client operating system and hide the broken buttons or something like that. I have to give it more thought, because keeping broken buttons in there is just confusing and a poor experience on macOS.

jstayco avatar Mar 30 '23 02:03 jstayco

I got the file dialogs to work! This is an extremely early WIP with a ton of refactoring and work to go. Seriously - it's broken. Do not use it in production under any circumstances at this time. However, if you would be so kind as to test it to verify functionality, that would be amazing!

I've done a ton here, so there are lots of moving parts under the hood, and plenty of it is BROKEN. You should, however, now be able to do a few things:

  1. Press the "Open" button on the first Dreambooth configuration tab and see that file dialog. It should filter only to JSON files.
  2. Select an invalid file and see a nicer pop-up than the older one telling you it is an invalid file.
  3. I do not have a valid JSON config file, so if someone could verify that one loads as normal (and it should, since I didn't touch that), that would be fantastic.
  4. The file dialogs and message boxes now run as separate processes from the main Python files and communicate back and forth. If you're interested in the code, the places to look are library/common_gui.py and library/gui_subprocesses.py.

HUGE WIP branch: https://github.com/jstayco/kohya_ss/tree/macos_gui
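
For readers curious about the separate-process idea in item 4, here is a rough sketch of the general technique. It is not the code from library/gui_subprocesses.py; `DIALOG_CHILD` and `ask_in_subprocess` are hypothetical names, and the JSON-over-stdout protocol is an assumption made for illustration. The point is that the child interpreter gets its own main thread, so Tk can create its window there, and only the chosen path travels back to the parent.

```python
import json
import subprocess
import sys

# Child program: runs in its own process, so Tk owns that process's main thread.
DIALOG_CHILD = """
import json, tkinter
from tkinter import filedialog
root = tkinter.Tk()
root.withdraw()  # hide the empty root window, show only the dialog
path = filedialog.askopenfilename(filetypes=[("JSON files", "*.json")])
root.destroy()
print(json.dumps({"path": path}))
"""


def ask_in_subprocess(child_source: str = DIALOG_CHILD) -> str:
    """Run child_source in a fresh interpreter and return the path it reports."""
    proc = subprocess.run(
        [sys.executable, "-c", child_source],
        capture_output=True,
        text=True,
        check=True,
    )
    # The child prints one JSON object as its last line of stdout.
    return json.loads(proc.stdout.strip().splitlines()[-1])["path"]
```

Calling `ask_in_subprocess()` from a worker thread blocks that thread until the dialog closes. Passing a different `child_source` makes it possible to exercise the plumbing without opening a window.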

jstayco avatar Mar 30 '23 08:03 jstayco

It looks fine! However, it cannot load a configuration file in Dreambooth LoRA. Anyway, it can do some work on macOS now. Thanks for your work!

UCPHszf avatar Mar 30 '23 09:03 UCPHszf

@UCPHszf Would you be willing to share your valid configuration with me? Or a generic version of it?

I've never used Kohya_ss before or trained a LoRA model because I only have this MacBook to work with, so I'm doing this the hard way.

jstayco avatar Mar 30 '23 15:03 jstayco

Isn't it just as simple as changing "cuda" to "cpu" or "mps" for it to run on Apple Silicon? I'm just not sure where all those changes would have to be made.

witfyl-ravped avatar Apr 02 '23 18:04 witfyl-ravped

@witfyl-ravped To a point, I think so. I'm pretty sure the vast majority of this macOS support is the setup scripts and getting the GUI portions to function cross-platform. After that? I think I really do just have to patch some device selection logic and that's it. Or at least, I'm hoping as much.
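
As a rough idea of what that device selection logic might look like, here is a small sketch. `pick_device` and `detect_device` are hypothetical helpers, not functions from kohya_ss or sd-scripts, and whether the training scripts actually run on "mps" is not established in this thread (the bitsandbytes warnings above suggest at least one dependency expects CUDA). The only PyTorch calls assumed are `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's Metal backend (mps), then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch what is available; report "cpu" if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Keeping the decision in a pure function means the hard-coded "cuda" strings could be replaced by one call, and the choice can be checked without any GPU present.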

jstayco avatar Apr 02 '23 19:04 jstayco

@jstayco are you on the team? Is that something you guys can add?

I've tried doing it myself on a few similar repos and haven't quite figured it out yet. Seems so simple in theory though.

witfyl-ravped avatar Apr 02 '23 19:04 witfyl-ravped

@witfyl-ravped I'm not on a team (@bmaltais Yet? Where do I apply!? :P). To your question though: that's my eventual goal. I haven't even trained a LoRA model ever. This all started because I wanted to learn how to do that and every tutorial said "use kohya_ss on Windows". But I've got a Mac. So here I am trying to make it all work. Turns out I'm not the only one, so I'm trying my best!

jstayco avatar Apr 02 '23 19:04 jstayco

Interesting, yeah I've been able to run AUTOMATIC1111 just fine on my Apple Silicon. Haven't dug in but I assume Kohya could implement the same thing.

witfyl-ravped avatar Apr 02 '23 20:04 witfyl-ravped

I had the same problem on an MBP with an M2 chip, with the error "Could not find a version that satisfies the requirement tensorflow==2.10.1 (from versions: none)". After some searching, I got it installed successfully. The key point is to prepare the environment before installing.

Follow this link https://developer.apple.com/metal/tensorflow-plugin/ to install some deps, then run ./setup.sh. Now the missing tensorflow-macos and other packages should install without error.

rugbbyli avatar Apr 07 '23 02:04 rugbbyli

The packages that replace tensorflow in requirements.txt should be tensorflow-macos and tensorflow-metal. They are already included in the new WIP installer.

Replace the one tensorflow line in requirements.txt with these 3:

tensorflow==2.10.1; sys_platform != 'darwin'
tensorflow-macos==2.12.0; sys_platform == 'darwin'
tensorflow-metal==0.8.0; sys_platform == 'darwin'
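
To show how those sys_platform markers pick one line per platform, here is a deliberately simplified sketch. pip evaluates these markers itself during install, so nothing like this is needed in practice; `selected` is a hypothetical helper that only understands `sys_platform == '...'` and `sys_platform != '...'`, which is far less than the real marker grammar.

```python
import sys

REQUIREMENTS = [
    "tensorflow==2.10.1; sys_platform != 'darwin'",
    "tensorflow-macos==2.12.0; sys_platform == 'darwin'",
    "tensorflow-metal==0.8.0; sys_platform == 'darwin'",
]


def selected(lines, platform=sys.platform):
    """Return the requirement specs whose sys_platform marker matches platform."""
    chosen = []
    for line in lines:
        spec, _, marker = line.partition(";")
        marker = marker.strip()
        if not marker:
            # No marker: the requirement applies everywhere.
            chosen.append(spec.strip())
            continue
        name, op, value = marker.split()
        if name != "sys_platform" or op not in ("==", "!="):
            raise ValueError(f"unsupported marker: {marker}")
        matches = platform == value.strip("'\"")
        # Keep the line when "==" matched, or when "!=" did not match.
        if matches == (op == "=="):
            chosen.append(spec.strip())
    return chosen


print(selected(REQUIREMENTS, "darwin"))
print(selected(REQUIREMENTS, "linux"))
```

On "darwin" only the two macOS packages are selected, and on any other platform only the plain tensorflow pin is, which is why a single requirements.txt can serve every setup.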

jstayco avatar Apr 07 '23 02:04 jstayco