kohya_ss

Training breaks with "A matching Triton is not available, some optimizations will not be enabled. Error caught was: No module named 'triton'" even when no Triton-dependent options are enabled

Open MarineBirch723 opened this issue 11 months ago • 15 comments

MarineBirch723 avatar Mar 11 '24 18:03 MarineBirch723

This is normal on Windows. Triton reports errors but does not cause any issues.

bmaltais avatar Mar 11 '24 18:03 bmaltais

Except that it does for me. It keeps queuing the message, so training doesn't start.

MarineBirch723 avatar Mar 11 '24 19:03 MarineBirch723

Hmm, try training one of the test configs under test/config.

For example, try the standard adamw8bit one.

Does it work?

bmaltais avatar Mar 11 '24 19:03 bmaltais

@MarineBirch723 It will give that error once at the start and then like 8 times consecutively before finally starting training. Did you wait till then?

iamrohitanshu avatar Mar 13 '24 06:03 iamrohitanshu

For me it repeats 24 times (the amount I've given for max data loader). Then it starts training. Correlation between those two is beyond my knowledge.
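A likely explanation for the correlation, offered as an assumption rather than something confirmed in this thread: on Windows, PyTorch starts each data loader worker as a fresh Python process, and each one re-imports the training modules, so a warning emitted at import time is printed once per worker. The sketch below only simulates that with plain subprocesses. `simulate_workers` and the message text are hypothetical, and nothing here calls kohya_ss or PyTorch.

```python
import subprocess
import sys

# Hypothetical stand-in for a library that warns at import time
# when an optional dependency (such as Triton) is missing.
IMPORT_TIME_CODE = "print('optional module not available, continuing without it')"


def simulate_workers(num_workers: int) -> list[str]:
    """Start num_workers fresh interpreters and collect what each prints.

    Each subprocess plays the role of one spawned data loader worker that
    re-runs import-time code. This is a simulation, not PyTorch itself.
    """
    messages = []
    for _ in range(num_workers):
        result = subprocess.run(
            [sys.executable, "-c", IMPORT_TIME_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
        messages.append(result.stdout.strip())
    return messages


if __name__ == "__main__":
    # One message per simulated worker.
    for line in simulate_workers(3):
        print(line)
```

If that explanation holds, the number of repeats should track the max data loader value, which matches the 8 and 24 repeats reported here.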

Typical kohya_ss behaviour, though: I have Triton installed, and it works flawlessly in other tests within the venv specific to kohya_ss.

user83922 avatar Mar 15 '24 14:03 user83922

I always set max data loader to avoid this annoying behaviour

bmaltais avatar Mar 15 '24 16:03 bmaltais

@user83922 Yes, I have data loader set as 8. So, that's the reason. Good to know, Thanks! @bmaltais What do you set it to? 1? or 0? Or are you saying you set it in some other way?

iamrohitanshu avatar Mar 15 '24 18:03 iamrohitanshu

@iamrohitanshu simply set it to 0

bmaltais avatar Mar 15 '24 20:03 bmaltais

@bmaltais "ValueError: persistent_workers option needs num_workers > 0" So I had to set it to 1.
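For context, that ValueError is raised by PyTorch's DataLoader, which rejects persistent_workers=True when num_workers is 0. Below is a minimal sketch of how a script could reconcile the two settings before building the loader. `loader_kwargs` is a hypothetical helper, not a kohya_ss function, and it deliberately avoids importing PyTorch.

```python
def loader_kwargs(num_workers: int, persistent_workers: bool) -> dict:
    """Return DataLoader keyword arguments that do not conflict.

    Persistent workers only make sense when at least one worker
    process exists, so the flag is dropped when num_workers is 0.
    """
    if num_workers < 0:
        raise ValueError("num_workers must be >= 0")
    return {
        "num_workers": num_workers,
        "persistent_workers": persistent_workers and num_workers > 0,
    }


if __name__ == "__main__":
    print(loader_kwargs(0, True))
    print(loader_kwargs(1, True))
```

With 0 workers, data is loaded in the main process and no extra worker processes are started, so the alternative to using 1 worker is to turn off the persistent workers option.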

iamrohitanshu avatar Mar 16 '24 08:03 iamrohitanshu

I'm on Windows 11 with Python 3.10, and I downloaded triton-2.1.0-cp310-cp310-win_amd64.whl from https://huggingface.co/Rodeszones/CogVLM-grounding-generalist-hf-quant4/tree/main

I went into the Kohya\venv\scripts folder, pasted the file there because I'm lazy, and renamed one of the pip executables to something unique because I always forget how to activate a local venv (global Python installs always mess things up in cmd). Then I ran pip1234 install triton-2.1.0-cp310-cp310-win_amd64.whl, and the error is gone now.
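To confirm the wheel landed in the venv that kohya_ss actually uses, a quick check like the following can be run with that venv's python.exe. It is a minimal sketch using only the standard library, and `is_importable` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which interpreter is running, so a global install is not mistaken for the venv.
    print("interpreter:", sys.executable)
    print("triton importable:", is_importable("triton"))
```

Running the venv's own python.exe with -m pip install followed by the wheel file name also avoids having to activate the venv or rename pip.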

RandomGitUser321 avatar Mar 19 '24 09:03 RandomGitUser321

@RandomGitUser321 Does it offer any speed or memory benefits to you?

iamrohitanshu avatar Mar 19 '24 09:03 iamrohitanshu

@iamrohitanshu No, not that I can see. I made a duplicate installation of kohya where I installed triton for it, then reran the setup.bat just to make sure there weren't any steps in there that would branch based on it being there or not. Still seeing the same numbers across the board for VRAM and it/s.

At first I thought it made things worse, but it turned out that there are some bugs with reloading configs. I ended up testing by just running the accelerate command-line wall of text for both versions of the installation and saw pretty much identical results.

RandomGitUser321 avatar Mar 19 '24 10:03 RandomGitUser321

@RandomGitUser321 Thank you for sharing your experience. I was thinking of installing Triton on Windows, but I see it's not useful. Maybe it only helps on Linux.

iamrohitanshu avatar Mar 19 '24 18:03 iamrohitanshu

I have added built-in support for installing Triton 2.1.0 for Windows in the setup.bat menu. Look in the dev branch or wait for the next release.

bmaltais avatar Mar 19 '24 23:03 bmaltais

The same thing is happening to me, do you know how to solve it?

waifuista avatar Mar 23 '24 22:03 waifuista