Training stuck, keeps starting, no task found
My training is stuck here. It keeps showing "running," but no task processes can be found when checking the GPU, there is no GPU utilization, and there are no task error messages. Can someone help me? Thanks so much.
Me too!
The same thing happened to me. In a desperate move, I installed everything manually (on Windows) following the steps below, and I was then able to run my first training. I think the error happens because some of the pip install commands were executed outside the virtual environment, meaning that "python -m venv venv" and ".\venv\Scripts\activate" are very important for it to work:
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
python -m venv venv
.\venv\Scripts\activate
pip install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt
You should try in a new directory and install everything manually step by step…
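For anyone unsure whether their pip commands actually ran inside the venv, here is a minimal sketch of a check you could run with the same "python" you use for the install. The function names in_virtualenv and cuda_status are made up for illustration and are not part of ai-toolkit; the torch part only reports what torch itself says and assumes nothing about how the toolkit launches jobs.

```python
import sys


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside a venv created with "python -m venv", sys.prefix points at the
    venv folder while sys.base_prefix still points at the base install.
    """
    return prefix != base_prefix


def cuda_status() -> str:
    """Report whether torch is importable here and whether it sees a GPU."""
    try:
        import torch
    except (ImportError, OSError):
        # torch is missing (or failed to load) for THIS interpreter,
        # which is what happens if pip ran outside the venv.
        return "torch could not be imported by this interpreter"
    return f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"


if __name__ == "__main__":
    print("python executable:", sys.executable)
    print("inside venv:", in_virtualenv())
    print(cuda_status())
```

If "inside venv" prints False, or the executable path is not under the ai-toolkit\venv folder, the packages most likely went into the global Python instead.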
Had the same issue and I just reinstalled the packages with the following (make sure to run as admin):
python -m venv venv
.\venv\Scripts\activate
pip install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt
and then in a new terminal (not as admin):
cd ui
npm run build_and_start
and now it works perfectly fine
I ran into this issue after I moved the ai-toolkit folder to a different location.
Same problem here. Nothing happens. Just... Nothing :/ Will try installing manually to see if that works better. At least I have a better overview of what's installed and where then...
Yepp... Installing manually worked. Started training right away. DON'T USE THE AUTOMATIC INSTALL!
Yes, manual works, thank you!
I have the same problem, but installing manually did not work. So I tried something else, and it worked this time: delete the venv folder, uninstall the old Python (mine was 3.13), and reinstall with Python 3.12. Then run these commands:
python -m venv venv
.\venv\Scripts\activate
pip install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt
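To see which Python version the venv was actually built with, a small sketch like the one below may help. The 3.13 warning is only based on the report above (3.13 failed for one poster, 3.12 worked); it is an assumption, not a documented requirement, and version_note is a made-up helper name.

```python
import sys


def version_note(version_info: tuple = tuple(sys.version_info[:2])) -> str:
    """Describe the running Python version, flagging 3.13 and newer.

    The 3.13 cutoff is an assumption taken from one report in this
    thread, not from any official compatibility list.
    """
    major, minor = version_info
    if (major, minor) >= (3, 13):
        return f"Python {major}.{minor}: one report here only worked after switching to 3.12"
    return f"Python {major}.{minor}"


if __name__ == "__main__":
    print(version_note())
```

Run it after activating the venv, so it reports the interpreter the training job would use rather than the system one.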
Manually cloning the models from Hugging Face and hardcoding the path to the model/adapter worked for me to bypass this problem. It seems the default installer has trouble fetching the models from Hugging Face.
I am also stuck there. After hitting play to start the queue, nothing happens.
Mine starts, and an empty Python terminal appears and disappears. It is still stuck on an infinite "Starting job...". Tried all the tips suggested in this issue :C