No Progress

Open ali-m-github opened this issue 1 month ago • 13 comments

Hello, I have installed the toolkit, but when I click start it says "starting" and nothing happens. I left it for over an hour with no progress and nothing showing up in the progress window.

ali-m-github avatar Oct 25 '25 20:10 ali-m-github

Same here. Additionally queue management doesn't seem to work. Whenever I try to stop a job it just enters a loop, printing '[WORKER] Stopping job [some ID] on GPU(s) 0'.

woyzeck avatar Oct 25 '25 23:10 woyzeck

I have the same issue. 4090, 64gb ram. Nothing shows up in progress, but in the past it worked fine. I recently updated and tried a fresh install as well.

jeffufu avatar Oct 26 '25 06:10 jeffufu

I deleted it and reinstalled using their easy installer, and it worked even though it did not the first time, so give that a try.

ali-m-github avatar Oct 26 '25 10:10 ali-m-github

I am also facing the same issue. It only says Starting Job and shows no progress.

rsshekhawat avatar Oct 26 '25 14:10 rsshekhawat

Same here on RTX 6000 Pro. Any solutions?

dsf12345 avatar Oct 26 '25 15:10 dsf12345

For me, python run.py .../.job_config.json silently runs out of memory (OOM) and the process stops. It's probably a bug in newer versions; see this comment: https://github.com/ostris/ai-toolkit/issues/457#issuecomment-3393679558
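
One way to confirm a silent crash like this is to run the same command by hand and inspect its exit status. This is a minimal sketch, assuming a POSIX system, where a process killed by the kernel's OOM killer typically exits via SIGKILL. The helper names `describe_exit` and `run_job` are hypothetical and not part of ai-toolkit.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a human-readable message."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        hint = " (often the OOM killer)" if name == "SIGKILL" else ""
        return f"killed by {name}{hint}"
    return f"failed with exit code {returncode}"


def run_job(config_path: str) -> str:
    """Run the training script directly so its exit status is visible."""
    # Assumes the current directory is the ai-toolkit checkout containing run.py.
    result = subprocess.run([sys.executable, "run.py", config_path])
    return describe_exit(result.returncode)
```

Calling `run_job` with the path to your own job config prints nothing extra by itself; wrap it in `print(...)` to see whether the process was killed by a signal or exited with an error code.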

kesslerdev avatar Oct 26 '25 21:10 kesslerdev

Use a venv environment, not a conda environment; then the progress works.

suxici avatar Oct 31 '25 05:10 suxici

Not sure what the problem is. I opened Claude on the directory and it fixed it: https://gist.github.com/johndpope/4828f5cc18106b06a55d7a365c5e8f0a

I pressed it further:

  Starting a job only queues it, but doesn't automatically start the queue. You need to manually start the queue through the UI.

  To Prevent This in the Future:

  There should be a "Start Queue" button in the UI at: http://192.168.1.101:8675/jobs

I pushed a hotfix here: https://github.com/johndpope/ai-toolkit/commit/fd55becb57bb7d6d69e69e07fd95df802e4394e2

johndpope avatar Nov 02 '25 11:11 johndpope

SAME ISSUE

Edit: The issue is that the requirements are not being installed properly into the venv's (or python-embeded's) Lib\site-packages folder. Even activating the venv before installing doesn't seem to guarantee that things get installed inside the venv. Use whatever LLM you want to help you pip install the Python dependencies into the correct folder with --target.
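
A quick way to see where installs will actually land is to ask the interpreter itself. This is a minimal sketch using only the standard library; `environment_report` is a hypothetical helper name, and it should be run with the same Python executable that the toolkit uses.

```python
import sys
import sysconfig


def environment_report() -> dict:
    """Collect where this interpreter lives and where it installs packages."""
    return {
        "executable": sys.executable,
        # In a venv, sys.prefix points at the venv and differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # Directory where pip, run by this interpreter, places pure-Python packages.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `in_venv` is false or `site_packages` points somewhere unexpected, running pip through that same interpreter (`python -m pip install -r requirements.txt`) is a common way to make sure pip and the interpreter agree on the target folder.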

LucianoCirino avatar Nov 22 '25 06:11 LucianoCirino

To troubleshoot the exact issue, you can use the following URL to view the logs:

http://localhost:8675/api/jobs//log
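
The same log can be pulled from a script. This is a rough sketch that assumes the job's ID belongs in the empty path segment of the URL above; that placement, the helper names, and the response being readable as text are all assumptions, since the thread does not document the endpoint.

```python
import urllib.request
from urllib.parse import quote


def job_log_url(job_id: str, base: str = "http://localhost:8675") -> str:
    """Build the log URL, placing the job ID between /jobs/ and /log (assumed)."""
    if not job_id:
        raise ValueError("job_id must not be empty")
    return f"{base.rstrip('/')}/api/jobs/{quote(job_id, safe='')}/log"


def fetch_job_log(job_id: str, base: str = "http://localhost:8675",
                  timeout: float = 10.0) -> str:
    """Download the raw response body for a job's log from a running UI server."""
    with urllib.request.urlopen(job_log_url(job_id, base), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

`fetch_job_log` needs the UI server to be running; `job_log_url` alone is enough to get a link to paste into a browser.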

In my case, the CUDA and torch libraries were not installed correctly.
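
A short check along these lines can show whether torch is importable and whether it sees a GPU. It is only a sketch: `torch_cuda_report` is a hypothetical helper, and it should be run with the interpreter the toolkit uses.

```python
import importlib.util


def torch_cuda_report() -> dict:
    """Report whether PyTorch is importable and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch  # imported lazily so the check still runs when torch is absent

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means this build was not compiled against CUDA (e.g. a CPU-only build).
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
```

If the import itself raises an error, that traceback is usually the most direct clue that the torch installation is broken.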

kknightowl avatar Dec 12 '25 01:12 kknightowl

Admit it guys. Like me, most of y'all forgot to do the pip-install on requirements.txt and tried to run the training. I feel so stupid sometimes 🤣
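
For anyone unsure whether that step was done, a rough check can compare requirements.txt against what is installed. This is a simplified sketch: `missing_requirements` is a hypothetical helper, it ignores version constraints, and it skips lines it cannot map to a package name (pip options and direct URL or VCS installs).

```python
import re
from importlib import metadata
from pathlib import Path


def missing_requirements(requirements_file: str = "requirements.txt") -> list[str]:
    """Return names listed in the requirements file that are not installed."""
    missing = []
    for raw in Path(requirements_file).read_text(encoding="utf-8").splitlines():
        line = raw.split("#", 1)[0].strip()
        # Skip blanks, pip options (-r, --index-url, ...) and direct URL/VCS
        # installs, which this simple check cannot map to a distribution name.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Run from the toolkit folder with the toolkit's own interpreter, a non-empty result suggests the requirements were never installed into that environment.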

cohan8999 avatar Dec 13 '25 21:12 cohan8999

> Admit it guys. Like me, most of y'all forgot to do the pip-install on requirements.txt and tried to run the training. I feel so stupid sometimes 🤣

That's it. I installed AI Toolkit yesterday and forgot this simple step. Thanks

appliedintelligencelab avatar Dec 16 '25 18:12 appliedintelligencelab