
Cannot run trainer offline

Open samsonsite1 opened this issue 2 months ago • 6 comments

This is for bugs only

Did you already ask in the discord? No

You verified that this is a bug and not a feature request or question by asking in the discord? No

Describe the bug

I cannot run the trainer script offline. It constantly tries to connect to huggingface.co looking for transformer/config.json. I already have this file stored/cached on my computer. I have successfully run the script once while connected, and all necessary files were downloaded, including transformer/config.json. I then went offline, and tried it again, but the script still tries to connect. It shouldn't do this.

Later in the script, it then tries to download:

  • /Qwen/Qwen-Image/resolve/main/tokenizer/tokenizer_config.json
  • /Qwen/Qwen-Image/resolve/main/vae/config.json

These files are already stored/cached on my computer.

python run.py config\examples\train_lora_qwen_image_24gb.yaml
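As a general Hugging Face behaviour (not specific to ai-toolkit), the hub client can be told to skip all network calls and rely solely on the local cache by setting the offline environment variables before the trainer starts. This is a minimal sketch of that approach; as later comments in this thread show, offline mode can surface its own errors when a file is not fully cached:

```python
# Force huggingface_hub and transformers to use only the local cache.
# Must be set before the trainer imports those libraries.
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no network calls
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: load from cache only

# Equivalent shell usage on Windows before launching the trainer:
#   set HF_HUB_OFFLINE=1
#   set TRANSFORMERS_OFFLINE=1
#   python run.py config\examples\train_lora_qwen_image_24gb.yaml
```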

ERROR LOG:

#############################################

Running job: my_first_qwen_image_lora_v1

#############################################

Running 1 process
Loading Qwen Image model
Loading transformer
'(MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /Qwen/Qwen-Image/resolve/main/transformer/config.json (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001AD4959C880>: Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)"))'), '(Request ID: fbd62a1e-cec5-4e02-9bde-ba194eb4dc58)')' thrown while requesting HEAD https://huggingface.co/Qwen/Qwen-Image/resolve/main/transformer/config.json
Retrying in 1s [Retry 1/5].
[the same NameResolutionError warning repeats for retries 2/5 through 5/5, backing off 2s, 4s, 8s, 8s]
Error running job: (MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/Qwen/Qwen-Image (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001AD4959D7E0>: Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)"))'), '(Request ID: a064f46c-866f-466b-9035-e5a251226fae)')

========================================
Result:
  • 0 completed jobs
  • 1 failure
========================================

samsonsite1 avatar Oct 19 '25 06:10 samsonsite1

Brother, I ran into the same problem. I read through this entire issue without finding a fix, but the answer turned out to be quite silly: when you create a new job, the model section has an input box called "Name or Path". You can enter the path of an already-downloaded model there. Note that it should be the whole Hugging Face project directory; make sure you give the path that contains the transformer directory.

Songssx avatar Oct 29 '25 19:10 Songssx

To use ai-toolkit offline you need to point it at your Hugging Face cache folder, all the way down to the snapshot hash directory.

Find your cache folder and follow the path example to your own model.

Mine was C:\Users\BoB\.cache\huggingface\hub\models--Qwen--Qwen-Image-Edit\snapshots\abcdef12359k23hjrkj23kj23kj23

The abcdef12359k23hjrkj23kj23kj23 part is the snapshot's commit hash; yours will differ.

Enter that path in this format: C:\Users\BoB\.cache\huggingface\hub\models--Qwen--Qwen-Image-Edit\snapshots\abcdef12359k23hjrkj23kj23kj23

into the "Name or Path" field in ai-toolkit, located in the model section. It will automatically append \transformer to that directory for you when you run a job.

If you add it to the advanced section or edit the .json directly, use the following format, with the quotes and the trailing comma:

"name_or_path": "C:\\Users\\BoB\\.cache\\huggingface\\hub\\models--Qwen--Qwen-Image-Edit\\snapshots\\abcdef12359k23hjrkj23kj23kj23",
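Since Windows backslashes must be doubled in JSON, one way to avoid escaping mistakes is to let a JSON serializer produce the line. A small sketch, reusing the example path from above:

```python
import json

# A Python raw string avoids escaping in source code; json.dumps then
# doubles each backslash as JSON requires.
path = r"C:\Users\BoB\.cache\huggingface\hub\models--Qwen--Qwen-Image-Edit\snapshots\abcdef12359k23hjrkj23kj23kj23"
print(json.dumps({"name_or_path": path}, indent=2))
```

The printed value can be pasted into the job's .json as-is.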

This lets me run LoRA training offline. I still see some errors from the toolkit trying to reach HF, but the job runs and completes, so they aren't critical.

baxinabox avatar Nov 20 '25 07:11 baxinabox

Unfortunately, the workaround provided by @baxinabox is not always sufficient. For example, when running the WAN2.2 I2V 14B model, the training job needs to load the umt5_xxl_encoder text encoder. It again tries to connect to HF and errors out with:

Cannot reach https://huggingface.co/api/models/ai-toolkit/umt5_xxl_encoder: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

How would we tell AI-Toolkit to look for the local transformer?

Full log:

Creating DualWanTransformer3DModel
Loading UMT5EncoderModel
Error running job: Cannot reach https://huggingface.co/api/models/ai-toolkit/umt5_xxl_encoder: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
========================================
Result:
 - 0 completed jobs
 - 1 failure
========================================
Traceback (most recent call last):
  File "/app/run.py", line 120, in <module>
    main()
  File "/app/run.py", line 108, in main
    raise e
  File "/app/run.py", line 96, in main
    job.run()
  File "/app/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/jobs/process/BaseSDTrainProcess.py", line 1565, in run
    self.sd.load_model()
  File "/app/extensions_built_in/diffusion_models/wan22/wan22_14b_model.py", line 228, in load_model
    super().load_model()
  File "/app/toolkit/models/wan21/wan21.py", line 421, in load_model
    tokenizer, text_encoder = get_umt5_encoder(
  File "/app/toolkit/models/loaders/umt5.py", line 20, in get_umt5_encoder
    tokenizer = AutoTokenizer.from_pretrained(model_path, subfolder=tokenizer_subfolder)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 1156, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2113, in from_pretrained
    return cls._from_pretrained(
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2395, in _from_pretrained
    tokenizer = cls._patch_mistral_regex(
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2438, in _patch_mistral_regex
    if _is_local or is_base_mistral(pretrained_model_name_or_path):
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2432, in is_base_mistral
    model = model_info(model_id)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2660, in model_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 106, in send
    raise OfflineModeIsEnabled(
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/api/models/ai-toolkit/umt5_xxl_encoder: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

ArlonSwaders avatar Dec 10 '25 15:12 ArlonSwaders

@ArlonSwaders - I just encountered a similar issue, but with the text encoder trying to connect to Huggingface.

Error running job: Cannot reach https://huggingface.co/api/models/Tongyi-MAI/Z-Image-Turbo: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

I managed to resolve it by using a similar approach to the one outlined by @baxinabox.

In my instance, I had to locate the cache for Tongyi-MAI/Z-Image-Turbo and take note of the path.

Then I had to manually edit the job and set extras_name_or_path to that path.

Update the job, start the queue, and hopefully that should work.
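For job configs edited by hand, the change described above amounts to setting one extra key on the model section. A hedged sketch of patching an exported job config (the field names come from this thread; the file name and helper are hypothetical):

```python
import json

def set_extras_path(cfg: dict, path: str) -> dict:
    """Point the job's extras model at a local snapshot directory."""
    cfg["config"]["process"][0]["model"]["extras_name_or_path"] = path
    return cfg

# Usage against an exported job file (file name hypothetical):
#   cfg = json.load(open("job_config.json"))
#   set_extras_path(cfg, "/home/dc/.cache/huggingface/hub/"
#                        "models--Tongyi-MAI--Z-Image-Turbo/snapshots/<hash>")
#   json.dump(cfg, open("job_config.json", "w"), indent=2)
```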

Ostris really needs to make this a much simpler process.

DCAU7 avatar Dec 13 '25 00:12 DCAU7

@DCAU7 Thank you very much for the pointer. I'll give that a try soon and report back.

I agree that running AI-Toolkit offline does not seem very straightforward. Documentation on how to run AI-Toolkit in offline mode would already be very helpful.

ArlonSwaders avatar Dec 13 '25 11:12 ArlonSwaders

@ArlonSwaders - Offline capability really should be a priority for Ostris as it seems to be one of the most desired features, and quite honestly, it's just a mess.

I'm going to be without a fast internet connection for a period of time, and will be limited in data, and the last thing I need or want is to download models that have already been downloaded.

I had been wondering why the .job_config.json file was being overwritten on startup, but I realised that it pulls its jobs from the SQLite3 database, so if you have an appropriate editor, you could technically edit the database entries directly with the appropriate settings as well.

But, the below option seems best:

I should note that the ai-toolkit/ui/src/app/jobs/new/options.ts file contains the Model Architecture config and can be edited to include the appropriate paths.

Two lines/config options specifically:

'config.process[0].model.name_or_path': ['ostris/Z-Image-De-Turbo', defaultNameOrPath],
'config.process[0].model.extras_name_or_path': ['Tongyi-MAI/Z-Image-Turbo', undefined],

In my case, I changed them to the appropriate huggingface cache (NOTE: I use Linux, so Windows paths will look different):

'config.process[0].model.name_or_path': ['/home/dc/.cache/huggingface/hub/models--ostris--Z-Image-De-Turbo/snapshots/1234567890123456789012345678901234567890', defaultNameOrPath],
'config.process[0].model.extras_name_or_path': ['/home/dc/.cache/huggingface/hub/models--Tongyi-MAI--Z-Image-Turbo/snapshots/1234567890123456789012345678901234567890/', undefined],

After making this change, I restarted ai-toolkit and reopened a browser. I deleted my old job, created a new one, and things just worked (thank goodness).

Standard disclaimer applies:

  • Always take backups of files before editing
  • Any changes will likely be overwritten at the next update, so take note of specific settings for future reference

Hopefully this helps.

DCAU7 avatar Dec 13 '25 15:12 DCAU7