Cannot run trainer offline
This is for bugs only
Did you already ask in the discord? No
You verified that this is a bug and not a feature request or question by asking in the discord? No
Describe the bug
I cannot run the trainer script offline. It keeps trying to connect to huggingface.co to fetch transformer/config.json, even though that file is already cached on my computer. I ran the script once while connected, and all necessary files were downloaded, including transformer/config.json. I then went offline and tried again, but the script still tries to connect. It shouldn't do this.
Later in the run, the script also tries to download:
/Qwen/Qwen-Image/resolve/main/tokenizer/tokenizer_config.json
/Qwen/Qwen-Image/resolve/main/vae/config.json
These files are also already cached on my computer.
python run.py config\examples\train_lora_qwen_image_24gb.yaml
ERROR LOG:
#############################################
Running job: my_first_qwen_image_lora_v1
#############################################
Running 1 process
Loading Qwen Image model
Loading transformer
'(MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /Qwen/Qwen-Image/resolve/main/transformer/config.json (Caused by NameResolutionError("Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)"))')' thrown while requesting HEAD https://huggingface.co/Qwen/Qwen-Image/resolve/main/transformer/config.json
Retrying in 1s [Retry 1/5].
Retrying in 2s [Retry 2/5].
Retrying in 4s [Retry 3/5].
Retrying in 8s [Retry 4/5].
Retrying in 8s [Retry 5/5].
(each retry logs the same NameResolutionError for transformer/config.json)
Error running job: (MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/Qwen/Qwen-Image (Caused by NameResolutionError("Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)"))'), '(Request ID: a064f46c-866f-466b-9035-e5a251226fae)')
========================================
Result:
- 0 completed jobs
- 1 failure
========================================
Bro, I ran into the same problem. I read through this entire issue without finding the cause, but the answer turns out to be pretty silly... When you create a new job, the model selection area has an input field called "Name or Path". You can enter the path of an already-downloaded model there. Note that it should be the whole Hugging Face project (repo) directory, and inside it you give the path to the downloaded model; specifically, give the path down to the transformer directory.
To use ai-toolkit offline you need to point it at your cache folder, including the snapshot directory inside it.
Find your cache folder and follow this path example to your own model.
Mine was C:\Users\BoB\.cache\huggingface\hub\models--Qwen--Qwen-Image-Edit\snapshots\abcdef12359k23hjrkj23kj23kj23
The abcdef12359k23hjrkj23kj23kj23 part is the snapshot's commit hash (not your Hugging Face token); yours will differ.
Enter that path, in this format >> C:\Users\BoB\.cache\huggingface\hub\models--Qwen--Qwen-Image-Edit\snapshots\abcdef12359k23hjrkj23kj23kj23
into the "Name or Path" field in ai-toolkit, located in the model section. The toolkit will automatically append the transformer subfolder to that directory when you run a job.
If you add it in the advanced section or the job's .json instead, use the following format (with the quotes and the trailing comma; backslashes must be doubled in JSON) >> "name_or_path": "C:\\Users\\BoB\\.cache\\huggingface\\hub\\models--Qwen--Qwen-Image-Edit\\snapshots\\abcdef12359k23hjrkj23kj23kj23",
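If you're unsure about the doubled backslashes, they are just JSON string escaping; a quick way to produce the escaped form is to let Python's json module do it (a minimal sketch — the path below is the placeholder from above, not a real snapshot):

```python
import json

# Placeholder Windows-style snapshot path (raw string, single backslashes).
path = r"C:\Users\BoB\.cache\huggingface\hub\models--Qwen--Qwen-Image-Edit\snapshots\abcdef12359k23hjrkj23kj23kj23"

# json.dumps escapes each backslash, producing exactly the form
# expected in the job's JSON config.
print(json.dumps({"name_or_path": path}, indent=2))
```

Copy the printed value (backslashes already doubled) straight into the config.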
This lets me run LoRA training offline. I still see some errors from the toolkit trying to reach HF, but the job runs and completes, so they aren't critical.
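Rather than hunting through the cache by hand, the snapshot path can be derived programmatically. A minimal sketch, assuming the documented Hugging Face cache layout (`models--Org--Name/snapshots/<commit-hash>`) and the default cache location:

```python
from pathlib import Path

def repo_cache_dir(repo_id: str, cache_root: str = "~/.cache/huggingface/hub") -> Path:
    """Build the cache directory for a repo id, e.g.
    'Qwen/Qwen-Image-Edit' -> .../models--Qwen--Qwen-Image-Edit"""
    return Path(cache_root).expanduser() / ("models--" + repo_id.replace("/", "--"))

# Snapshot folders (named by commit hash) live under <repo_dir>/snapshots/.
repo_dir = repo_cache_dir("Qwen/Qwen-Image-Edit")
for snap in sorted((repo_dir / "snapshots").glob("*")):
    print(snap)  # paste one of these into "Name or Path"
```

Alternatively, if huggingface_hub is installed, `snapshot_download(repo_id, local_files_only=True)` should resolve the same snapshot path from the cache without any network access.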
Unfortunately, the workaround provided by @baxinabox is not always sufficient. For example, when running the WAN2.2 I2V 14B model, the training job needs to load the umt5_xxl_encoder transformer. It again tries to connect to HF and errors out with:
Cannot reach https://huggingface.co/api/models/ai-toolkit/umt5_xxl_encoder: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
How would we tell AI-Toolkit to look for the local transformer?
Full log:
Creating DualWanTransformer3DModel
Loading UMT5EncoderModel
Error running job: Cannot reach https://huggingface.co/api/models/ai-toolkit/umt5_xxl_encoder: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
========================================
Result:
- 0 completed jobs
- 1 failure
========================================
Traceback (most recent call last):
  File "/app/run.py", line 120, in <module>
    main()
  File "/app/run.py", line 108, in main
    raise e
  File "/app/run.py", line 96, in main
    job.run()
  File "/app/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/app/jobs/process/BaseSDTrainProcess.py", line 1565, in run
    self.sd.load_model()
  File "/app/extensions_built_in/diffusion_models/wan22/wan22_14b_model.py", line 228, in load_model
    super().load_model()
  File "/app/toolkit/models/wan21/wan21.py", line 421, in load_model
    tokenizer, text_encoder = get_umt5_encoder(
  File "/app/toolkit/models/loaders/umt5.py", line 20, in get_umt5_encoder
    tokenizer = AutoTokenizer.from_pretrained(model_path, subfolder=tokenizer_subfolder)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 1156, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2113, in from_pretrained
    return cls._from_pretrained(
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2395, in _from_pretrained
    tokenizer = cls._patch_mistral_regex(
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2438, in _patch_mistral_regex
    if _is_local or is_base_mistral(pretrained_model_name_or_path):
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2432, in is_base_mistral
    model = model_info(model_id)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2660, in model_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/aitoolkit/.local/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 106, in send
    raise OfflineModeIsEnabled(
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/api/models/ai-toolkit/umt5_xxl_encoder: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
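For what it's worth, the traceback shows that transformers' `_patch_mistral_regex` only calls `model_info` (the Hub API request) when the supplied path is not a local directory — the `if _is_local or is_base_mistral(...)` check short-circuits for local paths. So passing a local snapshot directory instead of a bare repo id should avoid the call. A minimal sketch, assuming the standard Hugging Face cache layout in its default location (the repo id is the one from the log):

```python
import os
from pathlib import Path

# Both variables must be set before transformers / huggingface_hub
# are imported for offline mode to take effect.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# A bare repo id like "ai-toolkit/umt5_xxl_encoder" triggers the Hub
# lookup seen in the traceback; a local directory path does not.
repo_id = "ai-toolkit/umt5_xxl_encoder"
repo_dir = Path.home() / ".cache/huggingface/hub" / ("models--" + repo_id.replace("/", "--"))

snapshots = sorted((repo_dir / "snapshots").glob("*"))
if snapshots:
    # Pass this directory (not the repo id) wherever the encoder path is configured.
    print("local snapshot:", snapshots[-1])
else:
    print("no cached snapshot found for", repo_id)
```

Whether ai-toolkit exposes a config knob for the encoder path is a separate question (see below), but anything that reaches `from_pretrained` with a local directory should stay offline.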
@ArlonSwaders - I just encountered a similar issue, but with the text encoder trying to connect to Huggingface.
Error running job: Cannot reach https://huggingface.co/api/models/Tongyi-MAI/Z-Image-Turbo: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.
I managed to resolve it, by using a similar approach to that outlined by @baxinabox .
In my instance, I had to locate the cache for Tongyi-MAI/Z-Image-Turbo and take note of the path.
Then I had to manually edit the job and set extras_name_or_path to that path.
Update the job, and start the queue and hopefully that should work.
Ostris really needs to make this a much simpler process.
@DCAU7 Thank you very much for the pointer. I'll give that a try soon and report back.
I agree that running AI-Toolkit offline does not seem very straightforward. Documentation on how to run AI-Toolkit in offline mode would already be very helpful.
@ArlonSwaders - Offline capability really should be a priority for Ostris as it seems to be one of the most desired features, and quite honestly, it's just a mess.
I'm going to be without a fast internet connection for a period of time, and will be limited in data, and the last thing I need or want is to download models that have already been downloaded.
I had been wondering why the .job_config.json file was being overwritten on startup, but I realised that it pulls its jobs from the SQLite3 database, so if you have an appropriate editor, you could technically edit the database with the appropriate settings as well.
But, the below option seems best:
I should note that the ai-toolkit/ui/src/app/jobs/new/options.ts file contains the Model Architecture config and can be edited to include the appropriate paths.
Two lines/config options specifically:
'config.process[0].model.name_or_path': ['ostris/Z-Image-De-Turbo', defaultNameOrPath],
'config.process[0].model.extras_name_or_path': ['Tongyi-MAI/Z-Image-Turbo', undefined],
In my case, I changed them to the appropriate huggingface cache (NOTE: I use Linux, so Windows paths will look different):
'config.process[0].model.name_or_path': ['/home/dc/.cache/huggingface/hub/models--ostris--Z-Image-De-Turbo/snapshots/1234567890123456789012345678901234567890', defaultNameOrPath],
'config.process[0].model.extras_name_or_path': ['/home/dc/.cache/huggingface/hub/models--Tongyi-MAI--Z-Image-Turbo/snapshots/1234567890123456789012345678901234567890/', undefined],
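Before pasting a snapshot path into options.ts (or the "Name or Path" field), it's worth sanity-checking that the cached snapshot actually contains the files the trainer will look for. A minimal sketch — the expected file list is taken from the Qwen-Image logs earlier in this thread and is an assumption for other architectures:

```python
from pathlib import Path

# Files the Qwen-Image run above was fetching; other model
# architectures will expect a different set (assumption).
EXPECTED = (
    "transformer/config.json",
    "tokenizer/tokenizer_config.json",
    "vae/config.json",
)

def missing_files(snapshot_dir: str, expected=EXPECTED) -> list:
    """Return the expected files that are absent from a cached snapshot."""
    root = Path(snapshot_dir).expanduser()
    return [rel for rel in expected if not (root / rel).is_file()]

# Point this at a real snapshots/<hash> directory before pasting that
# path into the config; an empty list means the snapshot looks complete.
print(missing_files("~/.cache/huggingface/hub/models--Tongyi-MAI--Z-Image-Turbo/snapshots/1234567890123456789012345678901234567890"))
```

If anything is reported missing, the trainer will need one online run (or a manual download) to complete the cache first.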
After making this change, I restarted ai-toolkit and reopened a browser. I deleted my old job, created a new one, and things just worked (thank goodness).
Standard disclaimer applies:
- Always take backups of files before editing
- Any changes will likely be overwritten at the next update, so take note of specific settings for future reference
Hopefully this helps.