urllib3 connection error?
I am trying to fine-tune the model on my custom dataset (50,000 images with text descriptions),
but the run stops partway through the training phase.
It seems to happen whenever the code tries to fetch the pre-trained model from Hugging Face.
How can I avoid this? Any suggestions would be appreciated.
The error messages I got are below. They are slightly cut off because of the limited terminal scrollback :(.
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/usr/local/lib/python3.8/dist-packages/urllib3/connection.py", line 358, in connect
self.sock = conn = self._new_conn()
File "/usr/local/lib/python3.8/dist-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f7e41ba20a0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.8/dist-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/stabilityai/stable-diffusion-2-1-base (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7e41ba20a0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train_lora_dreambooth_mycode.py", line 1644, in <module>
main(args)
File "train_lora_dreambooth_mycode.py", line 937, in main
pipeline = StableDiffusionPipeline.from_pretrained(
File "/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/pipeline_utils.py", line 530, in from_pretrained
info = model_info(
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_validators.py", line 124, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/hf_api.py", line 1228, in model_info
r = requests.get(
File "/usr/local/lib/python3.8/dist-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/stabilityai/stable-diffusion-2-1-base (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7e41ba20a0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
Steps: 21%|███████████████████████████████████▊ | 105000/509550 [9:50:37<37:55:35, 2.96it/s, loss=0.0773, lr=0.0001]
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 1097, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 552, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_lora_dreambooth_mycode.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base', '--instance_data_dir=../../datasets/my_data', '--output_dir=./output_example_text_v0.9', '--instance_prompt=deprecated, using captions', '--train_text_encoder', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=1e-4', '--learning_rate_text=5e-5', '--color_jitter', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_train_epochs=10', '--save_steps=5000']' returned non-zero exit status 1.
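A possible workaround, sketched under assumptions: the traceback ends in `[Errno -3] Temporary failure in name resolution`, which usually means a transient DNS outage, and the failure lands at step 105000, a multiple of `--save_steps=5000`, so the Hub lookup may be happening at every save. If that reading is right, retrying the call with a growing delay might let the run survive a short outage. `call_with_retries` and its parameters are hypothetical names made up for this sketch, not part of diffusers or huggingface_hub. It retries on `OSError` because `requests.exceptions.ConnectionError` is a subclass of it.

```python
import time


def call_with_retries(fn, *, attempts=5, base_delay=1.0,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**i, then retry.

    Hypothetical helper for illustration. The last error is re-raised
    once all attempts are used up.
    """
    last_exc = None
    for i in range(attempts):
        try:
            return fn()
        except retry_on as exc:
            last_exc = exc
            if i < attempts - 1:  # no point sleeping after the final attempt
                sleep(base_delay * 2 ** i)
    raise last_exc


# Stand-in for the network call: fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("Temporary failure in name resolution")
    return "model loaded"


delays = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky_fetch, sleep=delays.append)
print(result, delays)
```

In the training script the wrapped call would be the `StableDiffusionPipeline.from_pretrained(...)` call at line 937. Another option that avoids the network entirely, assuming the weights are already on disk, would be to pass a local directory as `--pretrained_model_name_or_path` instead of the Hub id `stabilityai/stable-diffusion-2-1-base`.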