[Bug Report] Load model problem
Hello, I have run into a strange problem that has me puzzled.
I use the code below to load the GPT2-xl model from a local path. It loads and runs fine in one Jupyter file, but when I load the model from another script, it keeps trying to download it from the Hugging Face website, and my machine cannot connect to Hugging Face.
Both Jupyter files use the same conda environment. The file that fails to load reports the following error:
OSError Traceback (most recent call last)
File ~/miniconda3/envs/knowledgecircuit/lib/python3.10/site-packages/urllib3/connection.py:199, in HTTPConnection._new_conn(self)
198 try:
--> 199 sock = connection.create_connection(
200 (self._dns_host, self.port),
201 self.timeout,
202 source_address=self.source_address,
203 socket_options=self.socket_options,
204 )
205 except socket.gaierror as e:
File ~/miniconda3/envs/knowledgecircuit/lib/python3.10/site-packages/urllib3/util/connection.py:85, in create_connection(address, timeout, source_address, socket_options)
84 try:
---> 85 raise err
86 finally:
87 # Break explicitly a reference cycle
File ~/miniconda3/envs/knowledgecircuit/lib/python3.10/site-packages/urllib3/util/connection.py:73, in create_connection(address, timeout, source_address, socket_options)
72 sock.bind(source_address)
---> 73 sock.connect(sa)
74 # Break explicitly a reference cycle
OSError: [Errno 101] Network is unreachable
...
431 except EntryNotFoundError as e:
432 if not _raise_exceptions_for_missing_entries:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like gpt2-xl is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
The code that loads the model is as follows:
device = "cuda:0"
gpt2_medium_path = '/data/liujinzhe/model/openai-community/gpt2-xl'
hf_model = AutoModelForCausalLM.from_pretrained(gpt2_medium_path)
tokenizer = AutoTokenizer.from_pretrained(gpt2_medium_path)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"
model = HookedTransformer.from_pretrained(
model_name= "gpt2-xl",
hf_model=hf_model,
tokenizer=tokenizer,
local_path=gpt2_medium_path,
center_unembed=False,
center_writing_weights=False,
fold_ln=True,
device=device,
# refactor_factored_attn_matrices=True,
)
model.cfg.use_split_qkv_input = True
model.cfg.use_attn_result = True
model.cfg.use_hook_mlp_in = True
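For reference, a quick check like the one below can confirm, from inside the failing script, that the local path actually resolves and that no offline/cache environment variables differ between the two environments (a minimal sketch, nothing specific to TransformerLens; the path is the one used above):

import os

local_path = '/data/liujinzhe/model/openai-community/gpt2-xl'
# Confirm the directory and config.json exist as seen by this script.
print(os.path.isdir(local_path))
print(os.path.exists(os.path.join(local_path, "config.json")))
# Environment variables that change where transformers/huggingface_hub look for files.
for var in ("HF_HOME", "HF_HUB_CACHE", "HF_HUB_OFFLINE", "TRANSFORMERS_OFFLINE"):
    print(var, "=", os.environ.get(var))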
Do you have any way to get access to Hugging Face? This is a known issue: we currently download the config from Hugging Face even when a config is passed in (tracked in https://github.com/TransformerLensOrg/TransformerLens/issues/754). It will be patched at some point, but the easiest solution today is to make sure you have access to Hugging Face. If that is not possible for you, let me know.
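In the meantime, one thing that sometimes helps when the files are already on disk is forcing offline mode, so transformers and huggingface_hub never try the network (a sketch, not an official fix; it can still fail on versions affected by the issue above if the config for the official model name is not in your local cache):

import os

# Must be set before transformers/huggingface_hub are imported.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

local_path = "/data/liujinzhe/model/openai-community/gpt2-xl"
hf_model = AutoModelForCausalLM.from_pretrained(local_path)
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = HookedTransformer.from_pretrained(
    "gpt2-xl", hf_model=hf_model, tokenizer=tokenizer, device="cuda:0"
)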
I am trying to load CodeLlama and getting the same error. I have logged in with the Hugging Face CLI and have access to the model, since I can use it normally without TransformerLens. Any idea what is happening? Code:
llama = HookedTransformer.from_pretrained(
    model_name="CodeLlama-7b-Python",
)
Error:
OSError: CodeLlama-7b-Python-hf is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
Sorry, I can't load models from Hugging Face online due to environment constraints. I want to load the model from a local path. Is there any good solution? @bryce13950
I encountered the same issue and was able to solve it by inspecting the library code.
The model I used is gpt2-small. When I download it from Hugging Face, the folder that contains the config-related files (config.json, generation_config.json, ...), call it folder X, is nested inside other folders.
I solved the issue by COPYING folder X (moving it and dropping it somewhere else caused the "invalid symlink" issue), pasting it into the folder that contains the script that loads the model into HookedTransformer (folder Y), and renaming folder X to "gpt2-small" (the same as the official model name), so that folder X now sits next to the script. If this does not work, another option is to download the necessary config-related files directly from the model repository on Hugging Face, put all of them into a folder named after the official model name, and place that folder into folder Y.
model = HookedTransformer.from_pretrained("gpt2-small")  # works without requiring access to huggingface
This works in my case and I hope it can help someone facing the same issue.
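To make the layout concrete, the setup ends up looking roughly like this (a sketch; the exact file list depends on what the model repository contains):

# folder_Y/                      the folder containing the loading script
# ├── load_model.py              the script with the call below
# └── gpt2-small/                copied folder X, renamed to the official model name
#     ├── config.json
#     ├── generation_config.json
#     ├── tokenizer.json
#     ├── vocab.json
#     ├── merges.txt
#     └── model.safetensors
from transformer_lens import HookedTransformer

# Because a directory named "gpt2-small" now sits next to the script,
# it is picked up locally instead of being fetched from huggingface.co.
model = HookedTransformer.from_pretrained("gpt2-small")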
import os

from transformer_lens import HookedTransformer

real_path = "real_model_path"  # local folder containing config.json, tokenizer files, and weights
os.symlink(real_path, "gpt2")  # temporary redirection: make the official model name point at the local folder
model = HookedTransformer.from_pretrained(model_name="gpt2")
os.remove("gpt2")  # remove the temporary symlink
print(model)  # config matches config.json in real_path
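A small variation on the same idea, wrapping the temporary symlink in try/finally so it is cleaned up even if loading fails (same assumption: real_model_path is a local folder containing the model's config, tokenizer, and weight files):

import os

from transformer_lens import HookedTransformer

real_path = "real_model_path"  # local folder with the model files
link_name = "gpt2"             # must match the official model name passed below

os.symlink(real_path, link_name)  # temporary redirection in the working directory
try:
    model = HookedTransformer.from_pretrained(model_name=link_name)
finally:
    os.remove(link_name)  # clean up the symlink even if loading raises

print(model)  # config comes from config.json inside real_path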