
[Bug Report] Load model problem

Open LiuJinzhe-Keepgoing opened this issue 1 year ago • 4 comments

Hello, I have run into a strange phenomenon that puzzles me. I use the code below to load a GPT2-xl model from a local path. In one Jupyter file it loads and runs normally, but when I load the model from another script, it keeps trying to download the model from the HuggingFace website, and my machine cannot connect to HuggingFace. Both files use the same conda environment. The results of the two runs are shown below:

(screenshots of the two runs)

The failing script reports the following error:


OSError                                   Traceback (most recent call last)
File ~/miniconda3/envs/knowledgecircuit/lib/python3.10/site-packages/urllib3/connection.py:199, in HTTPConnection._new_conn(self)
    198 try:
--> 199     sock = connection.create_connection(
    200         (self._dns_host, self.port),
    201         self.timeout,
    202         source_address=self.source_address,
    203         socket_options=self.socket_options,
    204     )
    205 except socket.gaierror as e:

File ~/miniconda3/envs/knowledgecircuit/lib/python3.10/site-packages/urllib3/util/connection.py:85, in create_connection(address, timeout, source_address, socket_options)
     84 try:
---> 85     raise err
     86 finally:
     87     # Break explicitly a reference cycle

File ~/miniconda3/envs/knowledgecircuit/lib/python3.10/site-packages/urllib3/util/connection.py:73, in create_connection(address, timeout, source_address, socket_options)
     72     sock.bind(source_address)
---> 73 sock.connect(sa)
     74 # Break explicitly a reference cycle

OSError: [Errno 101] Network is unreachable
...
    431 except EntryNotFoundError as e:
    432     if not _raise_exceptions_for_missing_entries:

OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like gpt2-xl is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

The model-loading code is as follows:

device = "cuda:0"
gpt2_medium_path = '/data/liujinzhe/model/openai-community/gpt2-xl'
hf_model = AutoModelForCausalLM.from_pretrained(gpt2_medium_path)
tokenizer = AutoTokenizer.from_pretrained(gpt2_medium_path)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"
model = HookedTransformer.from_pretrained(
    model_name= "gpt2-xl",
    hf_model=hf_model, 
    tokenizer=tokenizer,
    local_path=gpt2_medium_path,
    center_unembed=False,
    center_writing_weights=False,
    fold_ln=True,
    device=device,
    # refactor_factored_attn_matrices=True,
) 
 
model.cfg.use_split_qkv_input = True
model.cfg.use_attn_result = True
model.cfg.use_hook_mlp_in = True

LiuJinzhe-Keepgoing avatar Nov 27 '24 08:11 LiuJinzhe-Keepgoing

Do you have any way to get access to HuggingFace? This is a known issue: we currently download the config from HuggingFace even when a config is passed, tracked at https://github.com/TransformerLensOrg/TransformerLens/issues/754. It will be patched at some point, but the easiest solution today is to make sure you have access to HuggingFace. If that is not possible for you, let me know.
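
If reaching huggingface.co itself is the blocker, two environment-variable workarounds are sometimes used. The sketch below is not an official fix, and the mirror URL is only an example; substitute one your network can actually reach:

import os

# Point the HuggingFace Hub client at a reachable mirror BEFORE any
# transformers / transformer_lens imports. Example endpoint only.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

# Alternatively, if every required file is already in the local HF cache,
# offline mode prevents network calls entirely:
# os.environ["HF_HUB_OFFLINE"] = "1"
# os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-xl")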

bryce13950 avatar Nov 27 '24 22:11 bryce13950

I am trying to load CodeLlama and getting the same error. I have logged in using the HuggingFace CLI and have access to the model, as I can use it normally without TransformerLens. Any idea what is happening? Code:

llama = HookedTransformer.from_pretrained(
    model_name="CodeLlama-7b-Python",
)

Error:

OSError: CodeLlama-7b-Python-hf is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
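
One pattern worth trying, sketched below by mirroring the first comment in this thread: load the HF model yourself from the full repo id (codellama/CodeLlama-7b-Python-hf) and hand it to TransformerLens. Given the config-download issue linked above, this may still require some network access:

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

# Full repo id on HuggingFace; a token from `huggingface-cli login`
# is picked up automatically for gated/private repos.
repo_id = "codellama/CodeLlama-7b-Python-hf"
hf_model = AutoModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

llama = HookedTransformer.from_pretrained(
    model_name="CodeLlama-7b-Python",
    hf_model=hf_model,
    tokenizer=tokenizer,
)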

akul-sethi avatar Nov 29 '24 01:11 akul-sethi

Sorry, I can't access HuggingFace online due to environment constraints, so I want to load the model from a local path. Is there any good solution? @bryce13950

LiuJinzhe-Keepgoing avatar Nov 29 '24 02:11 LiuJinzhe-Keepgoing

I encountered the same issue and have been able to solve it by inspecting the library code.

The model I used is gpt2-small. When I download it from HuggingFace, the folder (folder X) that contains the config-related files (config.json, generation_config.json, ...) is nested inside other folders.

The way I solved this issue (a short sketch of the procedure appears after this comment):

1. COPY folder X. Moving it instead of copying caused an "invalid symlink" issue, so copy rather than move.
2. PASTE folder X into the folder that contains the script that loads the model into HookedTransformer (folder Y).
3. Rename folder X to "gpt2-small" (the same as the official model name), so it now sits alongside the script.

If this does not work, another method is to download the necessary config-related files directly from the model repository on HuggingFace, put all of them into a folder named after the official model name, and place that folder into folder Y.

model = HookedTransformer.from_pretrained("gpt2-small") (This works without requiring access to huggingface.)

This works in my case and I hope it can help someone facing the same issue.
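
A minimal sketch of the copy-and-rename step in Python; the snapshot path below is hypothetical, so adjust it to wherever your download actually landed:

import shutil

# Folder X: the nested HF snapshot directory holding config.json etc.
# (hypothetical path; replace <hash> with the actual snapshot hash).
snapshot_dir = "/path/to/hf_download/models--gpt2/snapshots/<hash>"

# COPY rather than move: copytree (with the default symlinks=False)
# materialises symlinked blobs as real files, avoiding the
# "invalid symlink" problem described above. The destination name
# matches the official model name, next to the loading script.
shutil.copytree(snapshot_dir, "./gpt2-small")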

HaThuyAn avatar Jan 08 '25 08:01 HaThuyAn

import os

from transformer_lens import HookedTransformer

real_path = "real_model_path"

# Temporarily redirect the model name to the real local path
os.symlink(real_path, "gpt2")

model = HookedTransformer.from_pretrained(model_name="gpt2")

# Clean up the temporary symlink
os.remove("gpt2")

print(model)  # config now matches config.json in real_path
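
A slightly safer variant of the same trick, wrapped so the temporary symlink is removed even if loading raises; temporary_symlink is a helper defined here, not part of any library:

import os
from contextlib import contextmanager

from transformer_lens import HookedTransformer

@contextmanager
def temporary_symlink(real_path, link_name):
    """Create a symlink for the duration of a with-block, then remove it."""
    os.symlink(real_path, link_name)
    try:
        yield
    finally:
        os.remove(link_name)

with temporary_symlink("real_model_path", "gpt2"):
    model = HookedTransformer.from_pretrained(model_name="gpt2")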

Jiaran-Ye avatar Sep 17 '25 08:09 Jiaran-Ye