
Unable to load models with adapter weights in offline mode

amyeroberts opened this issue

System Info

  • transformers version: 4.42.0.dev0
  • Platform: Linux-5.15.0-1045-aws-x86_64-with-glibc2.31
  • Python version: 3.10.9
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.31.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.1+cu121 (True)
  • Tensorflow version (GPU?): 2.14.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
  • Jax version: 0.4.13
  • JaxLib version: 0.4.13
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: No
  • GPU type: NVIDIA A10G

Who can help?

Probably me @amyeroberts or @ArthurZucker.

PEFT weight loading code was originally added by @younesbelkada

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Unable to load models in offline mode, even when the adapter weights are cached locally:

import os
import torch

# Offline mode must be enabled before transformers/huggingface_hub are imported
os.environ['HF_HUB_OFFLINE'] = '1'

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "haoranxu/ALMA-13B-R",
    torch_dtype=torch.float16,
    device_map="auto",
    local_files_only=True
)

This model uses haoranxu/ALMA-13B-Pretrain as adapter weights.
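
To reproduce in two steps, note that the flag has to be set before huggingface_hub is imported, so each pass needs a fresh process. A hypothetical driver sketch, where repro.py stands for the script above with the os.environ line removed:

import os
import subprocess
import sys

# First pass (HF_HUB_OFFLINE=0) downloads the model and adapter into the cache;
# the second pass (HF_HUB_OFFLINE=1) then fails with OfflineModeIsEnabled.
for offline in ("0", "1"):
    subprocess.run(
        [sys.executable, "repro.py"],
        env={**os.environ, "HF_HUB_OFFLINE": offline},
        check=(offline == "0"),  # only the online pass is expected to succeed
    )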

If you first load the model so that the model and adapter weights are available in the cache, and then re-run in offline mode, the following error occurs:

Traceback (most recent call last):
  File "/home/ubuntu/transformers/../scripts/debug_31552_load_without_safetensors.py", line 8, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/ubuntu/transformers/src/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/ubuntu/transformers/src/transformers/modeling_utils.py", line 3907, in from_pretrained
    model.load_adapter(
  File "/home/ubuntu/transformers/src/transformers/integrations/peft.py", line 201, in load_adapter
    adapter_state_dict = load_peft_weights(peft_model_id, token=token, device=device, **adapter_kwargs)
  File "/data/ml/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 297, in load_peft_weights
    has_remote_safetensors_file = file_exists(
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2641, in file_exists
    get_hf_file_metadata(url, token=token)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
    r = _request_wrapper(
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
    response = _request_wrapper(
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 395, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/data/ml/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/ml/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 77, in send
    raise OfflineModeIsEnabled(
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/haoranxu/ALMA-13B-R/resolve/main/adapter_model.safetensors: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

Expected behavior

The model can be loaded in both online and offline mode.

amyeroberts avatar Jun 28 '24 15:06 amyeroberts

🫠 sounds like kwargs getting lost maybe?

ArthurZucker avatar Jun 28 '24 16:06 ArthurZucker

It's being triggered here in the PEFT library cc @BenjaminBossan

Essentially, the path that gets built assumes that if the adapter weights path is local, then it's of the form model_id/adapter_model.safetensors. However, if we've already downloaded the model, it will be under path/to/cache/.cache/huggingface/hub/models--{REPO_ID}--{MODEL_ID}/snapshots/{COMMIT_REF}/{WEIGHT_NAME}.safetensors
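
As an illustration (a minimal sketch using huggingface_hub's try_to_load_from_cache helper, with the repo id and filename from this issue), the cached location can be inspected like this:

from huggingface_hub import try_to_load_from_cache

# Returns the snapshot path if the file is already in the local cache,
# e.g. .../hub/models--haoranxu--ALMA-13B-R/snapshots/<commit>/adapter_model.safetensors,
# a special sentinel if the repo is cached but the file doesn't exist, or None.
path = try_to_load_from_cache("haoranxu/ALMA-13B-R", "adapter_model.safetensors")
print(path)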

amyeroberts avatar Jun 28 '24 19:06 amyeroberts

Thanks for flagging this; indeed, this breaks with offline mode. @Wauplin do you have a suggestion for how we can correctly check if the file has already been locally cached?

BenjaminBossan avatar Jul 01 '24 11:07 BenjaminBossan

cc @Wauplin if you have the bw :)

LysandreJik avatar Jul 29 '24 08:07 LysandreJik

@Wauplin do you have a suggestion how we can correctly check if the file has already been locally cached?

The easiest way to do that is to use hf_hub_download(..., local_files_only=True) in a try/except statement. If a huggingface_hub.utils.LocalEntryNotFoundError is raised, it means the file has not been cached locally. Would that be good for you?
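
For example, a minimal sketch of that check (repo id and filename taken from this issue, not the eventual PEFT code):

from huggingface_hub import hf_hub_download
from huggingface_hub.utils import LocalEntryNotFoundError

try:
    # Only consult the local cache; this never makes a network call.
    local_path = hf_hub_download(
        "haoranxu/ALMA-13B-R",
        "adapter_model.safetensors",
        local_files_only=True,
    )
except LocalEntryNotFoundError:
    # Not cached locally, so fall back to a remote check/download when online.
    local_path = None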

And sorry I missed this notification 🙈

Wauplin avatar Jul 29 '24 12:07 Wauplin

I worked on a fix: https://github.com/huggingface/peft/pull/1976. It resolves the issue for me, but I had trouble unit testing it, as dynamically setting offline mode in the unit test seems to have no effect :( I think it would still be okay to merge the fix without a test, but if anyone has an idea how to test it correctly, please LMK.
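
One untested idea, assuming huggingface_hub reads HF_HUB_OFFLINE into huggingface_hub.constants at import time (so flipping the env var later has no effect): patch the already-imported constant and rebuild the cached HTTP session, e.g. in a pytest test:

from huggingface_hub import constants, configure_http_backend

def test_load_adapter_offline(monkeypatch):
    # Patch the constant that was read at import time, not the env var ...
    monkeypatch.setattr(constants, "HF_HUB_OFFLINE", True)
    # ... then rebuild the cached session so the offline adapter gets mounted.
    configure_http_backend()
    # run the cached-adapter load here and assert it succeeds offline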

BenjaminBossan avatar Jul 30 '24 11:07 BenjaminBossan