
Getting an error on loading a model

Open · SasiKiranK opened this issue on Jun 09 '23 · 11 comments

When I try this in Colab:

tokenizer = AutoTokenizer.from_pretrained("emilianJR/CyberRealistic_V3")
model = AutoModelForCausalLM.from_pretrained("emilianJR/CyberRealistic_V3")

HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    258     try:
--> 259         response.raise_for_status()
    260     except HTTPError as e:

12 frames
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/emilianJR/CyberRealistic_V3/resolve/main/config.json

The above exception was the direct cause of the following exception:

EntryNotFoundError                        Traceback (most recent call last)
EntryNotFoundError: 404 Client Error. (Request ID: Root=1-6482c26e-527cb8985586f80e2bd7bbc6)

Entry Not Found for url: https://huggingface.co/emilianJR/CyberRealistic_V3/resolve/main/config.json.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, use_auth_token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash)
    461         if revision is None:
    462             revision = "main"
--> 463         raise EnvironmentError(
    464             f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
    465             f"'https://huggingface.co/{path_or_repo_id}/{revision}' for available files."

OSError: emilianJR/CyberRealistic_V3 does not appear to have a file named config.json. Checkout 'https://huggingface.co/emilianJR/CyberRealistic_V3/main' for available files.
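Judging from the unet/diffusion_pytorch_model.bin path in the tuning command below, emilianJR/CyberRealistic_V3 appears to be a diffusers-format Stable Diffusion checkpoint rather than a causal language model, which is why there is no top-level config.json for AutoModelForCausalLM to find. If the goal is just to load that repository, a minimal sketch using diffusers (assuming the repo follows the standard Stable Diffusion folder layout) would be:

import torch
from diffusers import StableDiffusionPipeline

# Load the repo as a Stable Diffusion pipeline; diffusers reads model_index.json
# and the unet/, vae/, text_encoder/ subfolders instead of a top-level config.json.
pipe = StableDiffusionPipeline.from_pretrained(
    "emilianJR/CyberRealistic_V3",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a photograph of a cat").images[0]
image.save("cat.png")

Note that qlora fine-tunes causal language models, so a Stable Diffusion checkpoint is not something qlora.py can consume directly.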

When I try to tune with:

!python /content/qlora/qlora.py https://huggingface.co/emilianJR/CyberRealistic_V3/resolve/main/unet/diffusion_pytorch_model.bin /content/new

I get this error:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: /usr/lib64-nvidia did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('8013'), PosixPath('http'), PosixPath('//172.28.0.1')}
  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('--logtostderr --listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-tfe6ei5vsc3s --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
  warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('//ipykernel.pylab.backend_inline'), PosixPath('module')}
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0'), PosixPath('/usr/local/cuda/lib64/libcudart.so')}.. We'll flip a coin and try one of these, in order to fail forward. Either way, this might cause trouble in the future: If you get CUDA error: invalid device function errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...
2023-06-09 06:13:31.514413: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
loading base model EleutherAI/pythia-12b...
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards:   0% 0/3 [00:00<?, ?it/s]

SasiKiranK · Jun 09 '23
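On the tuning command: the log above ends with "loading base model EleutherAI/pythia-12b...", which is qlora.py's default model, so the positional URL was apparently never used as the model path. As far as I can tell from the qlora README, the script expects a Hugging Face model id or a local directory passed via flags (double-check with python qlora.py --help), roughly like this (EleutherAI/pythia-12b here is just the script's default, not a recommendation):

!python /content/qlora/qlora.py \
    --model_name_or_path EleutherAI/pythia-12b \
    --output_dir /content/new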

I got the same issue ><

MurphyJUAN · Jun 09 '23

following

ra-MANUJ-an · Jun 10 '23

Same

rbrus · Jun 11 '23

Found something that was a big help. If you use oobabooga, the problem is probably that you downloaded it as a zip. Just do:

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt

conda activate textgen
cd text-generation-webui
pip install -r requirements.txt --upgrade

python download-model.py facebook/opt-1.3b

And then it should work without any warnings or issues

ShadowSlimey · Jun 11 '23

Same error. I tried multiple machines on vast.ai and they all give this error:

The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
Loading checkpoint shards:   0%|                                                                                           | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/qlora/qlora.py", line 807, in <module>
    train()
  File "/root/qlora/qlora.py", line 643, in train
    model = get_accelerate_model(args, checkpoint_dir)
  File "/root/qlora/qlora.py", line 280, in get_accelerate_model
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 484, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2881, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3227, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 728, in _load_state_dict_into_meta_model
    set_module_quantized_tensor_to_device(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/bitsandbytes.py", line 101, in set_module_quantized_tensor_to_device
    new_value = value.to(device)
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW

ewof · Jun 13 '23
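Error 804 ("forward compatibility was attempted on non supported HW") usually points at a mismatch between the host NVIDIA driver and the CUDA runtime baked into the image, i.e. CUDA's forward-compatibility path being used on a GPU/driver that does not support it, rather than at qlora itself. On rented machines it is worth verifying the CUDA setup before debugging the script; two standard, qlora-agnostic checks:

# Show the installed driver and the CUDA version it supports
nvidia-smi

# Confirm that PyTorch can actually initialize CUDA in this environment
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"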

Same error. I tried multiple machines on vast.ai and they all give this error:

RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW

Try this:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "your_model_name"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tie the weights
model.tie_weights()

Now you can use the model for inference without encountering the error.

ShadowSlimey · Jun 13 '23

Try this:

model.tie_weights()

This might be a dumb question, but what file are you saying to add this code into?

EDIT: Oops, I was just googling how to fix this tie_weights issue from a text-generation-webui standpoint, and I now see that the OP was working in Google Colab rather than on a local machine. Never mind!

TTTrouble · Jun 15 '23

This might be a dumb question, but what file are you saying to add this code into?

What appears to work for me in oobabooga is:

Available Extensions: API, gallery
Boolean command-line flags: pin_weight

Then, when clicking on Model (between Parameters and Training), select your model (I use Pygmalion 13b) and make sure:

wbits: 4
groupsize: 128
model_type: LLAMA
autodevices: ON
gptq-for-llama: ON

Save the settings and reload the model.

ShadowSlimey · Jun 16 '23
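For anyone driving text-generation-webui from the command line instead of the UI, roughly the same configuration could be passed at startup. Treat the flag names below as an assumption based on the options the webui exposed around this time (verify with python server.py --help), and <your-model-folder> is just a placeholder:

python server.py --model <your-model-folder> --wbits 4 --groupsize 128 --model_type llama --auto-devices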


Thank you, those oobabooga settings ended up working for me as well! I was confused because it was happening with both older and newer models I'd downloaded, and they definitely did have a config file, so there might be something I did that interferes with them loading properly. I'm sort of blindly stumbling through this like so many others, and I appreciate your help!

TTTrouble · Jun 16 '23

The above solution did not work for me with other models.

All of the previous models I was using no longer load at all.

I have come to the conclusion that oobabooga is presently broken.

Webslug · Jun 20 '23


Reinstall oobabooga; there may be an update.

ShadowSlimey · Jun 20 '23

I started running into this problem today with normal Hugging Face models (I'm not using oobabooga).

Downgrading accelerate from version 0.20.3 to 0.19.0 fixed it for me.

alexgshaw · Jul 11 '23
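If you want to try the same workaround, pinning the package is enough; 0.19.0 is simply the version reported above:

pip install accelerate==0.19.0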

Downgrading accelerate from version 0.20.3 to 0.19.0 fixed it for me.

I tried this. It works without any warnings.

rollingdeep · Jul 12 '23

model_name = "your_model_name"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

Apologies, but where does this code need to be injected? In server.py?

muunkky · Jul 13 '23